Created attachment 36597
Example MSG files with different code pages

Some e-mails run into encoding problems when the subject, text body or HTML body is read and MAPIMessage.guess7BitEncoding is used.

Example: an e-mail defines PR_INTERNET_CPID -> UTF-8, PR_MESSAGE_LOCALE_ID -> 1031, PR_MESSAGE_CODEPAGE -> undefined, and has no headers.

* Outlook wants PR_SUBJECT to be CP1252, as PR_INTERNET_CPID applies only to PR_BODY and PR_BODY_HTML. (Currently read as UTF-8, because guess7BitEncoding sets this.)
* Outlook wants binary PR_BODY_HTML to be UTF-8. (Would currently be read as CP1252, because getBodyHtml ignores the code page when the property is binary.)
* Outlook wants ASCII PR_BODY_HTML to be UTF-8. (Currently correct.)
* Outlook wants PR_BODY to be CP1252, for an unknown reason. (Would currently be read as UTF-8, because guess7BitEncoding sets this.)

According to the docs, PR_INTERNET_CPID only indicates the code page for PR_BODY and PR_BODY_HTML:
https://docs.microsoft.com/en-us/office/client-developer/outlook/mapi/pidtaginternetcodepage-canonical-property

In my tests Outlook never looks at the charset information inside the HTML; it relies only on PR_INTERNET_CPID.

If PR_MESSAGE_CODEPAGE is undefined and no headers are present, the default ANSI code page for the locale given by PR_MESSAGE_LOCALE_ID may be the only hint for finding the correct code page, as PR_INTERNET_CPID only covers the text/HTML body.

Suggestion: https://github.com/apache/poi/pull/149
(With this patch all existing unit tests succeed without modification.)

Attachments: MSG files whose text body and HTML body should be decoded correctly. Outlook displays them as expected.

Regards,
Dominik
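
To illustrate the lookup order described in this report, here is a minimal standalone sketch in plain Java. It is not the code from the linked pull request and it does not use any POI API: the class CodePageSketch, its method names, the small LCID table and the final CP1252 fallback are all assumptions made for this example. A null argument stands for an undefined property.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical helper, not part of POI: picks a charset per property
// following the Outlook behaviour observed in this report.
final class CodePageSketch {

    // Illustrative subset of Windows LCID -> default ANSI code page.
    private static final Map<Integer, Integer> ANSI_CODE_PAGE_BY_LCID = Map.of(
            1031, 1252,  // de-DE
            1033, 1252,  // en-US
            1045, 1250,  // pl-PL
            1049, 1251,  // ru-RU
            1032, 1253,  // el-GR
            1055, 1254); // tr-TR

    // Maps a Windows code page number to a Java Charset, or null if unknown.
    static Charset charsetForCodePage(Integer codePage) {
        if (codePage == null) {
            return null;
        }
        if (codePage == 65001) {
            return StandardCharsets.UTF_8;
        }
        if (codePage == 20127) {
            return StandardCharsets.US_ASCII;
        }
        String name = "windows-" + codePage;
        if (codePage >= 1250 && codePage <= 1258 && Charset.isSupported(name)) {
            return Charset.forName(name);
        }
        return null;
    }

    // Subject and plain-text body: PR_MESSAGE_CODEPAGE first, then the ANSI
    // code page of PR_MESSAGE_LOCALE_ID. PR_INTERNET_CPID is deliberately ignored.
    static Charset messageCharset(Integer messageCodePage, Integer localeId) {
        Charset cs = charsetForCodePage(messageCodePage);
        if (cs == null && localeId != null) {
            cs = charsetForCodePage(ANSI_CODE_PAGE_BY_LCID.get(localeId));
        }
        // Assumption: fall back to CP1252 when nothing else is known.
        return cs != null ? cs : Charset.forName("windows-1252");
    }

    // HTML body (binary or string): PR_INTERNET_CPID first, then the
    // same fallback chain as the subject.
    static Charset htmlBodyCharset(Integer internetCpid, Integer messageCodePage, Integer localeId) {
        Charset cs = charsetForCodePage(internetCpid);
        return cs != null ? cs : messageCharset(messageCodePage, localeId);
    }

    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // The example from the report: CPID = UTF-8, locale = 1031, no message code page.
        System.out.println("subject/text body: " + messageCharset(null, 1031));
        System.out.println("html body:         " + htmlBodyCharset(65001, null, 1031));
    }
}
```

For the example message above, this selects CP1252 for the subject and text body (via locale 1031) and UTF-8 for the HTML body, which matches what Outlook displays for the attached files.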