HSMF currently gets code page information from one or more of three sources: the properties, the headers, and the htmlbody (if it exists). Let's refactor guess7BitEncoding() and enable clients to get this information. Patch on the way.
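To make the three-source lookup concrete, here is a rough sketch of how it could be exposed as one client-facing call. Everything in it is an assumption for illustration: `CodePageSources`, `charsetIn` and `guess` are hypothetical names, not existing HSMF API. The lookup order (properties, then headers, then HTML body) and the `charset=` regex are also guesses rather than what guess7BitEncoding() actually does.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of the HSMF API. */
final class CodePageSources {
    // Matches "charset=..." as it might appear in a Content-Type header or an HTML meta tag.
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*[\"']?([\\w.:-]+)", Pattern.CASE_INSENSITIVE);

    private CodePageSources() {}

    /** Returns the first charset name found in the given text, if any. */
    static Optional<String> charsetIn(String text) {
        if (text == null) {
            return Optional.empty();
        }
        Matcher m = CHARSET.matcher(text);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /**
     * Tries each source in turn (assumed order): the charset already resolved
     * from the properties, then the headers, then the HTML body.
     */
    static Optional<String> guess(String fromProperties, String headers, String htmlBody) {
        if (fromProperties != null && !fromProperties.isEmpty()) {
            return Optional.of(fromProperties);
        }
        Optional<String> fromHeaders = charsetIn(headers);
        if (fromHeaders.isPresent()) {
            return fromHeaders;
        }
        return charsetIn(htmlBody);
    }
}
```

A client would then call something like `CodePageSources.guess(propCharset, headerText, html)` and get an empty `Optional` when none of the sources names a charset.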
Created attachment 31305
First version attached.

I noticed that the headers extraction component checks for "utf-8" and ignores the charset if it matches. Do we want to add that check to the code page and HTML extraction chunks, too?
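If the answer is yes, one option would be a single filter that every extraction path passes its candidate through, so the "utf-8" rule lives in one place. This is only a sketch: `CharsetFilter` and `dropUtf8` are made-up names, and the behaviour mirrors the headers check only as described above, not the actual patch.

```java
import java.util.Optional;

/** Hypothetical shared filter; not existing HSMF code. */
final class CharsetFilter {
    private CharsetFilter() {}

    /**
     * Returns the trimmed candidate unless it is missing, blank, or "utf-8"
     * (compared case-insensitively), in which case the result is empty.
     */
    static Optional<String> dropUtf8(String candidate) {
        if (candidate == null) {
            return Optional.empty();
        }
        String trimmed = candidate.trim();
        if (trimmed.isEmpty() || trimmed.equalsIgnoreCase("utf-8")) {
            return Optional.empty();
        }
        return Optional.of(trimmed);
    }
}
```

Each of the three sources would call `CharsetFilter.dropUtf8(...)` on whatever it found and move on to the next source when the result is empty.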