Bug 58214

Summary: Outlook error when opening an msg file extracted by POI from another msg file
Product: POI
Reporter: Alexandre <alexandre.brun>
Component: HSMF
Assignee: POI Developers List <dev>
Status: NEEDINFO ---    
Severity: normal    
Priority: P2    
Version: 3.13-dev   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: The master mail with attachment
Embedded outlook msg extracted by POI
Embedded outlook msg saved by Outlook
My simple java code
Attachment from outlook, POIFSLister result
Attachment from POI, POIFSLister result
Attachment from outlook, HSMFDump result
Attachment from POI, HSMFDump result

Description Alexandre 2015-08-05 12:56:22 UTC
Created attachment 32960 [details]
The master mail with attachment

As described in this bug: https://bz.apache.org/bugzilla/show_bug.cgi?id=58211

I have an Outlook message (Test mail attachment) attached to another Outlook message (Master mail.msg).
The msg file saved using POI (Test mail attachment_from POI.msg) generates an error when I try to open it in Outlook: Outlook does not recognize it.

I am attaching the same msg file as saved by Outlook (Test mail attachment_from outlook.msg).

I am also attaching my code (ExtractMsg.java), in case it helps.
Comment 1 Alexandre 2015-08-05 12:57:14 UTC
Created attachment 32961 [details]
Embedded outlook msg extracted by POI
Comment 2 Alexandre 2015-08-05 12:58:33 UTC
Created attachment 32962 [details]
Embedded outlook msg saved by Outlook
Comment 3 Alexandre 2015-08-05 12:58:58 UTC
Created attachment 32963 [details]
My simple java code
Comment 4 Nick Burch 2015-08-05 14:35:57 UTC
Thanks for those.

Any chance you could try running dev tools like org.apache.poi.poifs.dev.POIFSLister and org.apache.poi.hsmf.dev.HSMFDump against the two extracted files, and see if there are any obvious differences between them? Sections present in one but not the other, different IDs, that sort of thing. That should help us narrow in on what to change.
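
One way to look for sections present in one listing but not the other is a plain line-set comparison of the two text dumps. The sketch below is a minimal, hypothetical helper: the class name ListingDiff and the method onlyIn are illustrative and not part of POI. It assumes the output of each dev tool has been saved to a text file and that comparing trimmed lines is enough to spot missing entries. It does not interpret the entries, and lines that differ only in a size or value will show up on both sides.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not part of POI): reports lines found in only one of two text dumps. */
class ListingDiff {

    /** Returns the distinct, trimmed, non-empty lines of 'left' that never occur in 'right', in order. */
    static List<String> onlyIn(List<String> left, List<String> right) {
        Set<String> seen = new HashSet<>();
        for (String line : right) {
            seen.add(line.trim());
        }
        List<String> result = new ArrayList<>();
        for (String line : left) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !seen.contains(trimmed) && !result.contains(trimmed)) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java ListingDiff <dump-from-outlook.txt> <dump-from-poi.txt>");
            return;
        }
        // Assumption: each file holds the saved text output of one dev tool run.
        List<String> first = Files.readAllLines(Paths.get(args[0]));
        List<String> second = Files.readAllLines(Paths.get(args[1]));
        for (String line : onlyIn(first, second)) {
            System.out.println("< " + line);   // present only in the first dump
        }
        for (String line : onlyIn(second, first)) {
            System.out.println("> " + line);   // present only in the second dump
        }
    }
}
```

Run it with the two saved dumps as arguments; lines prefixed with "<" occur only in the first file and lines prefixed with ">" only in the second.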
Comment 5 Alexandre 2015-08-05 14:53:46 UTC
Created attachment 32966 [details]
Attachment from outlook, POIFSLister result
Comment 6 Alexandre 2015-08-05 14:54:15 UTC
Created attachment 32967 [details]
Attachment from POI, POIFSLister result
Comment 7 Alexandre 2015-08-05 14:54:53 UTC
Created attachment 32968 [details]
Attachment from outlook, HSMFDump result
Comment 8 Alexandre 2015-08-05 14:55:13 UTC
Created attachment 32969 [details]
Attachment from POI, HSMFDump result
Comment 9 Alexandre 2015-08-05 14:57:45 UTC
Both tools show differences between the two extracted files, but the differences are hard for me to interpret.
Comment 10 Alexandre 2015-08-31 14:39:53 UTC
Hi, 

Do you have any update on this issue?
Comment 11 Nick Burch 2015-09-01 10:54:03 UTC
If you're able to identify the differences, we might be able to make a quick fix, or otherwise guide you on making the fix yourself

Otherwise, this issue will remain until someone volunteers to spend their free time looking at it, be they a committer or just someone else interested from within the community
Comment 12 Alexandre 2015-09-02 07:44:26 UTC
I see. Unfortunately, I'm not able to identify the differences yet. I will hope for an interested volunteer.
Thanks.
Comment 13 Tim Allison 2015-09-02 12:19:02 UTC
Not that this is any consolation/help, but it looks like POI (at least via Tika) is able to read the contents of the embedded document, both from the outer container document and from the version that you attached as extracted by POI.