Bug 59074

Summary: IllegalArgumentException: No supported documents found in the OLE2 stream
Product: POI
Reporter: Christian Schroeder <cs>
Component: POIFS
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: normal    
Priority: P2    
Version: 3.13-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: Excel file that causes the exception

Description Christian Schroeder 2016-02-26 11:18:15 UTC
Created attachment 33597
Excel file that causes the exception

I get an IllegalArgumentException with the message "No supported documents found in the OLE2 stream" when I try to open the attached Excel file. I can open the file in Excel 2010 and Excel 2013.
This might be a duplicate of #55614, but in my case the Unix "file" tool reports "Microsoft Office Document".
Comment 1 Nick Burch 2016-02-26 11:34:14 UTC
How are you triggering the error? I've just tried opening the file with POI, and it loaded fine as an Excel file without error.

Ideally, if you could produce a junit test that uses your file to trigger the error, we can use that to both diagnose the issue, and ensure it's fixed!
Comment 2 Christian Schroeder 2016-02-26 17:09:09 UTC
Sorry, maybe I chose the wrong component. This issue is related to the ExtractorFactory.
I can reproduce the error using the following line of code:

POITextExtractor poiExtractor = ExtractorFactory.createExtractor(new FileInputStream("example.xls"));
Comment 3 Nick Burch 2016-02-26 19:05:30 UTC
FYI, most people who used to use ExtractorFactory would be much better off switching to Apache Tika these days!
Comment 4 Nick Burch 2016-02-26 23:59:07 UTC
Good news - as of r1732587 you won't get this exception!

Mixed news - you'll still get an exception... OldExcelFormatException

So, you can either catch that and call out to OldExcelExtractor directly (ExtractorFactory can't return it because it doesn't have the right parent class, as the file can be OLE2 or raw), or do what most people have long since done - give up on ExtractorFactory and switch to Apache Tika. Tika can transparently pick the right extractor for you, and returns Plain Text or XHTML for this file without issue.
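
For anyone taking the first route, the catch-and-fall-back approach could look roughly like the sketch below. It is only an illustration, not code from the POI tree: the class name ExtractWithFallback and the method extractText are made up for this example, the package names are the POI 3.x ones (later releases moved POITextExtractor to org.apache.poi.extractor), and it assumes a build that includes r1732587 so that ExtractorFactory throws OldExcelFormatException for this file. The file is reopened for the fallback because the first stream has already been partly consumed.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.poi.POITextExtractor;
import org.apache.poi.extractor.ExtractorFactory;
import org.apache.poi.hssf.OldExcelFormatException;
import org.apache.poi.hssf.extractor.OldExcelExtractor;

// Hypothetical helper, not part of POI: tries ExtractorFactory first and
// falls back to OldExcelExtractor for pre-BIFF8 Excel files.
public class ExtractWithFallback {

    // Declared as "throws Exception" because the checked exceptions thrown by
    // ExtractorFactory.createExtractor differ between POI versions.
    public static String extractText(File file) throws Exception {
        try (InputStream in = new FileInputStream(file)) {
            POITextExtractor extractor = ExtractorFactory.createExtractor(in);
            try {
                return extractor.getText();
            } finally {
                extractor.close();
            }
        } catch (OldExcelFormatException e) {
            // ExtractorFactory cannot hand back an OldExcelExtractor itself,
            // so open a fresh stream and use that class directly.
            try (InputStream in = new FileInputStream(file)) {
                OldExcelExtractor oldExtractor = new OldExcelExtractor(in);
                return oldExtractor.getText();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // "example.xls" is the file name used in Comment 2.
        System.out.println(extractText(new File("example.xls")));
    }
}
```

This needs the poi and poi-ooxml jars on the classpath. If you end up handling many formats this way, the Tika route is likely less work.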
Comment 5 Dominik Stadler 2016-03-03 21:35:50 UTC
Why the change to SXSSF? This is clearly not related to SXSSF in any way!