Summary: | POI crashes with ....a BIFF8 'Workbook' entry. Is it really an excel file? | ||
---|---|---|---|
Product: | POI | Reporter: | bearbalu <bearbalu> |
Component: | POIFS | Assignee: | POI Developers List <dev> |
Status: | RESOLVED INVALID | ||
Severity: | major | CC: | bearbalu |
Priority: | P2 | ||
Version: | 3.9-FINAL | ||
Target Milestone: | --- | ||
Hardware: | PC | ||
OS: | All | ||
Attachments: |
This is the xlsx that crashes
This Excel crashes too.... |
Your file looks to be a password-protected .xlsx file, which somewhat confusingly gets stored within an OLE2 structure (I'm sure Microsoft had their reasons....). See http://poi.apache.org/encryption.html for how to read them.

In r1534967 I've added a more helpful error message from HSSF if you give it an encrypted .xlsx file by mistake.

Created attachment 31452 [details]
This Excel crashes too....
As per the original issue, when I call workbook = WorkbookFactory.create(fileInputStream), I get the exception java.lang.IllegalArgumentException: The supplied POIFSFileSystem does not contain a BIFF8 'Workbook' entry. Is it really an excel file?

So I was able to use the following code in most cases to get to the underlying Excel file:

    NPOIFSFileSystem fs = new NPOIFSFileSystem(stream);
    EncryptionInfo info = new EncryptionInfo(fs);
    Decryptor d = Decryptor.getInstance(info);
    String password = Decryptor.DEFAULT_PASSWORD;
    InputStream fInputStream = getEncryptedFileInputStream(xlsFile,errorMessages);
    if (fInputStream != null) {
        Workbook workbook = WorkbookFactory.create(fInputStream);
        fInputStream.close();
        return workbook;
    }

However, when I do this for the attached Excel file, it crashes in EncryptionInfo with the exception org.apache.poi.EncryptedDocumentException: Unsupported hash algorithm. So I am not even sure if this is an encrypted file. If I open the file in Excel and just re-save it, WorkbookFactory.create(fileInputStream) works fine.

Attachment 31452 [details] is not a regular .xls file. It appears to be a password-protected .xlsx file, which must be opened as per http://poi.apache.org/encryption.html#XML-based+formats+-+Decryption (along with the password, of course!).

Also, don't forget that Apache Tika is very good at working out what files are. If you ask Apache Tika to detect the file with no file extension, it correctly identifies the type as application/x-tika-ooxml-protected.

As for the hash algorithm problems, either try with the latest trunk, or raise a new bug if you really have got a file that uses a format we don't support. |
Created attachment 30955 [details]
This is the xlsx that crashes

I have an xlsx file which I can open perfectly fine using Excel 2010. However, POI crashes with the following message. If I open the file and re-save it, the crash goes away. See the attached xlsx file. Interestingly, HSSF (not XSSF) is invoked by the WorkbookFactory.

Additional clues to reproducing the issue:

1. Started with an xls file -> NO crash -> I can e-mail this to someone -> 1.7 MB, can't upload it, and can't upload multiple attachments.
2. I saved it once as xls (no content changed) -> NO crash
3. I "Save As" xlsx (no content changed) -> CRASHES (attached)
4. If I open the xlsx and save it again (no content changes) -> NO crash.

If I skip step 2, the crash does NOT happen.

==========================================================================
java.lang.IllegalArgumentException: The supplied POIFSFileSystem does not contain a BIFF8 'Workbook' entry. Is it really an excel file?
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.getWorkbookDirEntryName(HSSFWorkbook.java:222)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:263)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:243)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:187)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:322)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:303)
    at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:70)
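
The observation that HSSF rather than XSSF gets invoked is consistent with a check on the file's leading bytes: a plain .xlsx is a ZIP archive, while a password-protected .xlsx is wrapped in an OLE2 container and so starts with the same signature as an .xls. The following is a minimal sketch of such a check using only the JDK. The class and method names (ContainerSniffer, sniff, readHeader) are hypothetical and not part of POI; the sketch illustrates the idea and is not a copy of what WorkbookFactory does internally.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper (not a POI class): classifies a file by its leading bytes.
final class ContainerSniffer {
    // OLE2 compound document signature, shared by .xls and encrypted .xlsx files.
    static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header signature ("PK\3\4"), used by unencrypted .xlsx files.
    static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    enum Container { OLE2, ZIP, UNKNOWN }

    // Classify a buffer holding the first bytes of a file.
    static Container sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) return Container.OLE2;
        if (startsWith(header, ZIP_MAGIC)) return Container.ZIP;
        return Container.UNKNOWN;
    }

    // Read up to 8 bytes from the stream. The bytes are consumed, so the
    // caller has to reopen the file before handing it to a parser.
    static byte[] readHeader(InputStream in) throws IOException {
        byte[] header = new byte[OLE2_MAGIC.length];
        int total = 0;
        while (total < header.length) {
            int n = in.read(header, total, header.length - total);
            if (n < 0) break; // end of stream before 8 bytes were available
            total += n;
        }
        return Arrays.copyOf(header, total);
    }

    private static boolean startsWith(byte[] data, byte[] magic) {
        if (data.length < magic.length) return false;
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path of the file to inspect.
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(sniff(readHeader(in)));
        }
    }
}
```

A file with an .xlsx extension that is reported as OLE2 by this check is a candidate for the decryption route described at http://poi.apache.org/encryption.html rather than for direct loading through WorkbookFactory.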