Summary: | Part name comparison in ZipPackage should be case-insensitive | ||
---|---|---|---|
Product: | POI | Reporter: | Ed Beaty <ebeaty> |
Component: | XSSF | Assignee: | POI Developers List <dev> |
Status: | RESOLVED FIXED | ||
Severity: | normal | ||
Priority: | P2 | ||
Version: | 3.7-dev | ||
Target Milestone: | --- | ||
Hardware: | Macintosh | ||
OS: | All | ||
Attachments: | XLSX file that can't be opened with POI 3.7-dev |
Fixed in r965258. There were two problems with the attached file: 1. [content_types].xml vs [Content_Types].xml: you are correct, the comparison of part names should be case-insensitive. 2. The file appears to use backslashes as path separators. The OPC spec tolerates backslashes in part names (see Annex A.3), and I fixed POI to do the same. Yegor |
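To illustrate the two points above, here is a minimal sketch of a tolerant part-name comparison. It is not the code committed in r965258; the class `PartNameSketch` and its methods are hypothetical names made up for this example. Note that `String.equalsIgnoreCase` also folds non-ASCII letters, which is slightly broader than the ASCII-only comparison the OPC text describes.

```java
// Hypothetical helper, not part of POI: compares ZIP entry names the way
// the comment above describes (ignore case, tolerate backslashes).
class PartNameSketch {
    // Canonical spelling of the content types part name.
    static final String CONTENT_TYPES_PART_NAME = "[Content_Types].xml";

    // Treat backslashes as path separators by mapping them to forward slashes.
    static String normalizeSeparators(String entryName) {
        return entryName.replace('\\', '/');
    }

    // Two names are considered equivalent if they match after separator
    // normalization, ignoring case.
    static boolean samePartName(String a, String b) {
        return normalizeSeparators(a).equalsIgnoreCase(normalizeSeparators(b));
    }

    public static void main(String[] args) {
        // Lower-case content types name from the attached file.
        System.out.println(samePartName("[content_types].xml", CONTENT_TYPES_PART_NAME)); // true
        // Backslash-separated entry name versus the usual form.
        System.out.println(samePartName("xl\\workbook.xml", "xl/workbook.xml")); // true
        // Genuinely different names still compare as different.
        System.out.println(samePartName("xl/styles.xml", "xl/workbook.xml")); // false
    }
}
```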
Created attachment 25777
XLSX file that can't be opened with POI 3.7-dev

Attempting to open the attached file fails with the following error:

Exception in thread "main" org.apache.poi.openxml4j.exceptions.InvalidFormatException: Package should contain a content type part [M1.13]
    at org.apache.poi.openxml4j.opc.ZipPackage.getPartsImpl(ZipPackage.java:147)
    at org.apache.poi.openxml4j.opc.OPCPackage.getParts(OPCPackage.java:588)
    at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:222)
    at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:63)

The file opens normally in Microsoft Excel 2008 for Mac. If the file is saved as a Microsoft 97 file, POI opens it normally. The file is from a non-standard source (a scientific instrument that saves its output as an .xlsx file).

The file contains a part name "[content_types].xml" (lower case), but the ZipPackage.getPartsImpl method expects the name "[Content_Types].xml" (mixed case). The exception is caused by a call to entry.getName().equals(ContentTypeManager.CONTENT_TYPES_PART_NAME).

According to the Open Packaging Conventions, "Part name equivalence is determined by comparing part names as case-insensitive ASCII strings."

The bug also occurs in Windows XP.
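The failure mode described in this report can be reproduced without POI. The following sketch uses only `java.util.zip`; the class `ContentTypesLookupSketch` and its methods are hypothetical and only mimic the lookup described above, they are not POI's `ZipPackage` code. It builds an in-memory ZIP whose only entry is named "[content_types].xml" and searches for the content types part with an exact and then a case-insensitive comparison.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class, not part of POI.
class ContentTypesLookupSketch {
    static final String EXPECTED_NAME = "[Content_Types].xml";

    // Builds a tiny in-memory ZIP containing a single entry with the given name.
    static byte[] zipWithEntry(String entryName) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write("<Types/>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Returns the name of the first entry that matches EXPECTED_NAME,
    // or null if no entry matches.
    static String findContentTypes(byte[] zipBytes, boolean ignoreCase) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                boolean match = ignoreCase
                        ? name.equalsIgnoreCase(EXPECTED_NAME)
                        : name.equals(EXPECTED_NAME);
                if (match) {
                    return name;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] zip = zipWithEntry("[content_types].xml");
        // Exact comparison misses the lower-case entry.
        System.out.println(findContentTypes(zip, false)); // null
        // Case-insensitive comparison finds it.
        System.out.println(findContentTypes(zip, true)); // [content_types].xml
    }
}
```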