|Summary:||Part name comparison in ZipPackage should be case-insensitive|
|Product:||POI||Reporter:||Ed Beaty <ebeaty>|
|Component:||XSSF||Assignee:||POI Developers List <dev>|
|Attachments:||XLSX file that can't be opened with POI 3.7-dev|
Description Ed Beaty 2010-07-16 18:53:24 UTC
Created attachment 25777 [details]
XLSX file that can't be opened with POI 3.7-dev

Attempting to open the attached file fails with the following error:

Exception in thread "main" org.apache.poi.openxml4j.exceptions.InvalidFormatException: Package should contain a content type part [M1.13]
    at org.apache.poi.openxml4j.opc.ZipPackage.getPartsImpl(ZipPackage.java:147)
    at org.apache.poi.openxml4j.opc.OPCPackage.getParts(OPCPackage.java:588)
    at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:222)
    at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:63)

The file opens normally in Microsoft Excel 2008 for Mac. If the file is saved as a Microsoft 97 file, POI opens it normally. The file is from a non-standard source (a scientific instrument that saves its output as an .xlsx file).

The file contains a part name "[content_types].xml" (lower case), but the ZipPackage.getPartsImpl method expects the name "[Content_Types].xml" (mixed case). The exception is caused by a call to entry.getName().equals(ContentTypeManager.CONTENT_TYPES_PART_NAME).

According to the Open Packaging Convention, "Part name equivalence is determined by comparing part names as case-insensitive ASCII strings."

The bug also occurs in Windows XP.
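For illustration, here is a minimal sketch of the comparison the quoted OPC rule calls for. It is not POI code: the class PartNameMatch and its methods are hypothetical, and the constant simply repeats the mixed-case name that getPartsImpl is said to expect. Only A-Z is folded, so non-ASCII characters never match an ASCII letter.

```java
// Hypothetical helper, not part of POI: ASCII-only case-insensitive part name comparison.
class PartNameMatch {
    // Assumed to hold the same value as ContentTypeManager.CONTENT_TYPES_PART_NAME.
    static final String CONTENT_TYPES_PART_NAME = "[Content_Types].xml";

    // Lower-case only the letters A-Z; every other character is returned unchanged.
    static char asciiLower(char c) {
        return (c >= 'A' && c <= 'Z') ? (char) (c + ('a' - 'A')) : c;
    }

    // True when both names are equal after ASCII case folding.
    static boolean equalsIgnoreAsciiCase(String a, String b) {
        if (a.length() != b.length()) {
            return false;
        }
        for (int i = 0; i < a.length(); i++) {
            if (asciiLower(a.charAt(i)) != asciiLower(b.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // True when a ZIP entry name refers to the content types part.
    static boolean isContentTypesPart(String zipEntryName) {
        return equalsIgnoreAsciiCase(zipEntryName, CONTENT_TYPES_PART_NAME);
    }
}
```

With this check, the lower-case name from the attached file is accepted, whereas a plain String.equals against the mixed-case constant rejects it.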
Comment 1 Yegor Kozlov 2010-07-18 12:14:22 UTC
Fixed in r965258

There were two problems with the attached file:

1. [content_types].xml vs [Content_Types].xml
You are correct, the comparison of part names should be case-insensitive.

2. The file appears to use backslashes as path separators. The OPC spec tolerates backslashes in part names, see Annex A.3. I fixed POI to do the same.

Yegor
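As a rough sketch of the second point, backslash tolerance can be expressed as normalizing separators before a ZIP entry name is treated as a part name. This is not the code committed in r965258; the class ZipEntryNames, the method normalizeSeparators, and the sample entry name are all hypothetical.

```java
// Hypothetical helper, not the change made in r965258.
class ZipEntryNames {
    // Replace every backslash with a forward slash so that entry names written
    // with Windows-style separators compare equal to their standard form.
    static String normalizeSeparators(String zipEntryName) {
        return zipEntryName.replace('\\', '/');
    }
}
```

For example, an illustrative entry name such as xl\worksheets\sheet1.xml would become xl/worksheets/sheet1.xml, while names that already use forward slashes pass through unchanged.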