Summary: | Excel 2007 file is unusable after closing Workbook object | ||
---|---|---|---|
Product: | POI | Reporter: | Armen Vardanyan <vardarmo> |
Component: | XSSF | Assignee: | POI Developers List <dev> |
Status: | RESOLVED FIXED | ||
Severity: | blocker | CC: | pal.ratikanta, rober_20_02, vardarmo |
Priority: | P1 | ||
Version: | 3.11-dev | ||
Target Milestone: | --- | ||
Hardware: | PC | ||
OS: | All | ||
Attachments: | excel files and java source file |
I can reproduce the problem with your small test file. No idea why it's happening, hopefully someone else can investigate...

The problem is that the original xlsx does not contain a file for the SharedString table in the zip file, so an empty string table is created when the file is first loaded. During close(), the file is written back and a 0-byte sharedStrings.xml file is created, which then fails when the xlsx is loaded again.

I tried the following change, which makes this test work. However, I am not sure if this is the correct way to exclude parts which do not have any size:

    --- a/src/ooxml/java/org/apache/poi/openxml4j/opc/internal/marshallers/ZipPartMarshaller.java
    +++ b/src/ooxml/java/org/apache/poi/openxml4j/opc/internal/marshallers/ZipPartMarshaller.java
    @@ -63,6 +63,11 @@ public final class ZipPartMarshaller implements PartMarshaller {
                 // Normally should happen only in developement phase, so just throw
                 // exception
             }
    +
    +        // check if there is anything to save
    +        if(part.getSize() == 0) {
    +            return true;
    +        }
             ZipOutputStream zos = (ZipOutputStream) os;
             ZipEntry partEntry = new ZipEntry(ZipHelper

Which xmlbeans version are you using? I can reproduce the problem in version 3.11 with xmlbeans 2.6.0.

BTW, a possible workaround if you are just reading from the file is to open the file in "read-only" mode, then the problem does not happen:

    Workbook workbook = WorkbookFactory.create(OPCPackage.open(file, PackageAccess.READ));

(In reply to Dominik Stadler from comment #5)
> BTW, a possible workaround if you are just reading from the file is to open
> the file in "read-only" mode, then the problem does not happen:
>
> Workbook workbook = WorkbookFactory.create(OPCPackage.open(file,
> PackageAccess.READ));

I get the exception below when I open an empty file in read-only mode:

    org.apache.poi.POIXMLException: org.apache.poi.openxml4j.exceptions.InvalidOperationException: Operation not allowed, document open in read only mode!
        at org.apache.poi.POIXMLDocumentPart.createRelationship(POIXMLDocumentPart.java:394)
        at org.apache.poi.POIXMLDocumentPart.createRelationship(POIXMLDocumentPart.java:354)
        at org.apache.poi.xssf.usermodel.XSSFWorkbook.onDocumentRead(XSSFWorkbook.java:341)
        at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:166)
        at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:240)
        at com.X.Y.db.process.SourceProcessorA.process(SourceProcessorA.java:42)
        at com.X.Y.db.process.SourceProcessorATest.testProcessEmptySource(SourceProcessorATest.java:59)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
        at org.junit.runner.JUnitCore.run(JUnitCore.java:157)
        at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:74)
        at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:211)
        at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:67)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
    Caused by: org.apache.poi.openxml4j.exceptions.InvalidOperationException: Operation not allowed, document open in read only mode!
        at org.apache.poi.openxml4j.opc.OPCPackage.throwExceptionIfReadOnly(OPCPackage.java:512)
        at org.apache.poi.openxml4j.opc.OPCPackage.createPart(OPCPackage.java:773)
        at org.apache.poi.openxml4j.opc.OPCPackage.createPart(OPCPackage.java:749)
        at org.apache.poi.POIXMLDocumentPart.createRelationship(POIXMLDocumentPart.java:374)

Can you try with POI 3.12 beta 1? There was a read-only fix in that.

(In reply to Nick Burch from comment #7)
> Can you try with POI 3.12 beta 1? There was a read-only fix in that

Works (if you don't need write)!! Update to POI 3.12 beta 1 and you can open files in read-only mode.

This is now fixed for empty shared string tables via r1710521. For now I put in a check that only avoids writing the XML file for the SharedStringTable, as doing it for all types of documents would likely cause trouble with existing code, and it broke unit tests. Please report new bugs if there are any other XML parts that cause trouble when written empty.
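The idea behind the final fix (skip writing the shared strings part when it is empty, rather than skipping every empty part) can be illustrated with a minimal sketch that uses only `java.util.zip`. This is not POI's code and does not reproduce r1710521: the class `SkipEmptyPartWriter`, the method `writeParts`, and the representation of parts as a `Map<String, byte[]>` are assumptions made for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of POI: models "do not write an empty shared strings part".
class SkipEmptyPartWriter {
    // Name of the shared strings part inside an xlsx package, as seen in the Excel error message.
    static final String SHARED_STRINGS = "xl/sharedStrings.xml";

    /** Writes the given parts as zip entries and returns how many entries were written. */
    static int writeParts(Map<String, byte[]> parts, OutputStream out) throws IOException {
        int written = 0;
        try (ZipOutputStream zos = new ZipOutputStream(out)) {
            for (Map.Entry<String, byte[]> part : parts.entrySet()) {
                byte[] content = part.getValue();
                // Only the shared strings part is skipped when empty; other parts are written as-is.
                if (SHARED_STRINGS.equals(part.getKey()) && content.length == 0) {
                    continue;
                }
                zos.putNextEntry(new ZipEntry(part.getKey()));
                zos.write(content);
                zos.closeEntry();
                written++;
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Map<String, byte[]> parts = new LinkedHashMap<>();
        parts.put("xl/workbook.xml", "<workbook/>".getBytes(StandardCharsets.UTF_8));
        parts.put(SHARED_STRINGS, new byte[0]); // empty table, as in the Excel 2007 file
        int written = writeParts(parts, new ByteArrayOutputStream());
        System.out.println("entries written: " + written);
    }
}
```

Restricting the check to one part name mirrors the caution in the comment above: skipping every zero-size part was reported to break existing unit tests.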
Created attachment 31999
excel files and java source file

I am reading an Excel 2007 file (xlsx format). After the first read, when I close the Workbook object, the second time I run the Java application I get an exception in Eclipse.

I am using the latest release of POI available in the Maven repositories, i.e. 3.11 beta2. Tested on both Windows and Linux, with Java 7 and Java 8; the problem persists in all cases.

This happens ONLY with Microsoft Excel 2007 XLSX files. It does not happen when using Microsoft Excel 2013 XLSX files.

///////////////////////////////////////////////////////////////////

The error I am getting in the Eclipse console is the following:

    Exception in thread "main" org.apache.poi.POIXMLException: java.lang.reflect.InvocationTargetException
        at org.apache.poi.xssf.usermodel.XSSFFactory.createDocumentPart(XSSFFactory.java:62)
        at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:427)
        at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:162)
        at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:236)
        at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:109)
        at Main.main(Main.java:17)
    Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.poi.xssf.usermodel.XSSFFactory.createDocumentPart(XSSFFactory.java:60)
        ... 5 more
    Caused by: java.io.IOException: error: Unexpected end of file after null
        at org.apache.poi.xssf.model.SharedStringsTable.readFrom(SharedStringsTable.java:129)
        at org.apache.poi.xssf.model.SharedStringsTable.<init>(SharedStringsTable.java:106)
        ... 10 more

///////////////////////////////////////////////////////////////////

The error I am getting from Microsoft Excel 2007 when opening the file after this corruption is the following:

"Excel found unreadable content in 'file_name.xlsx'. Do you want to recover the contents of this workbook?"

When answering yes, it says:

"Excel was able to open the file by repairing or removing the unreadable content. Removed Part: /xl/sharedStrings.xml part with XML error. (Strings) A document must contain exactly one root element. Line 1, column 0."

///////////////////////////////////////////////////////////////////

I also attach:
1. the Java source file I used to read the file (Main.java), and
2. the Excel 2007 XLSX file before corruption (Excel_2007_file_before.xlsx) and after corruption (Excel_2007_file_after.xlsx).

The files are in the 'all.zip' file.
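To check whether a given xlsx file shows this symptom, one option is to look for zero-byte entries in the zip container. The sketch below does that with `java.util.zip` only, without POI. The class name `EmptyPartScanner` and the method `findEmptyParts` are hypothetical names introduced here, and the check is only a heuristic: it reports any non-directory entry with an uncompressed size of 0, not just /xl/sharedStrings.xml.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical diagnostic helper, not part of POI.
class EmptyPartScanner {
    /** Returns the names of all non-directory zip entries whose uncompressed size is 0 bytes. */
    static List<String> findEmptyParts(Path xlsx) throws IOException {
        List<String> empty = new ArrayList<>();
        try (ZipFile zip = new ZipFile(xlsx.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                // A 0-byte XML part has no root element, which matches the error Excel reports.
                if (!entry.isDirectory() && entry.getSize() == 0) {
                    empty.add(entry.getName());
                }
            }
        }
        return empty;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java EmptyPartScanner.java <file.xlsx>");
            return;
        }
        for (String name : findEmptyParts(Paths.get(args[0]))) {
            System.out.println("empty part: " + name);
        }
    }
}
```

Running it against the "before" and "after" files from the attachment should make the difference visible, assuming the corruption is the 0-byte sharedStrings.xml described in the comments.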