Bug 62187

Summary: Compiling with Java 10 fails with ClassCastException: org.apache.poi.openxml4j.util.ZipSecureFile$ThresholdInputStream cannot be cast to java.base/java.util.zip.ZipFile$ZipFileInputStream
Product: POI
Reporter: Dominik Stadler <dominik.stadler>
Component: POI Overall
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED
Severity: critical
CC: bodewig, snag
Priority: P2    
Version: 4.0.x-dev   
Target Milestone: ---   
Hardware: All   
OS: All   
Bug Depends on:    
Bug Blocks: 61572    
Attachments: Initial patch for using commons compress - but zip bomb handling is currently disabled. This also includes a fix for EncryptedTempData which has a wrong padding.
Patch with upcoming commons compress and zip bomb handling
remaining changes to use commons compress

Description Dominik Stadler 2018-03-17 21:03:08 UTC
When compiling Apache POI with current Java 10 pre-releases, there are tests failing:

     [java] Caused by: java.lang.ClassCastException: org.apache.poi.openxml4j.util.ZipSecureFile$ThresholdInputStream cannot be cast to java.base/java.util.zip.ZipFile$ZipFileInputStream
     [java] 	at java.base/java.util.zip.ZipFile$ZipFileInflaterInputStream.available(ZipFile.java:478)
     [java] 	at org.apache.poi.openxml4j.util.ZipSecureFile$ThresholdInputStream.available(ZipSecureFile.java:317)
     [java] 	at org.apache.poi.openxml4j.opc.internal.marshallers.ZipPartMarshaller.marshall(ZipPartMarshaller.java:85)
     [java] 	at org.apache.poi.openxml4j.opc.ZipPackagePart.save(ZipPackagePart.java:124)
     [java] 	at org.apache.poi.openxml4j.opc.internal.marshallers.DefaultMarshaller.marshall(DefaultMarshaller.java:43)
     [java] 	at org.apache.poi.openxml4j.opc.ZipPackage.saveImpl(ZipPackage.java:585)
     [java] 	... 39 more
     [java] 36) testAddPivotTableToWorkbookWithLoadedPivotTable(org.apache.poi.xssf.usermodel.TestXSSFWorkbook)
     [java] org.apache.poi.openxml4j.exceptions.OpenXML4JRuntimeException: Fail to save: an error occurs while saving the package : org.apache.poi.openxml4j.util.ZipSecureFile$ThresholdInputStream cannot be cast to java.base/java.util.zip.ZipFile$ZipFileInputStream
     [java] 	at org.apache.poi.openxml4j.opc.ZipPackage.saveImpl(ZipPackage.java:597)
     [java] 	at org.apache.poi.openxml4j.opc.OPCPackage.save(OPCPackage.java:1527)
     [java] 	at org.apache.poi.openxml4j.opc.OPCPackage.save(OPCPackage.java:1510)
     [java] 	at org.apache.poi.openxml4j.opc.ZipPackage.closeImpl(ZipPackage.java:450)
     [java] 	at org.apache.poi.openxml4j.opc.OPCPackage.close(OPCPackage.java:470)
     [java] 	at org.apache.poi.POIXMLDocument.close(POIXMLDocument.java:188)
     [java] 	at org.apache.poi.xssf.usermodel.XSSFWorkbook.close(XSSFWorkbook.java:591)
     [java] 	at org.apache.poi.xssf.usermodel.TestXSSFWorkbook.$closeResource(TestXSSFWorkbook.java:198)
     [java] 	at org.apache.poi.xssf.usermodel.TestXSSFWorkbook.testAddPivotTableToWorkbookWithLoadedPivotTable(TestXSSFWorkbook.java:804)
...
     [java] 	at org.apache.poi.util.OOXMLLite.build(OOXMLLite.java:149)
     [java] 	at org.apache.poi.util.OOXMLLite.main(OOXMLLite.java:94)
     [java] Caused by: java.lang.ClassCastException: org.apache.poi.openxml4j.util.ZipSecureFile$ThresholdInputStream cannot be cast to java.base/java.util.zip.ZipFile$ZipFileInputStream
     [java] 	at java.base/java.util.zip.ZipFile$ZipFileInflaterInputStream.available(ZipFile.java:478)
     [java] 	at org.apache.poi.openxml4j.util.ZipSecureFile$ThresholdInputStream.available(ZipSecureFile.java:317)
     [java] 	at org.apache.poi.openxml4j.opc.internal.marshallers.ZipPartMarshaller.marshall(ZipPartMarshaller.java:85)
     [java] 	at org.apache.poi.openxml4j.opc.ZipPackagePart.save(ZipPackagePart.java:124)
     [java] 	at org.apache.poi.openxml4j.opc.internal.marshallers.DefaultMarshaller.marshall(DefaultMarshaller.java:43)
     [java] 	at org.apache.poi.openxml4j.opc.ZipPackage.saveImpl(ZipPackage.java:585)
     [java] 	... 39 more

The tests fail when the ThresholdInputStream injects itself into the ZipFile, because JDK 10 now expects its own classes to be in place, not ours, e.g. in available(). It seems we need to do the Zip-Bomb detection differently in the future; however, I could not immediately see how this can be done here.

See http://hg.openjdk.java.net/jdk/jdk10/rev/85ea7e83af30#l5.66 for the actual change. 
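
The failure mode described above can be shown in miniature. The following is only a simplified analogy, not the JDK or POI code: RawEntryStream, InflatingStream, InjectedWrapper and CastFailureDemo are hypothetical stand-ins, and the wrapper is passed in through a constructor here instead of being injected into an existing stream. It shows how an available() implementation that casts its inner stream to one concrete type breaks as soon as a foreign wrapper sits in that position.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stand-in for a library-internal raw entry stream. */
class RawEntryStream extends ByteArrayInputStream {
    RawEntryStream(byte[] data) {
        super(data);
    }

    /** Bytes not yet consumed (count and pos are protected fields of the parent). */
    int remaining() {
        return count - pos;
    }
}

/** Hypothetical stand-in for a stream whose available() assumes its own inner type. */
class InflatingStream extends FilterInputStream {
    InflatingStream(InputStream in) {
        super(in);
    }

    @Override
    public int available() throws IOException {
        // The cast only succeeds if nobody replaced or wrapped the inner stream.
        return ((RawEntryStream) in).remaining();
    }
}

/** Hypothetical stand-in for a wrapper added by third-party code. */
class InjectedWrapper extends FilterInputStream {
    InjectedWrapper(InputStream in) {
        super(in);
    }
}

class CastFailureDemo {
    /** Reports either the available byte count or the kind of failure. */
    static String probe(InputStream inner) throws IOException {
        try {
            return "available=" + new InflatingStream(inner).available();
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) throws IOException {
        RawEntryStream raw = new RawEntryStream(new byte[8]);
        System.out.println(probe(raw));                      // expected inner type
        System.out.println(probe(new InjectedWrapper(raw))); // foreign wrapper in place
    }
}
```

With the expected inner type the call returns normally; with the wrapper in place the cast fails, which mirrors the shape of the stack trace above.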

See https://builds.apache.org/view/P/view/POI/job/POI-DSL-1.10/ for current build-results.

Summary of discussion on the mailing-list:
---------------------
pj.fanning via poi.apache.org:
I'm also wondering if maybe we could abandon the reflection approach and just
have ThresholdInputStream wrap the entry's InputStream and count the bytes
that are read, and blow up when the thresholds are breached. We might lose
out on some cases but the code would be easier to maintain.

Andreas Beeker:
this would potentially only work for stream but not for file based access.
---------------------
We need to keep in mind that the ThresholdInputStream was introduced to mitigate possible Zip-Bomb vulnerabilities, i.e. small zip files that require huge amounts of memory when they are unpacked. This mitigation must remain active in any new implementation.
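
The wrap-and-count idea from the mailing-list summary above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code that was eventually committed: CountingThresholdInputStream, maxBytes and getCount() are hypothetical names, and only an absolute limit on the number of bytes read is enforced. Other checks would need additional information that is not modelled here.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical wrapper that counts bytes read and fails once a limit is exceeded. */
class CountingThresholdInputStream extends FilterInputStream {
    private final long maxBytes; // assumed absolute limit on bytes read from the entry
    private long count;          // bytes read (or skipped) so far

    CountingThresholdInputStream(InputStream in, long maxBytes) {
        super(in);
        this.maxBytes = maxBytes;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            advance(1);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            advance(n);
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        if (skipped > 0) {
            advance(skipped);
        }
        return skipped;
    }

    /** Adds to the running total and aborts as soon as the limit is exceeded. */
    private void advance(long n) throws IOException {
        count += n;
        if (count > maxBytes) {
            throw new IOException("Possible zip bomb: read " + count
                    + " bytes, limit is " + maxBytes);
        }
    }

    long getCount() {
        return count;
    }
}
```

Because the wrapper only relies on the public InputStream contract, it does not depend on which concrete stream class the zip implementation hands out. As noted in the discussion, this covers stream-style reads and would not by itself cover file-based access.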
Comment 1 Nick Burch 2018-03-18 15:22:14 UTC
We can't be the only project wanting to do zipbomb-proof processing using the JDK-provided Zip classes. Maybe worth a quick post to the OpenJDK 10 list to ask how they expect people doing it the old way to not lose security when upgrading?
Comment 3 Andreas Beeker 2018-04-08 17:41:38 UTC
Created attachment 35847 [details]
Initial patch for using commons compress - but zip bomb handling is currently disabled. This also includes a fix for EncryptedTempData which has a wrong padding.
Comment 4 Stefan Bodewig 2018-04-11 15:00:12 UTC
Without looking into the patch in detail: as ZipArchiveEntry extends ZipEntry you may get away with fewer changes to the public API.
Comment 5 Andreas Beeker 2018-04-24 22:13:36 UTC
Created attachment 35890 [details]
Patch with upcoming commons compress and zip bomb handling

Patch with zip bomb handling

I'll try to commit the parts which aren't related to commons compress first, so the patch should get a bit smaller.
Comment 6 Andreas Beeker 2018-04-25 10:05:37 UTC
Applied the changes unrelated to commons compress via r1830061
Comment 7 Andreas Beeker 2018-04-25 11:08:08 UTC
Created attachment 35891 [details]
remaining changes to use commons compress
Comment 8 Dominik Stadler 2018-04-25 14:14:20 UTC
Did not take a closer look yet, but Maven does not have 1.17 of commons-compress yet, only 1.16.1, so the build with the patch applied fails currently...

And with 1.16.1 it fails with 
C:\workspaces\poi\src\ooxml\java\org\apache\poi\openxml4j\util\ZipArchiveThresholdInputStream.java:30: error: cannot find symbol

So we have to wait until commons-compress is released before committing.
Comment 9 Stefan Bodewig 2018-04-25 15:19:02 UTC
This is correct, there are a few issues scheduled for 1.17 that need to be resolved: https://issues.apache.org/jira/browse/COMPRESS-449?jql=project%20%3D%20COMPRESS%20AND%20fixVersion%20%3D%201.17 - my guess is it will take a few weeks until we'll have the release. In particular since we are still hashing out the API for one of them.
Comment 10 Dominik Stadler 2018-04-30 08:51:30 UTC
With the latest code, and probably a newer Java 10 build, the test failure looks different:

     [java] 10) bug55791b(org.apache.poi.xslf.TestXSLFBugs)
     [java] java.lang.RuntimeException: java.io.EOFException: Unexpected end of ZLIB input stream
     [java] 	at org.apache.poi.xslf.XSLFTestDataSamples.openSampleDocument(XSLFTestDataSamples.java:38)
     [java] 	at org.apache.poi.xslf.TestXSLFBugs.bug55791b(TestXSLFBugs.java:498)
     [java] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     [java] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
     [java] 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     [java] 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
     [java] 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
     [java] 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
     [java] 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
     [java] 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
     [java] 	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
     [java] 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
     [java] 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
     [java] 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
     [java] 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
     [java] 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
     [java] 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
     [java] 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
     [java] 	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
     [java] 	at org.junit.runners.Suite.runChild(Suite.java:128)
     [java] 	at org.junit.runners.Suite.runChild(Suite.java:27)
     [java] 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
     [java] 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
     [java] 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
     [java] 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
     [java] 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
     [java] 	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
     [java] 	at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
     [java] 	at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
     [java] 	at org.junit.runner.JUnitCore.run(JUnitCore.java:105)
     [java] 	at org.junit.runner.JUnitCore.run(JUnitCore.java:94)
     [java] 	at org.apache.poi.util.OOXMLLite.build(OOXMLLite.java:186)
     [java] 	at org.apache.poi.util.OOXMLLite.main(OOXMLLite.java:104)
     [java] Caused by: java.io.EOFException: Unexpected end of ZLIB input stream
     [java] 	at java.base/java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:245)
     [java] 	at java.base/java.util.zip.InflaterInputStream.read(InflaterInputStream.java:159)
     [java] 	at java.base/java.util.zip.ZipInputStream.read(ZipInputStream.java:195)
     [java] 	at org.apache.poi.openxml4j.util.ZipArchiveThresholdInputStream.read(ZipArchiveThresholdInputStream.java:116)
     [java] 	at org.apache.poi.util.IOUtils.toByteArray(IOUtils.java:157)
     [java] 	at org.apache.poi.util.IOUtils.toByteArray(IOUtils.java:120)
     [java] 	at org.apache.poi.util.IOUtils.toByteArray(IOUtils.java:107)
     [java] 	at org.apache.poi.openxml4j.util.ZipArchiveFakeEntry.<init>(ZipArchiveFakeEntry.java:47)
     [java] 	at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.<init>(ZipInputStreamZipEntrySource.java:51)
     [java] 	at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:106)
     [java] 	at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:304)
     [java] 	at org.apache.poi.xslf.XSLFTestDataSamples.openSampleDocument(XSLFTestDataSamples.java:36)
     [java] 	... 32 more
Comment 11 Andreas Beeker 2018-04-30 08:55:24 UTC
I had to modify the EOF handling for commons compress a bit and this was already in the committed code which prepares the commons compress integration. So that's probably the reason for the changed stacktrace.
Comment 12 Andreas Beeker 2018-06-03 22:10:36 UTC
Applied with r1832789
waiting for all Jenkins jobs (except api-check) to be successful
Comment 13 Dominik Stadler 2018-06-28 13:48:07 UTC
All relevant Jenkins-jobs run stable now, so I think this can be closed. 

Thanks for the work, Andi!
Comment 14 Javed 2018-09-11 18:37:03 UTC
Hello, first of all I want to thank you all for your work.

I still have a problem: the new version POI 4 is already released, but I am not able to run my Java code on Java 10. I am getting a huge error, starting with:

Exception in thread "main" java.lang.NoClassDefFoundError: org/w3c/dom/ls/DocumentLS
	at java.base/java.lang.ClassLoader.defineClass1(Native Method)

I also tried to get along with the patch, but I still got an error -- So if you really managed to run the poi on java 10, could someone of you please, upload the jar files for me.

Thank you a lot.
Comment 15 Greg Woolsey 2018-09-11 19:26:51 UTC
(In reply to Javed from comment #14)
> Hello, first of all I want to thank you all for your work.
> 
> I still have a problem: the new version POI 4 is already released, but I
> am not able to run my Java code on Java 10. I am getting a huge error,
> starting with:
> 
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/w3c/dom/ls/DocumentLS
> 	at java.base/java.lang.ClassLoader.defineClass1(Native Method)
> 
> I also tried to get along with the patch, but I still got an error -- So if
> you really managed to run the poi on java 10, could someone of you please,
> upload the jar files for me.
> 
> Thank you a lot.

This is unrelated to this issue.  That class is only found in older Xerces and GWT-Dev jars [1], not any of the POI 4.0.0 dependencies or Java 8+. For example, it was a core part of the JVM 15 years ago, per this bug reported as fixed in Java 1.2 [2].

Check your classpath.

[1] https://www.findjar.com/index.x;jsessionid=162736CF6358BCC23C85D4EBFD56D7D4?query=org.w3c.dom.ls.documentls
[2] https://bugs.openjdk.java.net/browse/JDK-4827955
Comment 16 Greg Woolsey 2019-03-19 15:09:02 UTC
*** Bug 63270 has been marked as a duplicate of this bug. ***