Summary: | XSSFWorkbook constructor doesn't close ZipFile if an exception occurs | ||
---|---|---|---|
Product: | POI | Reporter: | radistao <radistao> |
Component: | XSSF | Assignee: | POI Developers List <dev> |
Status: | RESOLVED FIXED | ||
Severity: | normal | ||
Priority: | P2 | ||
Version: | 4.1.1-FINAL | ||
Target Milestone: | --- | ||
Hardware: | PC | ||
OS: | All | ||
Attachments: | Corrupted XLSX file with malformed _rels/.rels XML |
Description
radistao
2020-01-02 18:16:21 UTC
I added a few changes: https://github.com/apache/poi/commit/22266e209a8f13a1ee88bafecb80e005303b165d Hopefully this build will succeed and new poi jars should be available to test with. Please suggest any other changes you think are useful. Reviewing the POI sources i found some places which could be considered in the fix: 1) `PackageHelper#open(InputStream)` (and maybe also `clone()`) is used directly in `super` and `this` constructor calls for XmlVisioDocument, XMLSlideShow, XWPFDocument, XSSFWorkbook 2) same for: - POIXMLDocument#openPackage() - XSSFWorkbook#newPackage() - OPCPackage: - open(XXX) - openOrCreate(XXX) - create(XXXX) It seems there are a lot of places where `OPCPackage` (or other Closeable local instances) are wrapped/created and passed as arguments to init methods and constructor, but not properly handled in case of exceptions. I fixed a few additional cases where file-handles could leak via r1872287, mostly in test-code itself, but also one case in EMF handling. We use file-leak-detector to find any leak that is triggered when running unit-tests. So a good way to help poi-devs would be to provide patches for unit-tests which actually trigger the remaining cases. This way we ensure these are caught automatically in the future. As for the code-locations that you list as still causing file-handle leaks: I could not find any actual possible way to cause a leak in a quick check for any of these, can you point out in more detail how you think those can leak file-handles? Unfortunately i'll be able to continue only next Tuesday. I'll post the update @dominik.stadler and @fanningpj I tested current sources (https://github.com/apache/poi/commit/4aa8334e3b695cc34a2bfb035b9cc35dc9271e73) on my project - can't reproduce "Cleaning up unclosed ZipFile for archive" any more! Thank you. > So a good way to help poi-devs would be to provide patches for unit-tests which actually trigger the remaining cases. I'd like to do this, but the solution and the test cases are quite huge, so hard to do this without a good introduction to the sources from actual developers. > can you point out in more detail how you think those can leak file-handles? That weren't real cases, just my assumption based on the sources, but as i mentioned above - i don't see the entire picture. BUT! When i run the gradle tests locally for ooxml subroject (`gradle --info --console=verbose :ooxml:clean :ooxml:test`) I see 2 another stderr traces about unclosed ZipFile: 1) >org.apache.poi.xddf.usermodel.chart.TestXDDFChartRemoveSeries > testRemoveSeries0 STANDARD_OUT > testRemoveSeries0: test file written to /home/andrii/workspace/poi-report/poi/build/ooxml/build/custom-reports-test/testRemoveSeries0.xlsx >org.apache.poi.openxml4j.opc.internal.TestContentTypeManager > testContentTypeRemovalPackage SKIPPED >org.apache.poi.openxml4j.opc.TestPackage > removePartRecursive SKIPPED >org.apache.poi.util.TestTempFileThreaded > runTest STANDARD_ERROR > Cleaning up unclosed ZipFile for archive /home/andrii/workspace/poi-report/poi/test-data/openxml4j/TestPackageCommon.docx >org.apache.poi.util.TestTempFileThreaded > runTest STANDARD_OUT > thread: 7, iter: 0: /tmp/poifiles/TestTempFileThreaded/26/TestTempFile-7-0-5000898654042792707.xlsx > thread: 0, iter: 0: /tmp/poifiles/TestTempFileThreaded/19/TestTempFile-0-0-9041124478588812993.xlsx > thread: 5, iter: 0: /tmp/poifiles/TestTempFileThreaded/24/TestTempFile-5-0-4091795158149176089.xlsx 2) >org.apache.poi.xssf.streaming.TestSXSSFCell > setCellFormula_isExceptionSafe_onBlankCell SKIPPED >org.apache.poi.xssf.streaming.TestSXSSFCell > setBlank_throwsISE_ifCellIsPartOfAnArrayFormulaGroupContainingOtherCells SKIPPED >org.apache.poi.xssf.usermodel.TestXSSFBugs > test54764WithSAXHelper STANDARD_ERROR > [Fatal Error] :1:1: JAXP00010001: The parser has encountered more than "1" entity expansions in this document; this is the limit imposed by the JDK. >org.apache.poi.xssf.usermodel.TestXSSFBugs > bug48877 STANDARD_ERROR > Cleaning up unclosed ZipFile for archive /home/andrii/workspace/poi-report/poi/test-data/spreadsheet/54764.xlsx >org.apache.poi.xssf.usermodel.TestXSSFBugs > test55273 SKIPPED >org.apache.poi.xssf.usermodel.TestXSSFBugs > test57699 SKIPPED >org.apache.poi.xssf.usermodel.TestXSSFBugs > test58043 SKIPPED >org.apache.poi.xssf.usermodel.TestXSSFBugs > testSetRGBBackgroundColor SKIPPED >org.apache.poi.xssf.usermodel.TestXSSFBugs > testBug53798XLSXStream SKIPPED >org.apache.poi.xssf.usermodel.TestXSSFBugs > testBug56688_3 STANDARD_ERROR > Cleaning up unclosed ZipFile for archive /home/andrii/workspace/poi-report/poi/test-data/spreadsheet/54764.xlsx >org.apache.poi.xssf.usermodel.TestXSSFBugs > bug46670 SKIPPED >org.apache.poi.xssf.usermodel.TestXSSFBugs > bug59393_commentsCanHaveSameAnchor SKIPPED >org.apache.poi.xssf.usermodel.TestXSSFBugs > test57929 SKIPPED **Do you think these could be some undiscovered cases?** Another thing which confuses me a bit: Considering these changes in `XSSWorkbook#newPackage(XSSFWorkbookType)`: - https://github.com/apache/poi/compare/adb8424bc1a1c9a502d2cd07757615b711d32c50..4aa8334e3b695cc34a2bfb035b9cc35dc9271e73#diff-512a44831b35fadfdff4e2cd0bb33374 - https://github.com/apache/poi/blob/fb8f17190064f39f8f35700aa6ba1fbc2cfd88e1/src/ooxml/java/org/apache/poi/xssf/usermodel/XSSFWorkbook.java#L495-L495 now the implementation looks diverse from similar one in `XWPFDocument#newPackage()`: https://github.com/apache/poi/blob/fb8f17190064f39f8f35700aa6ba1fbc2cfd88e1/src/ooxml/java/org/apache/poi/xwpf/usermodel/XWPFDocument.java#L156-L156 I mean the `try-catch` block with following `IOUtils.closeQuietly(pkg);`. From the first view: - either it doesn't have sense in `XSSWorkbook` - or it also shall be used in `XWPFDocument` Do you understand what i mean? XSSFWorkbook and XWPFWorkbook were made similar via r1872453. I tried to reproduce the warnings during running unit-tests that you posted, unfortunately I could not reproduce those for now. Unfortunately the feature in commons-compress which we use here is not very well-suited for debugging such findings: it does not print any stacktrace from where the file was allocated and it depends on finalizer/garbage-collection which is unpredictable and thus makes it impossible to properly reproduce things with simple test-cases. We usually use file-leak-detector from https://github.com/kohsuke/file-leak-detector/ to check for unclosed resources. It is much better in showing details and being reproducible but commons-compress actually prevents file-leak-detector from detecting the leaks here. Therefore I have reported https://issues.apache.org/jira/browse/COMPRESS-502 to allow to disable this feature in commons-compress in the future so our own testing for file-leaks can be used fully again. However unless that happens I don't see more work possible here for now, so I am closing this for now. Please report any additional leaking in new issues so we can adjust code accordingly. |