Bug 58779

Summary: Closing an XSSFWorkbook or XSLFSlideShow that was opened with File instead of FileInputStream modifies the file
Product: POI Reporter: Javen O'Neal <onealj>
Component: OPCAssignee: POI Developers List <dev>
Status: NEW ---    
Severity: normal CC: PhiLho
Priority: P2    
Version: 3.14-dev   
Target Milestone: ---   
Hardware: PC   
OS: Linux   

Description Javen O'Neal 2015-12-29 10:23:16 UTC
As noted in TestWorkbookFactory.java [1], XSSFWorkbooks that were opened with WorkbookFactory and were not opened in read-only mode will modify the file upon calling wb.close(), even if no changes were made to the workbook after creating the XSSFWorkbook object.

To reproduce this problem, replace any `revert(wb)` with `wb.close()` in TestWorkbookFactory.java and run TestWorkbookFactory.java. After running this test, the SampleSS.xlsx file will have changed, as evidenced by "svn status test-data/spreadsheet/SampleSS.xlsx"

[1] http://svn.apache.org/viewvc/poi/trunk/src/ooxml/testcases/org/apache/poi/ss/TestWorkbookFactory.java?revision=1722078&view=markup
Comment 1 Javen O'Neal 2015-12-29 10:41:22 UTC
Unit test added in r1722091

When this bug is resolved, replace closeOrRevert(wb) with wb.close().
Comment 2 Javen O'Neal 2016-01-03 09:23:00 UTC
This also affects XSLF Slide Shows created with SlideShowFactory.create(File xlsx). It's likely WorkbookFactory.create(File xlsx) and WorkbookFactory.create(File xlsx, String password) are the only two offenders from WorkbookFactory, though possibly also through the native factory, WorkbookFactory.create(OPCPackage) and the password-protected variant.

Updated unit test in r1722695 and r1722707.
Comment 3 Javen O'Neal 2016-03-27 13:05:10 UTC
XSSFWorkbook(String|File) and SXSSFWorkbook(String|File) constructors also save any changes made to the workbook to disk when calling wb.close(). See r1736742 for commented-out tests.

The documentation for OPCPackage.close explains the behavior we're seeing: "Close the open, writable package and save its content." [1]
Unfortunately, the same information isn't described in XSSFWorkbook.java, either in the various constructors [2] or in the POIXMLDocument.close [3] method that XSSFWorkbook inherits. This side-effect is too significant to let it go undocumented in XSSFWorkbook, SXSSFWorkbook, the Workbook interface, and WorkbookFactory (along with XSLF, perhaps other POI XML classes for which we don't have unit tests checking for files modified after closing). I'm not the first one who's scratched their head why files were sometimes modified from Workbook.close() operations [4].

My preference would be to change the behavior of close so that it doesn't implicitly save any changes to the workbook--an explicit OPCPackage.save/Workbook.write would be needed to save any changes. In my POI-powered applications, a user may abort midway through updating an XLSX file, and the original file should be left unmodified. To get this behavior, I'm buffering my file through a FileInputStream, but at a memory overhead. The current behavior could be viewed as a bug, because HSSFWorkbooks initialized with a non-buffered file aren't modified when the workbook is closed.
If changing OPCPackage.close's behavior to not save changes to disk would detrimentally break backwards compatibility, we would need to update our javadocs, quick guide, and examples to be very clear on this behavior.

[1] https://svn.apache.org/viewvc/poi/trunk/src/ooxml/java/org/apache/poi/openxml4j/opc/OPCPackage.java?revision=1736721&view=markup#l397
[2] http://svn.apache.org/viewvc/poi/trunk/src/ooxml/java/org/apache/poi/xssf/usermodel/XSSFWorkbook.java?revision=1734863&view=markup#l288 "Once you have finished working with the Workbook, you should close the package by calling {@link #close()}, to avoid leaving file handles open."
[3] http://svn.apache.org/viewvc/poi/trunk/src/ooxml/java/org/apache/poi/POIXMLDocument.java?revision=1734691&view=markup#l164
[4] centic http://svn.apache.org/viewvc/poi/trunk/src/ooxml/testcases/org/apache/poi/ss/TestWorkbookFactory.java?r1=1177409&r2=1648160
Comment 4 Javen O'Neal 2016-04-05 16:20:56 UTC
*** Bug 57161 has been marked as a duplicate of this bug. ***