Bug 60977

Summary: Adding Properties create invalid .xlsx file on second write
Product: POI Reporter: therock
Component: XSSFAssignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: normal CC: Uwe.Herrmann
Priority: P2    
Version: 3.15-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   
Bug Depends on:    
Bug Blocks: 51971    
Attachments: Example created file
Reproducing testcase, open the resulting files

Description therock 2017-04-12 19:13:40 UTC
Created attachment 34909 [details]
Example created file

When adding a custom property, and invalid .xlsx is created.

using the java code:

try (final XSSFWorkbookworkbook = new XSSFWorkbook()) {
   final POIXMLProperties properties = workbook.getProperties();
   final POIXMLProperties.CustomProperties customProperties = properties.getCustomProperties();
   customProperties.addProperty("Project", project.getName());

   workbook.write(outputStream);

   try (final java.io.FileOutputStream fs = new java.io.FileOutputStream("C:\\temp\\temp.xlsx")) {
      workbook.write(fs);
   }

   
 When you try to open temp.xlsx in Excel, you get a "file is corrupted" warning.
 
 Upon disassembling the .xlsx file, I found the following incorrect custom.xml file in docProps:
 
 <?xml version="1.0" encoding="UTF-8"?>
<Properties xmlns="http://schemas.openxmlformats.org/officeDocument/2006/custom-properties" xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes"><property pid="2" fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" name="Project"><vt:lpwstr>test</vt:lpwstr></property></Properties><?xml version="1.0" encoding="UTF-8"?>
<Properties xmlns="http://schemas.openxmlformats.org/officeDocument/2006/custom-properties" xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes"><property pid="2" fmtid="{D5CDD505-2E9C-101B-9397-08002B2CF9AE}" name="Project"><vt:lpwstr>test</vt:lpwstr></property></Properties>

Note the file seems to be written twice.
Comment 1 Javen O'Neal 2017-04-13 00:06:16 UTC
> workbook.write(outputStream);
> workbook.write(fs);

Might be because the workbook is being written out twice (which shouldn't be an issue unless the output streams map to the same file).

Could you try this in the latest trunk build?
Comment 2 therock 2017-04-25 19:00:14 UTC
Ok, I finally had some time to revisit this.  I was calling workbook.write twice, once to write to a memory stream, and again (only in debug) to save to a temp file for validation.

I removed the 2nd call and instead just streamed the byte buffer to the temp file, and it now works properly.

Still seems to me though this is a bug.  Shouldn't you be able to call .write multiple times without corrupting the spreadsheet?
Comment 3 Javen O'Neal 2017-04-26 01:25:47 UTC
Were the 2 streams that you wrote to connected in some way?

Can you post a short example of what you were doing, specifically the opening of the workbook, the creation of each OutputStream and the closure of the workbook and both output streams?

If you read the workbook in from a file, was that the same file as one of the output streams.

I'm trying to get an idea where we should start for unit testing this. I agree that concatenating the contents of an xml file twice should not happen and result in an error, so long as it's within POI's ability to do so.
Comment 4 Dominik Stadler 2017-09-20 07:19:18 UTC
I narrowed this down to MemoryPackagePart accumulating the data across multiple writes via ZipPackage. However I did not see an easy fix as this is also used on the read-side and in many other places, so a simple clean in there did not work...
Comment 5 Dominik Stadler 2017-09-21 15:43:28 UTC
Created attachment 35357 [details]
Reproducing testcase, open the resulting files
Comment 6 Dominik Stadler 2018-04-19 08:48:34 UTC
*** Bug 61734 has been marked as a duplicate of this bug. ***
Comment 7 Greg Woolsey 2019-01-11 18:23:18 UTC
Fixed in trunk revision r1851084.  I hit this yesterday, and it irritated me enough to track it down.

Turns out POIXMLProperties has multiple write cases depending on internal state.  In one case only, the one hit by this test/example, it needed to clear the package part before writing, otherwise it would append the full current content instead.