Bug 51444

Summary: Writing XSSF to stream produces unreadable content when XSSF is read from stream
Product: POI
Reporter: milan <milan.chudik>
Component: XSSF
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: normal    
Priority: P2    
Version: 3.7-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Attachments: Junit test file, workbook.xls,workbook.xlsx

Description milan 2011-06-28 09:51:50 UTC
Created attachment 27221
Junit test file, workbook.xls,workbook.xlsx

hi,

I just ran into strange behaviour that is similar to bug 51158, but not quite the same.

When I create a workbook "by hand" with the new operator, I can write the XSSFWorkbook to a file (OutputStream) and that works fine (I don't need to write it twice, though that yields an error too).
But when I read a workbook from a stream, say a file, and write it to another file, effectively creating a copy, the resulting file is unreadable.

My use case is that I want to report only the one (current) sheet I'm working on. I create an XSSFWorkbook from an InputStream, remove the unnecessary sheets, write the workbook to a new OutputStream via the write method, and pass that stream on for further processing. But this stream is unreadable; if I try to read from it, I get this exception:

java.lang.NullPointerException
        at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:256)
        at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:186)
        at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:182)
        at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:63)

If I write it to a file, the file is unreadable.
I debugged it and found that writing fails with "Duplicate entry: docProps/core.xml", I think in DefaultMarshaller (the exception is caught in org.apache.poi.openxml4j.opc.ZipPackage.saveImpl).
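
For readers unfamiliar with that error: the JDK's ZipOutputStream refuses to add two entries with the same name, which is presumably what happens when the package writer emits docProps/core.xml twice. The sketch below reproduces only that JDK behaviour using the standard library. It does not use POI, and the class and method names (DuplicateEntryDemo, writeTwice) are hypothetical, chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class: shows the JDK behaviour only, not POI's save logic.
final class DuplicateEntryDemo {

    /**
     * Writes one entry, then tries to start a second entry with the same name.
     * Returns the ZipException message, or null if no exception was thrown.
     */
    static String writeTwice(String entryName) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(new ByteArrayOutputStream())) {
            byte[] body = "<coreProperties/>".getBytes(StandardCharsets.UTF_8);

            // First entry: accepted.
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(body);
            zip.closeEntry();

            try {
                // Second entry with the same name: rejected by the JDK.
                zip.putNextEntry(new ZipEntry(entryName));
                return null;
            } catch (ZipException e) {
                return e.getMessage();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeTwice("docProps/core.xml"));
    }
}
```

If a caller catches and swallows this exception partway through saving, the archive is left incomplete, which would be consistent with the unreadable output described above.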

I have attached a test file that demonstrates the issue, along with example Excel files.

I see this error in both 3.7 and 3.8-beta3.
Regards,

Milan
Comment 1 Yegor Kozlov 2011-06-28 13:36:36 UTC
The problem occurs only with attached workbook.xlsx. If I create this file with the testCreateXLSXFile() method and pass the result to testReadXLSXFileByeArray() then the test succeeds. 

How did you create workbook.xlsx? Did you re-save it in Excel after creation?

Yegor
Comment 2 milan 2011-06-28 14:14:38 UTC
Both files were created in LibreOffice (formerly OpenOffice).

Yes, I just saved it again as an OpenXML file.

Hmm, maybe there is a problem with the OpenOffice implementation?

Still, how can I make sure I won't receive such a file from a third party?
(We process XLS and XLSX files sent from external sources, so we have no control over these files, how they are populated, etc.)
And that file can be reopened in OpenOffice as a valid file, so it looks fine.

I wrote a workaround that copies the XSSFWorkbook sheet by sheet, cell by cell, so I can manage for now, but it would be nice to have the same solution for both XLS and XLSX files.
Comment 3 Yegor Kozlov 2011-06-28 15:22:44 UTC
(In reply to comment #2)

It looks like POI has a problem re-saving files from LibreOffice.

I suspect that the following differences in [Content_Types].xml cause it:

LibreOffice:
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Override PartName="/_rels/.rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Override PartName="/docProps/core.xml" ContentType="application/vnd.openxmlformats-package.core-properties+xml"/>


POI:
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>

It might be that LibreOffice is correct and POI does not handle a specific OOXML case. 
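
To make the difference between the two styles concrete: under the Open Packaging Conventions, an Override element assigns a content type to one specific part name and takes precedence over a Default element, which assigns a type by file extension. The following is a minimal sketch of that lookup rule using only the JDK's XML parser. It is not POI's implementation; the class name ContentTypeLookup and the sample XML in main are assumptions made for illustration (the sample mixes both styles and is not one of the attached files).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: resolves a part's content type from [Content_Types].xml.
final class ContentTypeLookup {
    private final Map<String, String> defaults = new HashMap<>();   // extension -> type
    private final Map<String, String> overrides = new HashMap<>();  // part name -> type

    ContentTypeLookup(String contentTypesXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(contentTypesXml.getBytes(StandardCharsets.UTF_8)));
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (!(children.item(i) instanceof Element)) {
                continue; // skip whitespace and comments
            }
            Element e = (Element) children.item(i);
            String type = e.getAttribute("ContentType");
            if ("Default".equals(e.getLocalName())) {
                defaults.put(e.getAttribute("Extension").toLowerCase(Locale.ROOT), type);
            } else if ("Override".equals(e.getLocalName())) {
                overrides.put(e.getAttribute("PartName").toLowerCase(Locale.ROOT), type);
            }
        }
    }

    /** Override for the exact part name wins; otherwise fall back to the extension default. */
    String resolve(String partName) {
        String key = partName.toLowerCase(Locale.ROOT);
        String type = overrides.get(key);
        if (type != null) {
            return type;
        }
        int dot = key.lastIndexOf('.');
        return dot < 0 ? null : defaults.get(key.substring(dot + 1));
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<Types xmlns=\"http://schemas.openxmlformats.org/package/2006/content-types\">"
          + "<Default Extension=\"xml\" ContentType=\"application/xml\"/>"
          + "<Override PartName=\"/docProps/core.xml\" ContentType="
          + "\"application/vnd.openxmlformats-package.core-properties+xml\"/>"
          + "</Types>";
        ContentTypeLookup lookup = new ContentTypeLookup(xml);
        System.out.println(lookup.resolve("/docProps/core.xml")); // Override applies
        System.out.println(lookup.resolve("/xl/workbook.xml"));   // Default by extension
    }
}
```

A package that declares everything through Override elements, as the LibreOffice file does for /_rels/.rels and /docProps/core.xml, is resolved by the first branch alone, so a reader or writer that relies on the Default entries being present may treat those parts differently.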

I'm going to install LO and research it.

Yegor
Comment 4 milan 2011-06-28 16:19:07 UTC
Thanks, much appreciated.
Comment 5 Yegor Kozlov 2011-06-30 15:40:55 UTC
Fixed in r1141576; JUnit test added.

Yegor
Comment 6 milan 2011-06-30 16:25:26 UTC
So when do you think this fix will be available in an official release (other than a checkout from the repository)?
Maybe in 3.8-beta4?