Bug 59738

Summary: Excel Files generated using XSSFWorkbook can't be opened using Ms-Excel or OpenOffice
Product: POI
Reporter: Adrodoc55 <adrodoc55>
Component: XSSF
Assignee: POI Developers List <dev>
Status: RESOLVED WONTFIX    
Severity: normal    
Priority: P2    
Version: 3.14-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: Error in Excel
Difference that causes Excel to report the file as "broken"

Description Adrodoc55 2016-06-21 11:42:36 UTC
Created attachment 33969 [details]
Error in Excel

Excel Files generated using XSSFWorkbook can't be opened using Ms-Excel or OpenOffice. POI can read those files.

I used the official BusinessPlan example from the POI website: https://poi.apache.org/spreadsheet/examples.html#business-plan

The xls file generated by this example works fine in all versions, but the xlsx file can't be opened when using POI 3.11 or higher (including 3.14). POI 3.10.1 works fine.

Even an empty workbook (with just one sheet) cannot be opened when using XSSF, but works fine with HSSF:

String format = "xlsx";
File outputFile = new File("C:/temp/output." + format);
Workbook wb = new XSSFWorkbook();
wb.createSheet();
try (FileOutputStream fileOutputStream = new FileOutputStream(outputFile);) {
  wb.write(fileOutputStream);
}
Comment 1 Dominik Stadler 2016-06-21 13:04:32 UTC
Does it work if you specify a sheet-name in the call to createSheet()?
Comment 2 Adrodoc55 2016-06-21 15:16:48 UTC
(In reply to Dominik Stadler from comment #1)
> Does it work if you specify a sheet-name in the call to createSheet()?

No, it does not.
I have tried many different variants and several of the official examples; neither Ms-Excel nor OpenOffice can open any xlsx file created by POI 3.11 or above.
Comment 3 Javen O'Neal 2016-06-21 16:37:27 UTC
Are you using the same version of poi, poi-ooxml, and poi-ooxml-schemas?

Does wb.write throw an exception?
Comment 4 Adrodoc55 2016-06-21 19:12:44 UTC
(In reply to Javen O'Neal from comment #3)
> Are you using the same version of poi, poi-ooxml, and poi-ooxml-schemas?
> 
> Does wb.write throw an exception?

I am using v3.14 of poi-ooxml and v1.3 of ooxml-schemas. I use Gradle to resolve transitive dependencies. I also tried all major versions between 3.9 and 3.14, but only 3.9 and 3.10.1 work.
I don't get any exceptions in Java; only Ms-Excel and OpenOffice tell me that the generated xlsx files are corrupted. I don't think it is an issue with Ms-Excel or OpenOffice, because they can open other xlsx files.
Comment 5 Andreas Beeker 2016-06-22 09:50:58 UTC
First of all, it sounded too fantastic that no version since 3.11 could generate valid .xlsx files ... and actually, I've just tried the businessplan example with LO/Excel/ExcelViewer, just to be sure ...

To investigate this, we could either look at what's wrong with the Gradle build, or maybe we can find the error if you attach the businessplan .xlsx files produced by POI 3.10 and 3.14.

My guess is that there are either duplicate jars in the classpath / dependencies or the wrong ooxml-schemas (ooxml-schemas-1.3.jar for POI 3.14 or later, ooxml-schemas-1.1.jar for POI 3.7 up to POI 3.13).

Can you package your example application and list the dependencies?

(maybe we can spot something like a wrong version of xerces/saxon/xmlbeans/xmlbeans-xpath ..)
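
One way to spot a stray XML library is to print which JAXP implementations the JVM actually picks up and where they were loaded from. The following is only a minimal sketch using standard JDK APIs; the class name `JaxpClasspathReport` and the helper `describe` are made up for illustration and are not part of POI.

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

// Hypothetical diagnostic helper, not part of POI.
final class JaxpClasspathReport {

    /** Returns the implementation class of an object plus the location it was loaded from. */
    static String describe(String role, Object impl) {
        Class<?> type = impl.getClass();
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader may have no code source (typical on Java 8);
        // newer JDKs usually report a jrt:/ URL instead.
        String location = (source == null || source.getLocation() == null)
                ? "no code source (JDK built-in)"
                : source.getLocation().toString();
        return role + " -> " + type.getName() + " [" + location + "]";
    }

    public static void main(String[] args) {
        // Each newInstance() call goes through the normal JAXP lookup,
        // so a jar on the classpath can replace the JDK default here.
        System.out.println(describe("DocumentBuilderFactory", DocumentBuilderFactory.newInstance()));
        System.out.println(describe("SAXParserFactory", SAXParserFactory.newInstance()));
        System.out.println(describe("TransformerFactory", TransformerFactory.newInstance()));
    }
}
```

If one of the reported locations points at a jar that was not expected (for example an old xerces or xalan jar), that jar is a candidate for removal or exclusion in the build.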
Comment 6 Adrodoc55 2016-06-22 18:55:27 UTC
(In reply to Andreas Beeker from comment #5)
> First of all, this sounded too fantastic, that all versions since 3.11 can't
> generate usual .xlsx files - ... and actually, I've just tried the
> businessplan with LO/Excel/ExcelViewer, just to be sure ...
> 
> To investigate this we could either look what's wrong with the gradle build
> or maybe we can find the error, if you attach your resulting businessplan
> .xlsx of POI 3.10 and 3.14.
> 
> My guess is, that there are either duplicates jars in the classpath /
> dependencies or the wrong ooxml-schemas (ooxml-schemas-1.3.jar for POI 3.14
> or later, ooxml-schemas-1.1.jar for POI 3.7 up to POI 3.13).
> 
> Can you package your example application and list the dependencies?
> 
> (maybe we can spot something like a wrong version of
> xerces/saxon/xmlbeans/xmlbeans-xpath ..)

I can send you a full example project including gradle wrapper tomorrow. Did you not manage to reproduce my error?
Comment 7 Adrodoc55 2016-06-23 09:43:43 UTC
(In reply to Andreas Beeker from comment #5)
> My guess is, that there are either duplicates jars in the classpath /
> dependencies or the wrong ooxml-schemas (ooxml-schemas-1.3.jar for POI 3.14
> or later, ooxml-schemas-1.1.jar for POI 3.7 up to POI 3.13).
> 
> (maybe we can spot something like a wrong version of
> xerces/saxon/xmlbeans/xmlbeans-xpath ..)

Thanks, you were right :)
I was able to track the problem down to my xalan library. I had 'xalan:xalan:2.4.0' in my classpath, because that is used in our application server. Luckily I don't actually need that dependency and can just remove it.
I still created a git repository containing everything needed to create corrupted xlsx files, if you want to investigate. All you have to do is open a command line and run 'gradlew run' or './gradlew run' to create a new businessplan.xlsx.
A corrupted one is already checked into the repo as well.
Please tell me when you no longer need the repo so I can delete it.
Comment 8 Dominik Stadler 2016-06-23 12:09:13 UTC
Where can the project be accessed? You can also zip it up and attach it here if you want.
Comment 9 Adrodoc55 2016-06-23 15:47:01 UTC
(In reply to Dominik Stadler from comment #8)
> Where can the project be accessed? You can also zip it up and attach it here
> if you want.

Oops, I forgot to post the link: https://github.com/Adrodoc55/poi-xalan-bug
Comment 10 Dominik Stadler 2016-07-22 13:19:49 UTC
Created attachment 34061 [details]
Difference that causes Excel to report the file as "broken"

It seems that with Xalan you also get a different XML parser, and this breaks the namespace handling; see the attached image.
Comment 11 Dominik Stadler 2016-07-24 20:30:14 UTC
I did some more investigation and it looks like in your case an old version of the XML parser was pulled in which does not support namespaces.

See DocumentHelper.createDocument(), where we use classes from javax.xml.parsers to create a DOM XML Document. 

I could not find an easy way to check or force the use of namespaces here, so for now I don't think we can do much inside POI here. 

Therefore I am closing this as WONTFIX for now, please reopen if you have an idea how we can at least fail with a better error message in such cases.
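
As a rough idea for such a check, an application could probe the parser that JAXP hands out before writing a workbook. This is only a sketch under assumptions: it uses standard `javax.xml.parsers` calls, the class `NamespaceSupportCheck`, its method and the placeholder namespace URI are hypothetical, it does not touch POI's DocumentHelper, and it has not been verified against the xalan 2.4.0 classpath from this report.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Hypothetical startup check, not part of POI.
final class NamespaceSupportCheck {

    // Placeholder namespace URI used only for this probe.
    private static final String NS = "http://example.invalid/ns";

    /** Returns true if the default JAXP parser reports namespaces on a parsed element. */
    static boolean parserKeepsNamespaces() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();

            String xml = "<p:root xmlns:p=\"" + NS + "\"><p:child/></p:root>";
            Element root = builder
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();

            // A namespace-aware parser must expose the URI and the local name.
            return builder.isNamespaceAware()
                    && NS.equals(root.getNamespaceURI())
                    && "root".equals(root.getLocalName());
        } catch (Exception | LinkageError e) {
            // Very old XML jars may fail to parse or lack the DOM Level 2 methods entirely.
            return false;
        }
    }

    public static void main(String[] args) {
        if (!parserKeepsNamespaces()) {
            throw new IllegalStateException(
                    "XML parser on the classpath does not handle namespaces; check for old XML jars");
        }
        System.out.println("XML parser handles namespaces");
    }
}
```

Failing early with a message like this would at least point at the classpath instead of leaving the error to Excel.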
Comment 12 Adrodoc55 2016-07-24 23:39:08 UTC
I am fine with this being closed as WONTFIX. Thanks to your help I was able to fix the classpath. Even if there were a way to check the namespace handling of the XML parser, it is not really the job of the POI implementation to check whether the classpath is correct. I think something like NOTABUG would be even more appropriate than WONTFIX.