Bug 49472 - Poi will corrupt xls file when there is a 'Chart sheet' in the Excel file, Excel 2010
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF
Version: 3.7-dev
Hardware: PC Windows Vista
Importance: P2 critical
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2010-06-19 16:50 UTC by Tao Jiang
Modified: 2014-12-11 19:47 UTC

original file and corrupted file (192.34 KB, application/x-zip-compressed)
2010-06-19 16:50 UTC, Tao Jiang

Description Tao Jiang 2010-06-19 16:50:38 UTC
Created attachment 25620
original file and corrupted file

If the xls file has a 'Chart sheet', the POI method Workbook.write(OutputStream) corrupts the file. Excel 2007 and earlier do not report the corruption, but Excel 2010 detects it and shows an error.

What we have done here is straightforward. We use the HSSFWorkbook constructor, HSSFWorkbook(FileInputStream), to read an xls file with a Chart sheet from disk, and write it straight back using Workbook.write(OutputStream). The file is corrupted afterwards.

I attached a zip file containing: 1. the original xls file, 2. the corrupted file, and 3. a screenshot from Excel 2010 taken when opening the corrupted file.
Comment 1 Tao Jiang 2010-08-10 16:18:49 UTC
I believe that this is a very critical bug.
Comment 2 Nick Burch 2010-08-11 06:24:10 UTC
If this is critical for you, then please do some investigating on it!

What's the simplest file that will show the corruption? What's the most complex one that won't? For the simplest file that Excel complains about, how does the file differ between the Excel version and the POI version? What does BiffViewer show as having been changed between the two? Are there records that are missing in one? Are records in a different order? Have flags on key records been changed? If you open the file in Excel 2007 and save it, can Excel 2010 read it? What changed? etc.
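
To illustrate the record-order questions above, here is a minimal sketch in plain Java that lists the record ids of a raw BIFF stream and reports the first position where two streams diverge. It assumes the Workbook stream bytes have already been extracted from the OLE2 container of each xls file (the extraction step is not shown), and it relies only on the BIFF record layout of a 2-byte little-endian record id followed by a 2-byte little-endian payload length. The class and method names are made up for this example and are not part of POI; BiffViewer remains the better tool for inspecting record contents.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for this bug report; not a POI class.
class BiffRecordOrder {

    /** Returns the record ids in stream order, skipping over each payload. */
    static List<Integer> recordIds(byte[] stream) {
        List<Integer> ids = new ArrayList<>();
        int pos = 0;
        // Each record starts with a 4-byte header: id (2 bytes) + length (2 bytes).
        while (pos + 4 <= stream.length) {
            int sid = (stream[pos] & 0xFF) | ((stream[pos + 1] & 0xFF) << 8);
            int len = (stream[pos + 2] & 0xFF) | ((stream[pos + 3] & 0xFF) << 8);
            ids.add(sid);
            pos += 4 + len;
        }
        return ids;
    }

    /** Index of the first record where the two sequences differ, or -1 if identical. */
    static int firstDifference(List<Integer> a, List<Integer> b) {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            if (!a.get(i).equals(b.get(i))) {
                return i;
            }
        }
        return a.size() == b.size() ? -1 : n;
    }

    public static void main(String[] args) {
        // Synthetic streams: BOF (0x0809), a sample record (0x0085), EOF (0x000A),
        // all with empty payloads. The second stream lacks the middle record.
        byte[] original = {0x09, 0x08, 0, 0, (byte) 0x85, 0x00, 0, 0, 0x0A, 0x00, 0, 0};
        byte[] rewritten = {0x09, 0x08, 0, 0, 0x0A, 0x00, 0, 0};
        int diff = firstDifference(recordIds(original), recordIds(rewritten));
        System.out.println("First differing record index: " + diff);
    }
}
```

Running this on the streams from the Excel-saved file and the POI-saved file would only show where the record sequences start to diverge; a change in flags inside a record with the same id would need a payload comparison as well.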
Comment 3 Mark Ingram 2010-08-19 17:42:02 UTC
I have just hit this issue with an application in production. Currently the application uses POI (v3.6) HSSF to output Excel 2003 workbooks. With no coding changes, if I open the application output with Excel 2010, I see the red error indicator.

If I save the file using 2010 and open again then there is no error and all data appears to be there.

The same (2003) file can be opened using 2010 without error outside of the application; it's only when POI is involved that the error presents.

The workbook I'm using doesn't have chart sheets; rather, it has a bunch of sophisticated macros, named ranges, a hidden sheet of reference data and a displayed sheet of data that a user can use to enter data and select from dropdowns.

Let me know if there's anything I can do to help. (I've never debugged through POI before.)

Comment 4 Tao Jiang 2010-08-20 16:31:12 UTC

What you observed is correct. The error only comes up when POI is involved, and in my case only if there is a Chart sheet in the xls. I have never tried with sophisticated macros, but I use a lot of complicated formulas, named ranges etc. all the time, and have never had a problem with them.

I can also open the file in Excel 2010 despite the error indicator; nothing prevents me from opening it. However, my application uses Office OLE automation to open the file, and that fails for the corrupted file if Excel 2010 is installed on the computer.