Bug 27571 - POI corrupts Excel file beyond repair
Summary: POI corrupts Excel file beyond repair
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: POI Overall
Version: 3.0-dev
Hardware: PC Windows XP
Importance: P3 major with 4 votes
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-03-10 15:38 UTC by Kim Madsen
Modified: 2007-03-23 10:22 UTC

Attachments
Excel spreadsheet before loading into POI (174.50 KB, application/octet-stream)
2004-03-10 15:40 UTC, Kim Madsen
Excel spreadsheet after being written back by POI (176.50 KB, application/octet-stream)
2004-03-10 15:41 UTC, Kim Madsen
the file before writing with poi (16.50 KB, application/vnd.ms-excel)
2005-11-24 11:14 UTC, juergen
and after writing with poi (17.50 KB, application/vnd.ms-excel)
2005-11-24 11:14 UTC, juergen
Java code to reproduce problem (1.08 KB, application/octet-stream)
2006-08-03 13:38 UTC, oyvind.harboe
Input to TestPOI.java testcase (13.00 KB, application/octet-stream)
2006-08-03 13:39 UTC, oyvind.harboe
Input Excel Sheet (294.50 KB, application/vnd.ms-excel)
2007-03-23 09:53 UTC, jbazzo
Output Excel Sheet (336.50 KB, application/vnd.ms-excel)
2007-03-23 09:56 UTC, jbazzo
Modified file with worksheet removed (265.50 KB, application/vnd.ms-excel)
2007-03-23 09:57 UTC, jbazzo

Description Kim Madsen 2004-03-10 15:38:31 UTC
A spreadsheet file is read into POI and written back with no changes. However,
once reopened in Excel, it gives the error below:
 "Errors were detected in 'POIReadWriteTest.xls,' but Microsoft Excel
 was able to open the file by making the repairs listed below. Save
 the file to make these repairs permanent.
 
 Damage to the file was so extensive that repairs were not possible.
 Excel attempted to recover your formulas and values, but some data may 
 have lost or corrupted."

The bug seems similar to 19054 and 18155, but those bugs are marked as fixed,
and this happened using both poi-1.5.1-final and poi-2.5-final.
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=19054
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=18155

The test for this is really simple. I have included the test code below. I shall
be happy to supply the spreadsheet in the before and after version if required.

package com.inceptor.thirdparty;

import junit.framework.Test;
import junit.framework.TestSuite;
import junit.framework.TestCase;

import java.io.FileInputStream;
import java.io.FileOutputStream;

import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

/** JUnit test case for the thread support library.
 *  This demonstrates a bug in POI.
 *  A spreadsheet is read and written back without any changes.
 *  However, the written file contains errors once opened in Excel:
 *
 *  "Errors were detected in 'POIReadWriteTest.xls,' but Microsoft Excel
 *  was able to open the file by making the repairs listed below. Save
 *  the file to make these repairs permanent.
 *
 *  Damage to the file was so extensive that repairs were not possible.
 *  Excel attempted to recover your formulas and values, but some data may 
 *  have lost or corrupted."
 *
 *  Note that this does not happen for all spreadsheets.
 *  It may only happen with spreadsheets over a certain size,
 *  or spreadsheets with VB code.
 *
 * @version $Id: POIReadWriteTest.java,v 1.4 2004/03/10 12:01:45 kim Exp $
 */
public class POIReadWriteTest extends TestCase {

    public POIReadWriteTest(String name) {
        super(name);
    }

    public void testReadWrite() throws Exception {
        // Read the workbook, closing the input stream once it is loaded.
        FileInputStream fileIn =
            new FileInputStream("src/test/etc/imp/com/inceptor/imp/lawuers.xls");
        HSSFWorkbook wb;
        try {
            wb = new HSSFWorkbook(new POIFSFileSystem(fileIn));
        } finally {
            fileIn.close();
        }

        // Write it back unchanged; Excel reports this output file as damaged.
        FileOutputStream fileOut =
            new FileOutputStream("src/test/etc/poi/POIReadWriteTest.xls");
        try {
            wb.write(fileOut);
        } finally {
            fileOut.close();
        }
    }

    public static void main(String[] args) {
        junit.textui.TestRunner.run(suite());
    }

    public static Test suite() {
        return new TestSuite(POIReadWriteTest.class);
    }
}
Comment 1 Kim Madsen 2004-03-10 15:40:25 UTC
Created attachment 10745
Excel spreadsheet before loading into POI
Comment 2 Kim Madsen 2004-03-10 15:41:35 UTC
Created attachment 10746
Excel spreadsheet after being written back by POI
Comment 3 Andy Oliver 2004-03-10 20:03:44 UTC
Can you try with 2.0? These are probably two completely different problems. We
anticipated some read-write problems with 2.5, which is why we released it so
soon after 2.0.
Comment 4 Kim Madsen 2004-03-11 09:49:48 UTC
I just tried it using poi-bin-2.0-RC2-20040102 and it worked fine. 
Thanks for the tip.
Comment 5 Danny Mui 2004-03-19 17:43:02 UTC
Upgrading corrected the issue.
Comment 6 mariya.klimenko 2004-09-21 13:52:57 UTC
The file is corrupted after being read by POI. It is still readable, but it looks
like two panes are overlaying each other, and some cell borders are missing.
Comment 7 Jason Height 2004-09-27 02:59:53 UTC
Using the latest CVS code (2.5.1), I visually confirmed that the workbook from POI
is identical to the original Excel one.
Comment 8 juergen 2005-11-24 11:03:53 UTC
I'm experiencing the same problem with POI 2.5.1-final.

How can I investigate this?
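
One possible low-level starting point is to compare the file before and after the round trip byte by byte and note where they first diverge. The sketch below uses only the Java standard library; the class name `XlsDiff` and the method `firstDifference` are made up for illustration and are not part of POI. A difference is not proof of corruption, since POI rewrites the file rather than copying it, so the offset is only a hint about where to look in a hex or record dump.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper (not part of POI): locates the first byte at which two files differ. */
final class XlsDiff {

    /**
     * Returns the offset of the first differing byte, or -1 if the arrays are identical.
     * If one array is a strict prefix of the other, returns the length of the shorter one.
     */
    public static long firstDifference(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        return a.length == b.length ? -1 : n;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java XlsDiff <before.xls> <after.xls>");
            return;
        }
        byte[] before = Files.readAllBytes(Paths.get(args[0]));
        byte[] after = Files.readAllBytes(Paths.get(args[1]));

        System.out.println("before: " + before.length + " bytes, after: " + after.length + " bytes");
        long offset = firstDifference(before, after);
        if (offset < 0) {
            System.out.println("files are byte-for-byte identical");
        } else {
            System.out.println("first difference at offset 0x" + Long.toHexString(offset));
        }
    }
}
```

Run it with the original workbook and the POI-written workbook as the two arguments.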
Comment 9 juergen 2005-11-24 11:07:01 UTC
Additional info:
I don't only read and write the workbook; I also clone and remove worksheets,
rename them, and write the contents of cells.

I'll try to create the Excel files and attach them.
Comment 10 juergen 2005-11-24 11:14:28 UTC
Created attachment 17028
the file before writing with poi
Comment 11 juergen 2005-11-24 11:14:58 UTC
Created attachment 17029
and after writing with poi
Comment 12 juergen 2005-11-25 10:45:43 UTC
This is the "log" Excel writes when I open the file.

Microsoft Office Excel File Repair Log

Errors were detected in file 'ManualTest.xls'
The following is a list of repairs:

Removed one or more invalid formulas.
Comment 13 Jason Height 2005-12-15 23:23:32 UTC
It would be easier if you provided a Java class that opened the file, performed
the operations you are having trouble with, and then wrote it out.

Jason
Comment 14 oyvind.harboe 2006-08-03 13:37:59 UTC
I see this problem with POI 2.5.1 final 20040804.jar
Comment 15 oyvind.harboe 2006-08-03 13:38:31 UTC
Created attachment 18674
Java code to reproduce problem
Comment 16 oyvind.harboe 2006-08-03 13:39:35 UTC
Created attachment 18675
Input to TestPOI.java testcase

When I try to open the resulting test.xls generated by TestPOI.java, I get an
error message in Excel 2003:



Microsoft Office Excel File Repair Log

Errors were detected in file 'C:\workspace\qpbtapestry\test.xls'
The following is a list of repairs:

Removed one or more invalid formulas.
Comment 17 Jason Height 2006-08-28 04:00:18 UTC
Øyvind, your test case works fine with the latest SVN code.

juergen, it appears that the removal and cloning of sheets is the cause of the
issue. A bug like 29619 is better suited. (Plus, you never gave us the Java code
that did the cloning and removal that corrupted your example.)

The original issue was a pure read-and-write issue which, as indicated, was corrected.

Marking this bug fixed. The other comments on the bug are covered by existing bugs.

I think that sheet corruption with removal and cloning is the next thing to look at.

Jason
Comment 18 jbazzo 2007-03-23 09:51:38 UTC
Hi, I am still facing this bug. I tried with version 2.5 and with
3.0beta (20061212), and the problem remained, i.e., the generated file is corrupted.

Oddly, when I removed the first worksheet (tab), the generated file was NOT
corrupted.
I thought the problem was with the macros, but it seems not.

See the details:
1) Java code:
...
POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream("C:\\table.xls"));
HSSFWorkbook wb = new HSSFWorkbook(fs);
FileOutputStream fileOut = new FileOutputStream("C:\\table_output.xls");
wb.write(fileOut);
fileOut.close();
...

2) The input file attached table.xls

3) The output file attached table_output.xls

4) The modified file with the first worksheet removed
table_removed_first_worksheet.xls.

Thanks,
Juliano
Comment 19 jbazzo 2007-03-23 09:53:26 UTC
Created attachment 19789
Input Excel Sheet
Comment 20 jbazzo 2007-03-23 09:56:07 UTC
Created attachment 19790
Output Excel Sheet
Comment 21 jbazzo 2007-03-23 09:57:40 UTC
Created attachment 19791
Modified file with worksheet removed
Comment 22 jbazzo 2007-03-23 10:22:38 UTC
I have just tested with 3.0RC1-20070311 and the problem no longer occurs.
Changing the state to RESOLVED. Thanks. Juliano