Bug 47847

Summary: [PATCH] Read XLS via HSSFWorkbook and write back, file unreadable by MS Excel 2003
Product: POI
Reporter: henry <CHuang3>
Component: HSSF
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: major    
Priority: P2    
Version: 3.7-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   
Attachments: The test file
patch to fix bug 47847

Description henry 2009-09-16 00:03:39 UTC
Reading the XLS file via HSSFWorkbook and writing the workbook back without any changes produces a file that MS Excel 2003 cannot read.

The following is the code:

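import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

public class RoundTrip {
	public static void main(String[] args) throws IOException {
		// Read the workbook; let any exception propagate instead of swallowing it.
		HSSFWorkbook wb;
		FileInputStream fis = new FileInputStream(new File("testUnreadable.xls"));
		try {
			wb = new HSSFWorkbook(fis);
		} finally {
			fis.close();
		}

		// Write the unmodified workbook straight back out.
		FileOutputStream stream = new FileOutputStream("New.xls");
		try {
			wb.write(stream);
		} finally {
			stream.close();
		}
	}
}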

When opening the generated "New.xls" file with MS Excel 2003, the following error message pops up:

"Excel found unreadable content in "testUnreadablenew.xls". Do you want to recover the contents of this workbook?..."

Is there any workaround? Thanks a lot.
Comment 1 henry 2009-09-16 00:06:27 UTC
Created attachment 24273 [details]
The test file

This is the file that fails the read-and-write round trip.
Comment 2 pierre tholence 2010-01-13 04:58:10 UTC
Created attachment 24836 [details]
patch to fix bug 47847

This patch fixes the problem for the test file.
The problem was caused by a wrong assumption about the ExtRst field.
Comment 3 henry 2010-01-14 18:23:01 UTC
I verified the provided patch; the problem has been fixed :-)

Thanks a lot.
Comment 4 Nick Burch 2010-01-19 04:05:39 UTC
I've added support to svn for the ExtRst part of unicode strings. This should hopefully fix the problem, because we now parse the data and so can split at more helpful positions.
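
As a rough illustration of why parsing helps: assuming the ExtRst block starts with a 2-byte reserved value (expected to be 1) followed by a 2-byte little-endian count of the bytes that follow, which is how I understand the published XLS binary format description, the total length of the block can be worked out from its first four bytes, so a writer knows where the block ends. The sketch below is not POI's code; the class name ExtRstHeaderSketch and the method totalSize are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only; this is not part of POI.
final class ExtRstHeaderSketch {
    // Assumed header layout: 2-byte reserved field + 2-byte payload byte count.
    private static final int HEADER_BYTES = 4;

    /**
     * Returns the total length in bytes of the ExtRst block starting at
     * offset (header plus payload), or -1 if the reserved field is not 1.
     */
    static int totalSize(byte[] data, int offset) {
        ByteBuffer buf = ByteBuffer.wrap(data, offset, HEADER_BYTES)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        int reserved = buf.getShort() & 0xFFFF; // read as unsigned 16-bit
        int payload = buf.getShort() & 0xFFFF;  // bytes following the header
        if (reserved != 1) {
            return -1; // not the layout this sketch assumes
        }
        return HEADER_BYTES + payload;
    }
}
```

A caller would pass the raw string record bytes and the offset at which the ExtRst data begins; a result of -1 means the data does not match the assumed layout and should not be trusted.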
Comment 5 henry 2011-03-22 21:33:57 UTC
I tested with POI version 3.7 and the issue still exists.

Will this fix be included in an official POI release?
Comment 6 David Fisher 2011-03-22 22:50:32 UTC
Nick reported the issue as resolved AFTER 3.7 was released.

Does POI 3.8 beta 1 work? Does trunk work?
Comment 7 David Fisher 2011-03-22 22:54:47 UTC
Sorry, I saw 2011 and not 2010 in Nick's comments.

It sure has been awhile. What happens if you do a "save as" in Excel with that file?

Also, what produced your original file?

Regards,
Dave
Comment 8 henry 2011-03-22 23:18:05 UTC
(In reply to comment #7)
> Sorry, I saw 2011 and not 2010 in Nick's comments.
> 
> It sure has been awhile. What happens if you do a "save as" in Excel with that
> file?
> 
> Also, what produced your original file?
> 
> Regards,
> Dave

The file could be saved successfully. However, running the above code on the newly saved file still generates a corrupted file.

The file was created manually in MS Excel 2003.

The patch provided by Nick works fine for POI-3.5. However, in POI-3.7 the implementation of the problem class "UnicodeString" seems to have changed completely, so I am not sure whether those changes reintroduced the issue.

Thanks.
Comment 9 Yegor Kozlov 2011-03-23 08:42:26 UTC
I've just tried your code with POI trunk and it works correctly. The output is readable by Excel 2003 and Excel 2010. 

Yegor