Bug 44542

Summary: Cannot open a excel file in Excel 2003/2007 generated by poi
Product: POI Reporter: ivan <rusoloco>
Component: HSSFAssignee: POI Developers List <dev>
Status: RESOLVED WONTFIX    
Severity: normal    
Priority: P2    
Version: 3.0-FINAL   
Target Milestone: ---   
Hardware: All   
OS: All   
Attachments: this file open ok in excel and openoffice
this file can´t open correctly in excel it was many rows in sheet 205
this file in sheet 201 in cell B2 have a lot of words than excel can´t open
this is a extract of the code that build the excel file, error are omitted
simple test to write random data to excel file
a working test to create multiple sheets

Description ivan 2008-03-05 10:58:45 UTC
Created attachment 21629 [details]
this file open ok in excel and openoffice

Hi:

I code to generate excel files  using PO API , I've got the file OK and the most rare thing is that file is be able to open with openoffice  but with  Excel microsoft version 2000, 2003 and 2007, it can't.

When i tried to open the file excel said that the file is so damaged that it has to be repair, when you see the file repaired the only thing that is not as original file is the second row, that have the titles of the next rows.

This problem only happen when the sheet have more > 100 rows  or when in one cell, have a lot of characters 


I attached three examples files.

Is there something wrong in files that produce this error ?

Thanks
Comment 1 ivan 2008-03-05 11:00:21 UTC
Created attachment 21630 [details]
this file can´t open correctly in excel it was many rows in sheet 205
Comment 2 ivan 2008-03-05 11:02:19 UTC
Created attachment 21631 [details]
this file in sheet 201 in cell B2 have a lot of words than excel can´t open
Comment 3 Josh Micich 2008-03-05 11:23:46 UTC
I tried to open in Excel 2007, and both files give this message box:
"Excel found unreadable content in 'MuchtextInOneCell.xls'. Do you want to recover the contents of this workbook? If you trust the source of this workbook, click Yes."

I guess it's the same as the behaviour from 2003 (so I changed this bug's summary).


It would help greatly in solving this bug if you could send a small example of code that reproduces files with this characteristic.  If you are not sure how to do this, start with the code you have, and keep cutting out bits and pieces while making sure that the bugs still occur.  Hopefully you can get it down below 100 lines of code or so.
Comment 4 ivan 2008-03-05 12:51:26 UTC
Created attachment 21632 [details]
this is a extract of the code that build the excel file, error are omitted

This is a part of the code, 
it takes data for maps and put this data in rows or colums depending in number of the dataDTO

I ommited the part in which depending of the type of object it set the type of cell, because is a large switch case 


hope this can help to figurate out the problem , and the funny thing that openoffice can open this file.

greetings
Comment 5 Josh Micich 2008-03-05 15:09:05 UTC
Sorry, the code you attached is not quite what I meant.  
It still has references to other things:

CampoDTO cannot be resolved to a type
DataDTO cannot be resolved to a type
FormatoDTO cannot be resolved to a type
MetadatosService cannot be resolved to a type
MetadatosUtil cannot be resolved
ResultadoDTO cannot be resolved to a type
TipoDatoConstants cannot be resolved

Better than sending all of that stuff, replace them with in the example with whatever simple alternative that makes the code still produce the bad XLS files.

Make a static void main(String[]) method, and make this class depend on nothing but POI.

Sorry about the extra trouble, but this is probably a faster path to identifying the bug (unless someone knows an easy way to diagnose exactly how the attached XML files break the BIFF spec).
Comment 6 ivan 2008-03-05 18:55:21 UTC
Created attachment 21634 [details]
simple test to write random data to excel file

this is a fast code from original code, trying to reproduce the corrupt file 
you have to put two parameters the number of sheets and number of rows, to reproduce the error you have to put a lot of rows
Comment 7 Josh Micich 2008-03-05 21:00:23 UTC
(In reply to comment #6)
> Created an attachment (id=21634) [details]
> simple test to write random data to excel file
> this is a fast code from original code, trying to reproduce the corrupt file 
> you have to put two parameters the number of sheets and number of rows, to
> reproduce the error you have to put a lot of rows 

One more thing - in addition to reproducing the error, the code has to also compile and run.

The code you sent is a long way from running.  There are mismatched brackets (which I can't fix easily because the indenting is all wrong).  Once I got it executing, I got a ClassCastException from this expression: (Map<String, Object>) formato.get("data");

There are all sorts of apparent errors where the code keeps updating a map with the same key and value (map only gets one entry) or the same map instance is added to a list many times.

After attempting to fix all of this I ran the code with (hojas=1000, renglones=256) and  the spreadsheet still opened in Excel OK.  Perhaps my fixes obliterated some important part of the code.


I'm sorry if this is tedious, but it really helps to have some example code that allows other people to easily reproduce your error.  Can you please re-attach the example class after making sure that it actually compiles, runs, and reproduces the error.
Comment 8 Yegor Kozlov 2008-03-06 07:15:39 UTC
1. I can't reproduce it either. I created a test based on the attached code and it produces correct XLS files.

What is strange to me is that the generated XLS is just data. There are no rich text, formulas, graphics, any of advanced XLS/POI features, etc. 
I can't see why POI would produce wrong output. 

2. If I try to read any of the problem files in POI I get the same exception:

Exception in thread "main" org.apache.poi.hssf.record.RecordFormatException: Unable to construct record instance
	at org.apache.poi.hssf.record.RecordFactory.createRecord(RecordFactory.java:204)
	at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:122)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:204)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:256)
	at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:237)
	at org.apache.poi.hssf.scratchpad.Test44542.main(Test44542.java:47)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:585)
	at com.intellij.rt.execution.application.AppMain.main(Unknown Source)
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
	at org.apache.poi.hssf.record.RecordFactory.createRecord(RecordFactory.java:192)
	... 10 more
Caused by: java.lang.NegativeArraySizeException
	at java.lang.AbstractStringBuilder.<init>(AbstractStringBuilder.java:44)
	at java.lang.StringBuffer.<init>(StringBuffer.java:92)
	at org.apache.poi.hssf.record.UnicodeString.fillFields(UnicodeString.java:225)
	at org.apache.poi.hssf.record.UnicodeString.<init>(UnicodeString.java:115)
	at org.apache.poi.hssf.record.SSTDeserializer.manufactureStrings(SSTDeserializer.java:49)
	at org.apache.poi.hssf.record.SSTRecord.fillFields(SSTRecord.java:364)
	at org.apache.poi.hssf.record.Record.<init>(Record.java:55)
	at org.apache.poi.hssf.record.SSTRecord.<init>(SSTRecord.java:101)
	... 15 more


So, if it is a POI bug it might be related to SSTRecord. 

Yegor
Comment 9 ivan 2008-03-06 17:33:21 UTC
Created attachment 21645 [details]
a working test to create multiple sheets 

Sorry  about last code, this code is fully operational, with this test the memory is exausted, but i can't reproduce the wrong files i make atest wtih 250 rows and 100 sheets but these are only strings cells 

maybe the problem is when i'm setting multiple cells formats in each sheet 


greetings