Bug 27286 - OutOfMemoryError when parsing broken file
Status: RESOLVED INVALID
Alias: None
Product: POI
Classification: Unclassified
Component: HSSF
Version: 2.0-FINAL
Hardware: All All
Importance: P1 critical
Target Milestone: ---
Assignee: POI Developers List
URL: http://www.pondokwan-klub.si/Aktualno...
Keywords:
Depends on:
Blocks:
 
Reported: 2004-02-27 07:57 UTC by Samo Login
Modified: 2005-07-28 20:57 UTC



Description Samo Login 2004-02-27 07:57:18 UTC
When trying to read the content of the URL below, an OutOfMemoryError is thrown.

http://www.pondokwan-klub.si/Aktualno\pdk-zagorje%202003-rezultati.xls

This is my code that uses POI:

HSSFWorkbook wb = new HSSFWorkbook(in);

where in is an InputStream reading the content of the above URL.
Comment 1 Bonson 2005-07-13 10:02:27 UTC
This is happening because the Excel file being referenced is too large.
To do anything with an Excel file, POI has to read the entire file into memory
first, and that is where the problem lies. If the Excel file is very large,
i.e. the workbook contains many sheets with many rows and columns, the ENTIRE
data set is read into memory before any modification can be made through the
Workbook object. The entire workbook then has to be written back from memory to
the Excel file in one shot. You cannot write a file in parts.

Unfortunately, this is the way POI works. :o(

I guess no one envisioned POI being used to such an extent that Excel files
would become too large to fit in memory.

One short-term solution for this problem is to increase your app server memory.
But this only helps for a while, because as soon as the Excel file becomes
larger, the OutOfMemoryError will be thrown again. :o(
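
To illustrate the short-term fix: the JVM heap ceiling is raised with the
standard -Xmx option (for example, java -Xmx512m ...), and an application can
check its current ceiling before attempting a load. The sketch below uses only
java.lang.Runtime. The class HeapBudget, the method fitsInHeap and the expansion
factor of 10 are hypothetical and not part of POI; how much larger a workbook is
in memory than on disk is an assumption here, not a measured value.

```java
// Hypothetical helper, not part of POI: a rough pre-flight check before loading a workbook.
class HeapBudget {
    /**
     * Returns true if fileBytes * expansionFactor fits within maxHeapBytes.
     * expansionFactor is an assumed guess at how much larger the in-memory
     * object model is than the file on disk.
     */
    static boolean fitsInHeap(long fileBytes, long maxHeapBytes, int expansionFactor) {
        if (fileBytes < 0 || maxHeapBytes < 0 || expansionFactor <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and factor positive");
        }
        // Divide instead of multiplying so that large sizes cannot overflow.
        return fileBytes <= maxHeapBytes / expansionFactor;
    }

    public static void main(String[] args) {
        // Upper bound the JVM will try to use for the heap; raised with -Xmx.
        long maxHeap = Runtime.getRuntime().maxMemory();
        long fileBytes = 20L * 1024 * 1024; // example value: a 20 MB file
        System.out.println("max heap bytes: " + maxHeap);
        System.out.println("likely to fit:  " + fitsInHeap(fileBytes, maxHeap, 10));
    }
}
```

A check like this only turns an OutOfMemoryError into an early, explicit
refusal. It does not reduce the memory actually needed to load the file.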

The POI developers need to develop a way to read only a certain amount of data
into memory, work with it, write it back, and then take the next chunk of data.

Maybe the data being processed could be split by sheet within a workbook. This
way the data read into memory would be limited to one sheet.
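
The general idea of working on one bounded chunk at a time can be sketched with
plain java.io, as below. This is not POI API and it does not parse Excel files;
the names ChunkedReader, ChunkHandler and process are hypothetical. It only
shows how a single reusable buffer keeps memory use fixed no matter how long the
stream is. An .xls file cannot simply be cut into byte chunks like this, so a
real solution would need to split on record or sheet boundaries.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Generic java.io sketch, not POI API: process a stream in fixed-size chunks.
class ChunkedReader {
    /** Callback invoked once per chunk; only the first 'length' bytes of 'buffer' are valid. */
    interface ChunkHandler {
        void handle(byte[] buffer, int length);
    }

    /**
     * Reads 'in' to the end using one reusable buffer of chunkSize bytes,
     * so memory use stays bounded regardless of the stream length.
     * Returns the total number of bytes read.
     */
    static long process(InputStream in, int chunkSize, ChunkHandler handler) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] buffer = new byte[chunkSize];
        long total = 0;
        int n;
        // read() returns -1 at end of stream; otherwise the number of bytes placed in buffer.
        while ((n = in.read(buffer)) != -1) {
            handler.handle(buffer, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data; in practice this would be a file or URL stream.
        InputStream in = new ByteArrayInputStream(new byte[10000]);
        final int[] chunks = {0};
        long total = process(in, 4096, (buf, len) -> chunks[0]++);
        System.out.println(total + " bytes in " + chunks[0] + " chunks");
    }
}
```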

Best Regards,
Bonson
Comment 2 Avik Sengupta 2005-07-13 13:19:03 UTC
Samo, 

Is the file linked in your description still available?  
Comment 3 Jason Height 2005-07-29 04:57:06 UTC
The sample file is long gone, so we cannot assess any performance problems.

I am going to close this bug as invalid, but Bonson's comments are right. This
is how POI works at the moment.

Jason