Bug 46774 - Extreme memory usage in XSSF workbook
Status: RESOLVED WONTFIX
Alias: None
Product: POI
Classification: Unclassified
Component: XSSF
Version: 3.5-dev
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
 
Reported: 2009-02-26 14:48 UTC by Rob W
Modified: 2009-02-27 06:29 UTC



Attachments
Sample code that illustrates the bug (5.98 KB, text/plain)
2009-02-26 15:08 UTC, Rob W
Description Rob W 2009-02-26 14:48:15 UTC
Using HSSF, a simple test of 10000 rows by 255 columns runs fine with "-Xms128m -Xmx394m" and is fast.

Using XSSF, the exact same test runs out of memory before even reaching 1000 rows, and is extremely slow. After bumping the memory to "-Xms512m -Xmx2048m", it still fails somewhere between 3000 and 4000 rows.

Please see the attached sample code, which is based on your Timesheet Demo.

Thank you,
Rob
Comment 1 Rob W 2009-02-26 14:50:35 UTC
This may not be a Macintosh-only problem.  I am hoping to test on another platform shortly.
Comment 2 Rob W 2009-02-26 15:04:13 UTC
Just tested with the same result under Windows Vista, so this does not appear to be a platform-specific issue.
Comment 3 Rob W 2009-02-26 15:08:56 UTC
Created attachment 23317
Sample code that illustrates the bug

I am resubmitting my original attachment (unchanged), as it isn't appearing in the bug.  If there is a delay and I've submitted it twice, I apologize.
Comment 4 Nick Burch 2009-02-27 04:45:54 UTC
XSSF is XML-based, so processing the files will always take more memory than using HSSF. Also, in the interests of developer time, we use XMLBeans, which allows faster development at the expense of higher memory use.

If this is proving to be a problem for you, please do some profiling to identify the areas of heavy memory use, and contribute back patches to reduce it!
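
Before reaching for a full profiler, a rough first step toward the profiling suggested above is to compare used heap before and after building the grid. The following is a minimal sketch using only the JDK, not POI code: the HeapProbe class and its buildGrid method are hypothetical stand-ins, and buildGrid would be replaced with the HSSF or XSSF code from the attachment. The numbers are approximate, since Runtime.gc() is only a hint to the JVM.

```java
import java.util.function.Supplier;

class HeapProbe {
    /** Returns the approximate number of heap bytes currently in use. */
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        rt.gc(); // only a hint; the JVM is free to ignore it
        return rt.totalMemory() - rt.freeMemory();
    }

    /**
     * Runs the builder and returns the approximate change in used heap.
     * The result is stored in holder[0] so it stays reachable while
     * the second reading is taken.
     */
    static <T> long measure(Supplier<T> builder, Object[] holder) {
        long before = usedHeap();
        holder[0] = builder.get();
        long after = usedHeap();
        return after - before;
    }

    /** Hypothetical stand-in for the workbook-building code under test. */
    static String[][] buildGrid(int rows, int cols) {
        String[][] grid = new String[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                grid[r][c] = "r" + r + "c" + c;
            }
        }
        return grid;
    }

    public static void main(String[] args) {
        Object[] holder = new Object[1];
        long delta = measure(() -> buildGrid(1000, 255), holder);
        System.out.println("Approximate heap growth: " + (delta / 1024) + " KB");
    }
}
```

Running the same measurement at increasing row counts shows whether memory grows linearly with the number of cells; a heap dump or profiler is still needed to see which classes account for it.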
Comment 5 Yegor Kozlov 2009-02-27 06:29:32 UTC
There was a discussion about it some time ago. See http://markmail.org/thread/vqut6wy3ashguz6x

A possible workaround is to stream your data directly in XML.  See an example demonstrating my idea: 
http://svn.apache.org/repos/asf/poi/trunk/src/examples/src/org/apache/poi/xssf/usermodel/examples/BigGridDemo.java

Yegor
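
The streaming workaround can be illustrated with a minimal sketch that writes the worksheet XML row by row to a Writer, so that no row is kept in memory once it has been written. This is not the code from BigGridDemo: the SheetXmlStreamer class and its method names are hypothetical, only numeric cells are written, and the output is just the worksheet part. It would still have to be placed into an .xlsx package (for example by replacing the sheet entry of a template workbook, which is the approach the linked demo appears to take).

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

class SheetXmlStreamer {
    /** Converts a 0-based column index to a column name: 0 -> A, 26 -> AA. */
    static String columnName(int col) {
        StringBuilder sb = new StringBuilder();
        for (int n = col + 1; n > 0; n = (n - 1) / 26) {
            sb.insert(0, (char) ('A' + (n - 1) % 26));
        }
        return sb.toString();
    }

    /** Streams a rows-by-cols grid of numeric cells as worksheet XML. */
    static void writeSheet(Writer out, int rows, int cols) throws IOException {
        out.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        out.write("<worksheet xmlns=\"http://schemas.openxmlformats.org/"
                + "spreadsheetml/2006/main\"><sheetData>\n");
        for (int r = 0; r < rows; r++) {
            int rowNum = r + 1; // row numbers in the XML are 1-based
            out.write("<row r=\"" + rowNum + "\">");
            for (int c = 0; c < cols; c++) {
                // Sample value only; real code would write the actual data.
                out.write("<c r=\"" + columnName(c) + rowNum + "\"><v>"
                        + (r * cols + c) + "</v></c>");
            }
            out.write("</row>\n");
        }
        out.write("</sheetData></worksheet>");
    }

    /** Convenience wrapper for small grids; large ones should use a file Writer. */
    static String render(int rows, int cols) {
        StringWriter sw = new StringWriter();
        try {
            writeSheet(sw, rows, cols);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sw.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(2, 3));
    }
}
```

For the 10000 by 255 case from the description, writeSheet would be given a buffered file Writer rather than a StringWriter, so memory use stays roughly constant regardless of the row count.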