Bug 58915

Summary: memory eaters
Product: POI
Reporter: givemejob
Component: SXSSF
Assignee: POI Developers List <dev>
Severity: trivial    
Priority: P2    
Version: 3.13-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: memory eaters
10min monitoring
1hour monitoring

Description givemejob 2016-01-23 12:31:55 UTC
Created attachment 33480 [details]
memory eaters

We ran a stress test:
 - Tomcat 7 with -Xmx100m
 - JVM 7
 - on each request we start 10 threads, and each thread writes a large xlsx file to disk using SXSSF with a sliding window of 10 rows

Libraries we use:
 - jxls 2.8
 - jxls-poi 1.0.7
 - Apache POI 3.13 (poi, poi-ooxml, poi-ooxml-schemas)
 - xmlbeans 2.6.0

We assumed that the window would limit memory consumption to roughly the amount needed for the rows inside the window.

For large xlsx files we can see that Xobj$ElementXobj, Xobj$AttrXobj, STRefImpl and CTMergeCellImpl objects eat the memory.
Comment 1 givemejob 2016-01-23 12:33:38 UTC
Created attachment 33481 [details]
10min monitoring
Comment 2 givemejob 2016-01-23 12:34:15 UTC
Created attachment 33482 [details]
1hour monitoring
Comment 3 givemejob 2016-01-25 10:54:33 UTC
Looks like a bug in jxls 2.2.8.
The jxls version in the first comment was wrong (2.8 -> 2.2.8).
Comment 4 givemejob 2016-01-27 12:29:19 UTC
POI SXSSF uses SXSSFSheet to represent an Excel sheet.

SXSSFSheet is a wrapper around XSSFSheet with an overridden createRow method (to support flushing rows).

The other methods of SXSSFSheet delegate to XSSFSheet.

Therefore all information about the workbook is kept in memory, except for the rows (cell values etc.).
For example, SXSSFSheet.addMergedRegion adds a CellRangeAddress object in memory. If the number of merged regions is not constant but grows with the data, memory use grows with it. On big data this produces a chart like "1hour monitoring".
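
The reasoning above can be illustrated with a small toy model. This is only a sketch and does not use POI: the class WindowedSheetModel and all of its methods are hypothetical names invented for the illustration. It assumes that rows beyond the window are dropped from memory, while every merged region stays in a list until the end.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model of a windowed sheet. Hypothetical class, not part of POI. */
public class WindowedSheetModel {
    private final int windowSize;
    // Rows currently held in memory (the sliding window).
    private final Deque<Integer> window = new ArrayDeque<>();
    // Merged regions are never flushed in this model, so the list only grows.
    private final List<int[]> mergedRegions = new ArrayList<>();
    private int flushedRows = 0;

    public WindowedSheetModel(int windowSize) {
        this.windowSize = windowSize;
    }

    /** Adds a row; once the window is full, the oldest row is "flushed". */
    public void createRow(int rowIndex) {
        if (window.size() >= windowSize) {
            window.removeFirst();
            flushedRows++;
        }
        window.addLast(rowIndex);
    }

    /** Records a merged region; it stays in memory for the sheet's lifetime. */
    public void addMergedRegion(int firstRow, int lastRow, int firstCol, int lastCol) {
        mergedRegions.add(new int[] {firstRow, lastRow, firstCol, lastCol});
    }

    public int rowsInMemory() {
        return window.size();
    }

    public int flushedRows() {
        return flushedRows;
    }

    public int mergedRegionCount() {
        return mergedRegions.size();
    }

    public static void main(String[] args) {
        WindowedSheetModel sheet = new WindowedSheetModel(10);
        for (int r = 0; r < 100_000; r++) {
            sheet.createRow(r);
            sheet.addMergedRegion(r, r, 0, 1); // one merged region per row
        }
        System.out.println("rows in memory:  " + sheet.rowsInMemory());
        System.out.println("rows flushed:    " + sheet.flushedRows());
        System.out.println("merged regions:  " + sheet.mergedRegionCount());
    }
}
```

In this model the number of rows in memory stays bounded by the window size, while the number of merged regions grows linearly with the number of rows written. That matches the shape of the "1hour monitoring" chart.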

A call to SXSSFWorkbook.write marshals the Java objects (except rows) to byte arrays (see MemoryPackagePartOutputStream), which produces a jump in memory consumption at the end of the SXSSFWorkbook.write call.

If this reasoning is correct then, in my opinion, it should be pointed out in the "Quick Guide".
I did not find this information anywhere until I investigated it myself.
Comment 5 Dominik Stadler 2016-03-29 17:31:33 UTC
Adjusted javadoc in r1737024 and the page at https://poi.apache.org/spreadsheet/how-to.html#sxssf via r1737025.
Comment 6 Dominik Stadler 2016-03-29 18:36:56 UTC
Correct revision for the webpage is r1737028.