Created attachment 37670 [details] Memory Usage Dump SXSSF uses XSSF to handle its pictures, so picture data is kept in memory and is not flushed along with the rows. If a lot of pictures have to be written to an .xlsx file, this causes a memory leak, which is what I am struggling with. I can't find a way to flush picture data before workbook.write(). The test case: 300 KB pictures, 10 pictures written per row, JVM started with -Xmx128m. An OutOfMemoryError occurs when writing 50 rows. At that point, the picture data list holds 60 MB in memory.
Created attachment 37671 [details] Memory I count in program
Can you provide a small piece of code which reproduces the problem for you? Ideally as a self-contained unit-test so we can reproduce the problem and take a closer look?
Sorry for the late response. I have created a repository that reproduces the problem. Run the unit test with a max heap size of 128m. Repository URL: https://github.com/foresx/poi-memory-leak-demo
Thanks for the detailed reproducing code, I took a look at your sample project now. Pictures in .xlsx files are not stored per "row" or "sheet", but rather globally in a separate structure alongside the other parts. The current SXSSFWorkbook only flushes and removes rows, based on the "rowAccessWindowSize". So flushing picture data is currently not supported for SXSSFWorkbook. We can consider adding it as an enhancement; naturally it will happen sooner if you can propose an implementation that offers this as an additional option for SXSSFWorkbook in some way. However, it will require some coding, as you likely need to flush out pictures in a similar way as the rows and then, at write time, combine the information into the final document as well.
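To make the "flush early, combine at write time" idea concrete, here is a minimal sketch using only the JDK. It is not POI code: the class name PictureSpool, its methods, and the "xl/media/imageN.bin" entry names are all hypothetical and only illustrate the approach of spooling each picture to a temp file as it is added and streaming the files into the zip when the document is written.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration only; this is not how POI is implemented.
final class PictureSpool implements AutoCloseable {
    private final List<Path> spooled = new ArrayList<>();

    /** Writes the picture bytes to a temp file right away and returns the picture index. */
    int add(byte[] pictureData) throws IOException {
        Path file = Files.createTempFile("picture-spool-", ".bin");
        Files.write(file, pictureData);
        spooled.add(file);
        return spooled.size() - 1;
    }

    /**
     * At write time, streams every spooled picture into a zip written to 'out'.
     * Closing the ZipOutputStream also closes 'out'.
     */
    void writeTo(OutputStream out) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (int i = 0; i < spooled.size(); i++) {
                // Assumed entry naming, chosen only for this example.
                zip.putNextEntry(new ZipEntry("xl/media/image" + (i + 1) + ".bin"));
                Files.copy(spooled.get(i), zip); // copied through a small buffer
                zip.closeEntry();
            }
        }
    }

    /** Deletes the temp files once the document has been written. */
    @Override
    public void close() throws IOException {
        for (Path file : spooled) {
            Files.deleteIfExists(file);
        }
        spooled.clear();
    }
}
```

With this shape, the heap only ever holds the picture currently being added plus a copy buffer, at the cost of extra disk I/O. A real implementation would also have to keep relationships and content types in sync, which this sketch leaves out.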
Thanks a lot. When I have some free time, I will give it a try.
I think the issue is that we don't have a TempFilePackagePart that can optionally be used instead of MemoryPackagePart. https://poi.apache.org/apidocs/dev/org/apache/poi/openxml4j/opc/ZipPackage.html#createPartImpl-org.apache.poi.openxml4j.opc.PackagePartName-java.lang.String-boolean- This would allow us to avoid using memory (while slowing things down by using temp files). The other issue is how to configure the code so that it can choose whether to use MemoryPackagePart or TempFilePackagePart.
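As a rough illustration of the two questions raised here (a temp-file-backed alternative to in-memory storage, and how to choose between them), the following JDK-only sketch may help. All names in it (PartStorage, MemoryStorage, TempFileStorage, PartStorageFactory, setUseTempFiles) are hypothetical and are not POI classes or methods; a global static switch is just one assumed way to configure the choice.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical names throughout; POI's real part classes have a richer API.
interface PartStorage {
    OutputStream openForWrite() throws IOException;
    InputStream openForRead() throws IOException;
}

/** Keeps the part content on the heap: fast, but memory grows with the content size. */
final class MemoryStorage implements PartStorage {
    private ByteArrayOutputStream data = new ByteArrayOutputStream();

    public OutputStream openForWrite() {
        data = new ByteArrayOutputStream(); // a new write replaces the old content
        return data;
    }

    public InputStream openForRead() {
        return new ByteArrayInputStream(data.toByteArray());
    }
}

/** Keeps the part content in a temp file: slower, but uses almost no heap. */
final class TempFileStorage implements PartStorage {
    private final Path file;

    TempFileStorage() throws IOException {
        file = Files.createTempFile("part-", ".tmp");
        file.toFile().deleteOnExit();
    }

    public OutputStream openForWrite() throws IOException {
        return Files.newOutputStream(file); // default options truncate existing content
    }

    public InputStream openForRead() throws IOException {
        return Files.newInputStream(file);
    }
}

/** One assumed way to configure the choice: a process-wide flag read at creation time. */
final class PartStorageFactory {
    private static volatile boolean useTempFiles = false;

    static void setUseTempFiles(boolean flag) {
        useTempFiles = flag;
    }

    static PartStorage create() throws IOException {
        return useTempFiles ? new TempFileStorage() : new MemoryStorage();
    }
}
```

A static flag keeps existing call sites unchanged, but it affects every package created afterwards; a per-package or per-workbook option would be more precise and more invasive to add.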
I added r1894203 - still experimental/beta - needs testing still - may be removed or modified
The feature that lets ZipPackage use temp files to save memory will be a beta feature in POI 5.1.0.