Bug 57842

Summary: Using POI 3.9 API memory consumed reading an xlsx file is not released back to the operating system after completion
Product: POI
Reporter: pcllau
Component: XSSF
Assignee: POI Developers List <dev>
Status: RESOLVED MOVED    
Severity: normal    
Priority: P2    
Version: 3.9-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: org.eclipse.mat report

Description pcllau 2015-04-21 14:54:34 UTC
Created attachment 32673 [details]
org.eclipse.mat report

I am running a web application on Apache Tomcat 7.0.56 with JDK 1.7.0_45 and Spring 3.1.0.RC2, and I am experiencing what I think is a memory leak in the POI API version 3.9 or one of its dependencies.

When using the POI 3.9 API, the memory consumed while reading an xlsx file is not released back to the operating system after completion.

org.eclipse.mat reports the following information (See attachment for full details):

The thread org.apache.tomcat.util.threads.TaskThread @ 0x5eca3afb0 http-bio-8091-exec-10 keeps local variables with total size 68,205,584 (35.67%) bytes.
The memory is accumulated in one instance of "byte[]" loaded by "<system class loader>".
Comment 1 Nick Burch 2015-04-21 23:52:51 UTC
Please re-test with at least 3.12 beta 1, or ideally a build from svn / nightly build. There have been lots of bugs fixed (including memory-related ones) since 3.9 came out in 2012.
Comment 2 pcllau 2015-04-22 12:11:42 UTC
Failed test case information below:

Used following to read xlsx spreadsheet files:

  // Option 1: XSSFWorkbook from a File
  OPCPackage pkg = OPCPackage.open(new File("file.xlsx"));
  XSSFWorkbook wb = new XSSFWorkbook(pkg);
  ....
  pkg.close();

  // Option 2: XSSFWorkbook from an InputStream (needs more memory)
  OPCPackage pkg = OPCPackage.open(myInputStream);
  XSSFWorkbook wb = new XSSFWorkbook(pkg);
  ....
  pkg.close();

Versions POI tested on:

3.9
3.11
3.12-beta1

Webserver: Apache Tomcat 7.0.61
JDK: 1.7.0_79
Comment 3 pcllau 2015-04-22 13:36:53 UTC
To clarify, I have confirmed that I am experiencing the same memory leak issue with POI versions 3.9, 3.11, and 3.12-beta1. After calling new XSSFWorkbook(file), the consumed memory is not released back to the operating system.
Comment 4 pcllau 2015-04-22 15:28:50 UTC
My Apache Tomcat server's stderr log contains the message below, which may be closely linked to an existing XMLBeans JIRA issue: https://issues.apache.org/jira/browse/XMLBEANS-502

Apr 22, 2015 3:58:12 PM org.apache.catalina.loader.WebappClassLoader checkThreadLocalMapForLeaks
SEVERE: The web application [/ProVista] created a ThreadLocal with key of type [org.apache.xmlbeans.XmlBeans$1] (value [org.apache.xmlbeans.XmlBeans$1@29c9fb0e]) and a value of type [java.lang.ref.SoftReference] (value [java.lang.ref.SoftReference@56d53bd3]) but failed to remove it when the web application was stopped. Threads are going to be renewed over time to try and avoid a probable memory leak.
Comment 5 Nick Burch 2015-04-23 02:10:29 UTC
Try calling close() on the workbook, and also ensure that no references to the workbook or the sheets within it remain when you're done processing.
Comment 6 pcllau 2015-04-23 12:23:33 UTC
Test-case-1:

Configuration:
    Apache Tomcat 7.0.61
    JDK 1.7.0_79
    POI-3.12-beta1
Code: 
    // open a 6.2MB xlsx file containing 910,000 rows of data
    Workbook wb = new XSSFWorkbook(new File("file_name")); 
    wb.close(); 

Test-case-1 result: memory-use: 1.9GB (doesn't release after closing workbook). On subsequently re-opening the same workbook the memory use remains at 1.9GB, and tomcat stderr logs the following severe messages:
Apr 23, 2015 1:12:24 PM org.apache.catalina.loader.WebappClassLoader checkThreadLocalMapForLeaks
SEVERE: The web application [/ProVista] created a ThreadLocal with key of type [org.apache.xmlbeans.XmlBeans$1] (value [org.apache.xmlbeans.XmlBeans$1@3c59c852]) and a value of type [java.lang.ref.SoftReference] (value [java.lang.ref.SoftReference@2b19a30e]) but failed to remove it when the web application was stopped. Threads are going to be renewed over time to try and avoid a probable memory leak.


Test-case-2:

Configuration:
    Apache Tomcat 7.0.61
    JDK 1.7.0_79
    POI-3.11
Code: 
    // open a 6.2MB xlsx file containing 910,000 rows of data
    Workbook wb = new XSSFWorkbook(new File("file_name")); 
    wb.close(); 

Test-case-2 result: memory-use: 1.8GB (doesn't release after closing workbook). On subsequently re-opening the same workbook the memory use remains at 1.8GB
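
If the 1.8GB and 1.9GB figures above were read from the operating system's view of the Tomcat process (the report does not say), they may reflect heap that the JVM has reserved rather than heap still occupied by objects. A JVM commonly keeps heap it has already committed even after a garbage collection frees the objects in it. The following is a minimal sketch, using only the standard java.lang.management API, of how the two numbers could be compared. The class name HeapSnapshot and the 64 MB allocation standing in for a loaded workbook are assumptions for illustration, not part of the original test case.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapSnapshot {
    /**
     * Returns {used, committed} heap bytes taken from a single snapshot.
     * "used" is heap occupied by objects (live or not yet collected);
     * "committed" is heap the JVM has reserved from the operating system.
     */
    public static long[] snapshot() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return new long[] { heap.getUsed(), heap.getCommitted() };
    }

    private static void print(String label) {
        long[] s = snapshot();
        System.out.println(label + " used=" + s[0] + " committed=" + s[1]);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for loading a large workbook: about 64 MB.
        byte[][] blocks = new byte[64][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[1024 * 1024];
        }
        print("loaded:  ");

        blocks = null; // drop the only reference
        System.gc();   // a request to the JVM, not a guarantee
        print("released:");
    }
}
```

If "used" drops after the references are released while "committed" stays high, the objects were collected and the remaining footprint is reserved heap, which points away from a leak in the library.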
Comment 7 Javen O'Neal 2017-01-19 08:15:04 UTC
Allow the workbook and anything it references to be garbage collected
> wb = null
or just let wb fall out of scope.

Are you able to get a refcount on POI objects?
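
Java does not expose reference counts, but a weak reference can serve as a rough probe for whether an object is still strongly reachable. The sketch below assumes nothing about POI itself: ReachabilityProbe is a hypothetical helper, and the byte array stands in for a workbook. System.gc() is only a request, so the helper retries a few times before giving up.

```java
import java.lang.ref.WeakReference;

public class ReachabilityProbe {
    /**
     * Requests garbage collection a few times and reports whether the
     * referent of the given weak reference has been collected.
     */
    public static boolean collected(WeakReference<?> ref) {
        for (int i = 0; i < 10 && ref.get() != null; i++) {
            System.gc(); // a hint; the JVM may ignore it
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        Object wb = new byte[8 * 1024 * 1024]; // hypothetical stand-in for a workbook
        WeakReference<Object> probe = new WeakReference<>(wb);

        wb = null; // or let wb fall out of scope
        System.out.println("collected after wb = null: " + collected(probe));
    }
}
```

If the probe never clears for a real workbook, something still holds a strong reference to it, and a heap dump (as in the attached org.eclipse.mat report) can show the path from a GC root.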
Comment 8 Dominik Stadler 2019-03-10 17:42:45 UTC
We never got more detailed information, so it is hard to "fix" anything for this specific report. 

However, we should try to make it possible to clear thread locals, so that content accumulating in threads does not cause memory leaks in applications that run in a web container or use thread pools.

There is an XMLBeans issue about allowing the thread locals to be freed: https://issues.apache.org/jira/browse/XMLBEANS-502
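
To illustrate why thread locals matter in pooled threads, here is a minimal sketch using only the JDK. It does not use any XMLBeans API: CACHE is a hypothetical stand-in for a library-internal per-thread cache, and ThreadLocalPoolDemo is an illustrative name. A value set by one task stays attached to the pooled thread and is visible to the next task unless it is removed.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalPoolDemo {
    // Hypothetical stand-in for a library-internal per-thread cache.
    static final ThreadLocal<byte[]> CACHE = new ThreadLocal<>();

    /**
     * Runs two tasks on the same pooled thread and reports whether the
     * second task still sees the value stored by the first one.
     */
    public static boolean valueSurvives(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            pool.submit(() -> {
                try {
                    CACHE.set(new byte[1024 * 1024]);
                    // ... process the request ...
                } finally {
                    if (cleanUp) {
                        CACHE.remove(); // detach the value from the pooled thread
                    }
                }
            }).get();
            return pool.submit(() -> CACHE.get() != null).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("without remove(): " + valueSurvives(false)); // true
        System.out.println("with remove():    " + valueSurvives(true));  // false
    }
}
```

When the thread local lives inside a library, application code cannot call remove() on it directly, which is why a cleanup hook in the library is needed.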
Comment 9 Dominik Stadler 2019-03-10 17:46:22 UTC
The current remainder of this bug needs to be handled over at XMLBeans via https://issues.apache.org/jira/browse/XMLBEANS-502, so closing this one for now. 

We are running huge regression tests where millions of documents are processed in one application, so a simple memory leak like the one indicated here is unlikely to be present in recent versions of Apache POI. Please report a new bug if you have more detailed results from checking for memory leaks.