Bug 60140

Summary: OOM caused by Memory Leak in FileBackedDataSource
Product: POI    Reporter: Luis Filipe Nassif <lfcnassif>
Component: POIFS    Assignee: POI Developers List <dev>
Status: RESOLVED FIXED
Severity: major    CC: jmarkus+apache01
Priority: P2    
Version: 3.15-dev   
Target Milestone: ---   
Hardware: PC   
OS: All   

Description Luis Filipe Nassif 2016-09-15 00:13:46 UTC
While investigating TIKA-2058, we discovered that HeapByteBuffers are being cached unnecessarily in buffersToClean when the data source is not writable. Heap buffers need no special unmapping, so there is no reason to keep references to them.

A single instance of FileBackedDataSource consumed 5.7GB of heap, triggering OOM.
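
The idea behind the fix can be illustrated with a minimal sketch. It is not the attached patch and not POI's actual code: the class BufferTrackingSketch, its methods, and the temp-file helper are hypothetical names used only for illustration. It assumes that a writable source maps each block (and must remember the mapping for later cleanup), while a read-only source copies each block into a heap buffer that the garbage collector can reclaim as soon as the caller drops it.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a file-backed data source; not POI's class.
class BufferTrackingSketch implements AutoCloseable {
    private final RandomAccessFile file;
    private final FileChannel channel;
    private final boolean writable;
    // Only buffers that need explicit cleanup (mapped ones) belong here.
    private final List<ByteBuffer> buffersToClean = new ArrayList<>();

    BufferTrackingSketch(Path path, boolean writable) throws IOException {
        this.file = new RandomAccessFile(path.toFile(), writable ? "rw" : "r");
        this.channel = file.getChannel();
        this.writable = writable;
    }

    ByteBuffer read(int length, long position) throws IOException {
        if (writable) {
            // Mapped buffer: track it so it can be released on close.
            ByteBuffer mapped = channel.map(FileChannel.MapMode.READ_WRITE, position, length);
            buffersToClean.add(mapped);
            return mapped;
        }
        // Heap buffer: deliberately NOT tracked, so it stays collectable.
        ByteBuffer heap = ByteBuffer.allocate(length);
        while (heap.hasRemaining()) {
            int n = channel.read(heap, position + heap.position());
            if (n < 0) {
                throw new IOException("Unexpected end of file");
            }
        }
        heap.flip();
        return heap;
    }

    int trackedBufferCount() {
        return buffersToClean.size();
    }

    @Override
    public void close() throws IOException {
        buffersToClean.clear();
        channel.close();
        file.close();
    }

    // Demo helper: performs `reads` block reads on a 4 KiB temp file
    // and reports how many buffers ended up being tracked.
    static int trackedAfterReads(boolean writable, int reads) {
        try {
            Path tmp = Files.createTempFile("sketch", ".bin");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, new byte[4096]);
            try (BufferTrackingSketch ds = new BufferTrackingSketch(tmp, writable)) {
                for (int i = 0; i < reads; i++) {
                    ds.read(512, (long) i * 512);
                }
                return ds.trackedBufferCount();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch a read-only source tracks nothing regardless of how many blocks are read, whereas tracking every heap buffer would pin one buffer per block for the lifetime of the source, which is the growth pattern described in this report.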

More details on https://issues.apache.org/jira/browse/TIKA-2058

Patch will be attached.
Comment 1 Tim Allison 2016-09-15 00:20:26 UTC
r1760816

Thank you!
Comment 2 Luis Filipe Nassif 2016-09-15 00:35:06 UTC
Is POI supposed to support writing to files larger than 2GB? If not, I can propose a new patch to reduce the number of memory mappings when the file is writable.
Comment 3 Dominik Stadler 2016-09-15 07:13:45 UTC
Yes, it should be able to, although we have at least one bug entry stating that some versions of zip implementations cause issues when opening the zipped-XML-based file formats.

Please create a separate Bug and attach the patch there so we can discuss it post-3.15 release.
Comment 4 Marcus Lundblad 2017-01-26 15:12:05 UTC
Luis Filipe Nassif:

Hi, have you made any progress on the patch to reduce the number of mmappings?
We are getting an OOM exception in FileBackedDataSource.

We create an NPOIFSFileSystem like this:

result = new NPOIFSFileSystem(file, false);

We then read entries from the file to compute a hash over all content and, at the end, append an additional DocumentEntry.

But we get an OutOfMemoryError when reading the data (the data is read piece by piece into a 1024-byte buffer and is not kept around):

Caused by: java.io.IOException: Map failed
	at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:940) [rt.jar:1.8.0_111]
	at org.apache.poi.poifs.nio.FileBackedDataSource.read(FileBackedDataSource.java:94) [poi-3.16-beta1.jar:3.16-beta1]
	at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.getBlockAt(NPOIFSFileSystem.java:484) [poi-3.16-beta1.jar:3.16-beta1]
	at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.next(NPOIFSStream.java:169) [poi-3.16-beta1.jar:3.16-beta1]
	... 85 more
Caused by: java.lang.OutOfMemoryError: Map failed
	at sun.nio.ch.FileChannelImpl.map0(Native Method) [rt.jar:1.8.0_111]
	at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:937) [rt.jar:1.8.0_111]
	... 88 more
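
For reference, the read pattern described in this comment can be sketched as below. This is an assumption-laden illustration, not the reporter's code: the class ChunkedHashSketch is hypothetical, SHA-256 is chosen arbitrarily because the comment does not name the hash algorithm, and a plain InputStream stands in for whatever stream the entries are read from.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; the hash algorithm is an assumption.
class ChunkedHashSketch {
    static byte[] hash(InputStream in) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] buffer = new byte[1024]; // reused for every chunk
            int n;
            while ((n = in.read(buffer)) != -1) {
                // Feed only the bytes actually read; nothing is retained.
                digest.update(buffer, 0, n);
            }
            return digest.digest();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A loop like this holds only a single 1024-byte buffer on the caller's side, which is consistent with the trace above: the failure comes from FileChannelImpl.map, called from FileBackedDataSource.read, and not from the caller's own allocations.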
Comment 5 Luis Filipe Nassif 2017-01-30 03:09:40 UTC
Hi Marcus,

No, I have not tried to write the patch, because of the need to handle files larger than 2GB.

Are you using an x64 JVM? Have you tried increasing the ulimit system setting?
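
A quick way to answer the JVM question is to print what the running JVM reports about itself. The sketch below is only a diagnostic aid under stated assumptions: the class name JvmAddressSpaceSketch is hypothetical, and the sun.arch.data.model property is specific to HotSpot-based JVMs, so it may be absent elsewhere (the code falls back to "unknown").

```java
// Hypothetical diagnostic helper; not part of POI.
class JvmAddressSpaceSketch {
    static String describe() {
        // Standard property, e.g. "amd64" or "x86" depending on the JVM.
        String arch = System.getProperty("os.arch");
        // HotSpot-specific property ("32" or "64"); may be missing on other JVMs.
        String dataModel = System.getProperty("sun.arch.data.model", "unknown");
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        return "os.arch=" + arch
                + ", data model=" + dataModel
                + ", max heap=" + maxHeapMb + " MB";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The architecture matters because memory mappings consume virtual address space rather than Java heap, so a 32-bit process, or a process with a restrictive virtual memory limit, can hit "Map failed" even when plenty of heap is free.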