|Summary:||IllegalArgumentException when initializing NPOIFSFileSystem object|
|Component:||POIFS||Assignee:||POI Developers List <dev>|
Description mskan 2014-04-23 07:53:25 UTC
I am using POIFS to extract data from OLE2 files generated by an X-ray scanner. Opening small files works fine with both POIFSFileSystem and NPOIFSFileSystem. However, when I try to open large files (say, a few GB or larger), I get an OutOfMemory exception with POIFSFileSystem, and with NPOIFSFileSystem, I get the following exception:
java.lang.IllegalArgumentException
    at java.nio.ByteBuffer.allocate(ByteBuffer.java:330)
    at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:275)
Could this be a bug in NPOIFSFileSystem?
Comment 1 Nick Burch 2014-04-23 08:38:10 UTC
Any chance you could attach a debugger, and check the values in NPOIFS of maxSize (line 274) and _header.getBATCount()?
Comment 2 mskan 2014-04-23 10:00:55 UTC
I have now created a minimal example that throws the following exception:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
    at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
    at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:273)
    at TestApp.main(TestApp.java:16)
The values in NPOIFS are:
_header.getBATCount() = 493
maxSize = 2067795968
The size of the file that I am opening is 3,365,793,792 bytes.
(In reply to Nick Burch from comment #1)
> Any chance you could attach a debugger, and check the values in NPOIFS of
> maxSize (line 274) and _header.getBATCount()?
Comment 3 Nick Burch 2014-04-23 11:19:06 UTC
The NPOIFSFileSystem constructor that takes an InputStream buffers the whole file into memory, so your heap needs to be at least the size of the file, plus a bit extra. There is nothing we can do to help there; you just have to increase your heap. Alternatively, NPOIFSFileSystem has a constructor that takes a File, which has a much lower memory footprint, because a File allows random access and an InputStream does not. If you increase your heap to something like 15% bigger than the file, does it work with an InputStream? If you switch to a File, does that fix it without a bigger heap?
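To illustrate why random access keeps the memory footprint low, here is a minimal sketch using only the JDK. It is not POI's implementation; the class name `BlockReadSketch`, the method `readBlock`, and the 4096-byte block size are hypothetical and chosen for illustration. It reads a single block at a given offset, so heap usage depends on the block size rather than on the file size.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustration only: hypothetical names, not POI's code.
final class BlockReadSketch {

    // Reads up to blockSize bytes starting at the given byte offset.
    // The offset is a long, so positions beyond 2 GB are addressable.
    static byte[] readBlock(Path file, long offset, int blockSize) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(blockSize);
            long position = offset;
            while (buffer.hasRemaining()) {
                int n = channel.read(buffer, position);
                if (n < 0) {
                    break; // reached end of file before the block was full
                }
                position += n;
            }
            byte[] out = new byte[buffer.position()];
            buffer.flip();
            buffer.get(out);
            return out;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("block-sketch", ".bin");
        try {
            byte[] data = new byte[4096 * 3];
            data[4096 * 2] = 42; // marker at the start of the third block
            Files.write(tmp, data);
            byte[] block = readBlock(tmp, 4096L * 2, 4096);
            System.out.println(block.length + " bytes read, first byte = " + block[0]);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

An InputStream offers no equivalent of the positional read above, which is why a stream-based reader has to hold everything it may need to revisit.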
Comment 4 mskan 2014-04-23 12:30:05 UTC
I realize that the numbers that I gave you before were for a file of size 2,066,718,720 bytes. Increasing the heap size (using -Xmx3G) helps when opening the file as an InputStream, and it also works with the standard heap size when I use File instead of InputStream. However, when I try opening the file that is 3,365,793,792 bytes, I get this exception when passing a File object to NPOIFSFileSystem:
Exception in thread "main" java.lang.IllegalArgumentException
    at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:275)
    at org.apache.poi.poifs.nio.FileBackedDataSource.read(FileBackedDataSource.java:57)
    at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.getBlockAt(NPOIFSFileSystem.java:426)
    at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.readBAT(NPOIFSFileSystem.java:402)
    at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.readCoreContents(NPOIFSFileSystem.java:377)
    at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:201)
    at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:162)
    at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:143)
Could this be because of integer overflow? If I use the InputStream and increase the heap size to 4GB, then maxSize is equal to -926937088 (and _header.getBATCount() = 803), and I get the following exception:
Exception in thread "main" java.lang.IllegalArgumentException
    at java.nio.ByteBuffer.allocate(ByteBuffer.java:330)
    at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:273)
    at ReadTXRM.main(ReadTXRM.java:19)
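The reported values are consistent with a 32-bit overflow. The sketch below is worked arithmetic, not POI's source: it assumes 4096-byte blocks, 4-byte BAT entries, and one extra header block, a formula chosen because it reproduces both reported maxSize values exactly. The class and method names are hypothetical.

```java
// Worked arithmetic only. The block size and the formula are assumptions
// inferred from the values reported in this thread, not taken from POI.
final class MaxSizeOverflow {
    static final int BLOCK_SIZE = 4096;                // assumed big block size
    static final int ENTRIES_PER_BAT = BLOCK_SIZE / 4; // 4 bytes per BAT entry

    // Same formula evaluated entirely in int arithmetic: wraps past 2^31 - 1.
    static int maxSizeAsInt(int batCount) {
        return (batCount * ENTRIES_PER_BAT + 1) * BLOCK_SIZE;
    }

    // Widening to long before multiplying keeps the true value.
    static long maxSizeAsLong(int batCount) {
        return ((long) batCount * ENTRIES_PER_BAT + 1) * BLOCK_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(maxSizeAsInt(493));  // 2067795968 (still fits in an int)
        System.out.println(maxSizeAsInt(803));  // -926937088 (wrapped)
        System.out.println(maxSizeAsLong(803)); // 3368030208
    }
}
```

With 493 BAT blocks the result still fits in an int, which matches the 2,066,718,720-byte file opening once the heap was large enough. With 803 BAT blocks the true value of 3,368,030,208 exceeds 2,147,483,647 and wraps to the negative number that `ByteBuffer.allocate` rejects.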
Comment 5 Nick Burch 2014-04-23 12:37:16 UTC
Looks like we might have one or more ints that need to be a long. Are you able to generate a file that's >2gb, but data mostly 0s so it can be easily compressed down to something small? That can then be used in unit tests
Comment 6 mskan 2014-04-23 12:55:57 UTC
Unfortunately I don't think that I can create such a file, but I can give you the file that causes the problem (3.37 GB uncompressed and 2.16 GB compressed with bzip2).
Comment 7 Nick Burch 2014-04-24 16:15:59 UTC
Can you try now? For an InputStream, you should now get a helpful IllegalArgumentException on a >2 GB file, because ByteBuffer has a 2 GB limit. For a File, we ought to be able to go bigger. I've switched a couple of ints to longs; can you see if that helps?
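A guard of the kind described above might look like the following sketch. It is not the actual POI change; the class name `BufferGuardSketch`, the method `allocateChecked`, and the message wording are hypothetical. The idea is to keep the size as a long and reject it with an explanatory message before it is narrowed to the int that `ByteBuffer.allocate` requires.

```java
import java.nio.ByteBuffer;

// Illustration only: hypothetical names, not the actual POI fix.
final class BufferGuardSketch {

    // Validates a long size before narrowing it for ByteBuffer.allocate(int).
    static ByteBuffer allocateChecked(long size) {
        if (size < 0 || size > Integer.MAX_VALUE) {
            throw new IllegalArgumentException(
                "Cannot buffer " + size + " bytes in a single ByteBuffer (limit is "
                + Integer.MAX_VALUE + " bytes); open the data from a File instead");
        }
        return ByteBuffer.allocate((int) size);
    }

    public static void main(String[] args) {
        System.out.println(allocateChecked(1024).capacity());
        try {
            allocateChecked(3_368_030_208L); // larger than any single ByteBuffer
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A size just under the limit can still fail with an OutOfMemoryError if the heap is too small, so the guard only replaces the bare IllegalArgumentException with a readable one.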
Comment 8 mskan 2014-04-25 05:41:59 UTC
Sure, I can try it. Where do I find the new version? Can you send me a jar file?
Comment 9 mskan 2014-04-25 06:03:12 UTC
I found the nightly build (poi-3.11-beta1-20140424). It works! Now I can read large data files using File. Thanks for all your help, I really appreciate it!