Bug 52203 - java.lang.OutOfMemoryError: Java heap space when reading .xlsx file with 1million rows by using usermodel api
Summary: java.lang.OutOfMemoryError: Java heap space when reading .xlsx file with 1mil...
Status: RESOLVED INVALID
Alias: None
Product: POI
Classification: Unclassified
Component: XSSF (show other bugs)
Version: 3.7-FINAL
Hardware: PC Windows XP
: P2 normal (vote)
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords: APIBug
Depends on:
Blocks:
 
Reported: 2011-11-17 09:27 UTC by ronwhite
Modified: 2011-11-17 11:34 UTC (History)
0 users



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description ronwhite 2011-11-17 09:27:28 UTC
I try to read .xlsx file of 1 million rows by using usermodel api, the file size is 11.2 MB, the jvm throw Exception :
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOf(Arrays.java:2786)
	at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
	at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource$FakeZipEntry.<init>(ZipInputStreamZipEntrySource.java:115)
	at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.<init>(ZipInputStreamZipEntrySource.java:55)
	at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:82)
	at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:220)
	at SXSSFWorkbookTest.test2(SXSSFWorkbookTest.java:51)
	at SXSSFWorkbookTest.main(SXSSFWorkbookTest.java:37)

To resolve the issue, i use the eventmodel api, that is ok!.
But i want to know why the usermodel api take so much memory when parseing huge rows file???
Comment 1 ronwhite 2011-11-17 09:34:43 UTC
(In reply to comment #0)
> I try to read .xlsx file of 1 million rows by using usermodel api, the file
> size is 11.2 MB, the jvm throw Exception :
code:
                InputStream inp = new FileInputStream("100w.xlsx");
		Workbook wb = WorkbookFactory.create(inp);
exception:
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>     at java.util.Arrays.copyOf(Arrays.java:2786)
>     at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
>     at
> org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource$FakeZipEntry.<init>(ZipInputStreamZipEntrySource.java:115)
>     at
> org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.<init>(ZipInputStreamZipEntrySource.java:55)
>     at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:82)
>     at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:220)
>     at SXSSFWorkbookTest.test2(SXSSFWorkbookTest.java:51)
>     at SXSSFWorkbookTest.main(SXSSFWorkbookTest.java:37)
> To resolve the issue, i use the eventmodel api, that is ok!.
> But i want to know why the usermodel api take so much memory when parseing huge
> rows file???
Comment 2 Nick Burch 2011-11-17 11:34:05 UTC
As discussed many times on the list, the usermodel loads everything into memory, so you need lots of memory available to hold everything. The event model just does one little bit at a time, so is much lower memory footprint (but you can't do random access)