Bug 46448 - Performance problem while reading an Excel 2007 document (compared to an Excel 2003 document)
Alias: None
Product: POI
Classification: Unclassified
Component: XSSF
Version: 3.5-dev
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2008-12-30 00:52 UTC by Matthew
Modified: 2008-12-30 02:04 UTC
Description Matthew 2008-12-30 00:52:45 UTC
I am using POI 3.5 Beta:

- Opening an Excel 2003 document took about 700 milliseconds.
- Opening an Excel 2007 document took 3000 milliseconds.
(That is roughly 4 times slower.)

Could you help resolve this issue? Please also let me know if you have any other advice that would help. Thanks a lot!
Comment 1 Matthew 2008-12-30 01:46:20 UTC
Opening an Excel 2003 document:

FileInputStream fis = new FileInputStream(path);
POIFSFileSystem fs = new POIFSFileSystem(fis);
Workbook workbook = new HSSFWorkbook(fs);

Opening an Excel 2007 document:

Workbook workbook = new XSSFWorkbook(path);
Comment 2 Yegor Kozlov 2008-12-30 02:04:38 UTC
Opening *.xlsx files will always be slower than opening *.xls files, because parsing XML is slower than reading binary data.

We use XMLBeans (http://xmlbeans.apache.org/) to map XML to Java objects, and it takes quite some time to process OOXML documents.

A possible workaround is to use the XSSF Event API: http://poi.apache.org/spreadsheet/how-to.html#xssf_sax_api. It requires a basic understanding of the file structure, but it allows processing *.xlsx files with a low memory footprint and much faster than the XSSF usermodel API.
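
To illustrate the idea behind event-based reading, here is a minimal sketch that uses only the JDK, not POI's own Event API. It assumes the usual OOXML layout, in which an *.xlsx file is a ZIP archive and a worksheet is stored as an XML part such as xl/worksheets/sheet1.xml. The class name SheetStreamSketch, the method countRowsAndCells, and the choice to merely count rows and cells are hypothetical and exist only for this example. The how-to page linked above describes the classes POI actually provides.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; this is not part of POI.
class SheetStreamSketch {

    /** Counts <row> and <c> (cell) elements as the parser streams past them. */
    static final class CountingHandler extends DefaultHandler {
        int rows;
        int cells;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            // With a namespace-aware parser, localName carries the element
            // name without any prefix.
            if ("row".equals(localName)) {
                rows++;
            } else if ("c".equals(localName)) {
                cells++;
            }
        }
    }

    /**
     * Streams one worksheet part out of the ZIP container and returns
     * {rowCount, cellCount}. No object model of the workbook is built,
     * so memory use stays small regardless of sheet size.
     */
    static int[] countRowsAndCells(InputStream xlsx, String sheetPart)
            throws IOException, SAXException, ParserConfigurationException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        try (ZipInputStream zip = new ZipInputStream(xlsx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals(sheetPart)) {
                    CountingHandler handler = new CountingHandler();
                    // ZipInputStream reports end-of-stream at the end of the
                    // current entry, so the parser sees just this one part.
                    factory.newSAXParser().parse(new InputSource(zip), handler);
                    return new int[] {handler.rows, handler.cells};
                }
            }
        }
        throw new IOException("Part not found in archive: " + sheetPart);
    }
}
```

A caller would pass something like new FileInputStream(path) together with the part name "xl/worksheets/sheet1.xml". Real sheets also need the shared strings part to resolve text cell values, which this sketch ignores.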