Bug 46448

Summary: Performance problem while reading an Excel 2007 document (compared to an Excel 2003 document)
Product: POI
Reporter: Matthew <matthew.knl>
Component: XSSF
Assignee: POI Developers List <dev>
Status: RESOLVED WONTFIX    
Severity: normal    
Priority: P2    
Version: 3.5-dev   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   

Description Matthew 2008-12-30 00:52:45 UTC
I am using POI 3.5 Beta:

- Opening an Excel 2003 document took about 700 milliseconds.
- Opening an Excel 2007 document took about 3000 milliseconds.
(That is roughly 4 times slower.)

Could you help resolve this issue? Please also let me know if you have any other advice that would help. Thanks a lot!
Comment 1 Matthew 2008-12-30 01:46:20 UTC
Opening an Excel 2003 document:

FileInputStream fis = new FileInputStream(path);
POIFSFileSystem fs = new POIFSFileSystem(fis);
Workbook workbook = new HSSFWorkbook(fs);

--
Opening an Excel 2007 document:

Workbook workbook = new XSSFWorkbook(path);
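
For anyone reproducing the numbers above, a minimal timing sketch follows. It uses only JDK classes; the OpenTimer class and timeMillis method are hypothetical names introduced here, and the sleeping task is a placeholder for either of the two open calls shown in this comment. A single cold run probably also includes class loading and JIT warm-up, so it may be worth timing a second run in the same JVM as well.

```java
import java.util.concurrent.Callable;

// Hypothetical helper (not part of POI): times a single call in wall-clock milliseconds.
class OpenTimer {

    /** Runs the task once and returns the elapsed time in milliseconds. */
    static long timeMillis(Callable<?> task) throws Exception {
        long start = System.nanoTime();
        task.call();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder task. With POI on the classpath, substitute one of the
        // open calls from this comment, e.g. () -> new XSSFWorkbook(path).
        long ms = timeMillis(() -> {
            Thread.sleep(50);
            return null;
        });
        System.out.println("elapsed ms: " + ms);
    }
}
```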
Comment 2 Yegor Kozlov 2008-12-30 02:04:38 UTC
Opening *.xlsx files will always be slower than opening *.xls files, simply because parsing XML is slower than reading binary data.

We use XMLBeans (http://xmlbeans.apache.org/) to map XML to Java objects, and it takes quite some time to process OOXML documents.

A possible workaround is to use the XSSF Event API: http://poi.apache.org/spreadsheet/how-to.html#xssf_sax_api. It requires a basic understanding of the file structure, but it allows processing *.xlsx files with a low memory footprint and much faster than the XSSF usermodel API.
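
To illustrate the idea behind the event approach: an *.xlsx file is a ZIP package of XML parts, so a worksheet can be streamed through a SAX handler instead of being loaded into an object model. The sketch below uses only JDK classes and is not the POI Event API itself; the SheetCellCounter class is a hypothetical name, and the part name xl/worksheets/sheet1.xml is an assumption that holds for typical files but is not guaranteed.

```java
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration (JDK only, not the POI API): stream a worksheet part with SAX.
class SheetCellCounter {

    /** Streams worksheet XML and counts <c> (cell) elements without building a tree. */
    static int countCells(InputStream sheetXml) throws Exception {
        final int[] count = {0};
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // so localName is populated
        factory.newSAXParser().parse(sheetXml, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("c".equals(localName)) {
                    count[0]++;
                }
            }
        });
        return count[0];
    }

    /** Opens the .xlsx as a ZIP archive and streams a single worksheet part. */
    static int countCellsInXlsx(String path, String partName) throws Exception {
        try (ZipFile zip = new ZipFile(path)) {
            ZipEntry entry = zip.getEntry(partName);
            if (entry == null) {
                throw new IllegalArgumentException("No such part: " + partName);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return countCells(in);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: SheetCellCounter <file.xlsx>");
            return;
        }
        // Assumed part name; real files may number or name their sheets differently.
        System.out.println(countCellsInXlsx(args[0], "xl/worksheets/sheet1.xml"));
    }
}
```

Only one element is examined at a time here, which is where the low memory footprint comes from. Real code would still need to resolve shared strings and cell styles, which the how-to linked above covers.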


Yegor