Apache OpenOffice (AOO) Bugzilla – Issue 123429
Calc is unacceptably slow in opening the attached XLSX file (Cornell)
Last modified: 2014-04-18 07:34:55 UTC
The XLSX file available at http://www.birds.cornell.edu/clementschecklist/download/ takes extremely long to open with OpenOffice 4.0.1. Some users report that OpenOffice seems to freeze; Rob and I managed to open the file, but we had to wait about 15 minutes with the progress bar stuck at about 25% (and no particular CPU/RAM usage). I'll make a copy available for reference. 3.4.1 shows similar behavior. The issue is discussed here: http://comments.gmane.org/gmane.comp.apache.openoffice.user/2246
A copy of the document (just in case the original one is replaced) is available at http://people.apache.org/~pescetti/tmp/2013-10-i123429/
This "spreadsheet" has a first sheet with 107,817 format ranges, of which only 75 are unique. After removing the formatting overkill, the file loads at acceptable speed. Nothing to bother about.
ALG: Checked, indeed it loads after a loooong time. This would need to be measured to see if (and what) could be optimized and where the time is spent.
ALG: Looks as if the 69248 strings get read in the first part of the load; the long part after this is actually adding them to the sheet. There seems to be a lot of processing involved to get from 'RichStrings' to cell contents. I do not know much about it (yet)...
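For readers unfamiliar with the file format, here is a minimal Python sketch of the two phases described above: reading the shared string table once, then resolving each string cell by index. It only illustrates how an XLSX stores strings, not how the AOO importer is implemented. The function names are made up for this example, and phonetic runs (rPh) are not filtered out.

```python
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by xl/sharedStrings.xml and the sheet XML.
MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def load_shared_strings(shared_xml):
    """Phase 1: read every <si> item of sharedStrings.xml into a list."""
    root = ET.fromstring(shared_xml)
    strings = []
    for si in root.findall(f"{{{MAIN}}}si"):
        # An item is either plain (<t>) or rich text (<r><t>...</t></r> runs).
        # Joining all <t> descendants covers both cases in this sketch.
        parts = [t.text or "" for t in si.iter(f"{{{MAIN}}}t")]
        strings.append("".join(parts))
    return strings


def resolve_string_cells(sheet_xml, strings):
    """Phase 2: map each cell with t="s" to its text via the string index."""
    root = ET.fromstring(sheet_xml)
    cells = {}
    for c in root.iter(f"{{{MAIN}}}c"):
        if c.get("t") != "s":
            continue  # not a shared-string cell
        v = c.find(f"{{{MAIN}}}v")
        if v is not None and v.text is not None:
            cells[c.get("r")] = strings[int(v.text)]
    return cells
```

Phase 2 is a plain list lookup per cell here, so any slowness in the real importer would have to come from what happens when the resolved text is put into the sheet.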
Created attachment 81800: another testcase. 1 sheet, 9 rows x 11 cols: opening with Excel 2010 takes 0.3 sec, while OO takes ~10 min.
The document does not only contain the rows with real content, but thousands of rows of the kind <row r="15" spans="1:11" x14ac:dyDescent="0.25" ><c r="A15" s="4" /></row>
Following Microsoft's tip in http://msdn.microsoft.com/en-us/library/ff726673%28v=office.14%29.aspx#xlMinUsedRange I can see that the last used cell is K65536 in the attached document.
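To check this kind of finding without opening the file in a spreadsheet application, here is a rough Python sketch that scans one sheet's XML. It counts rows that carry a value against rows that contain only styled, empty cells, and reports the bottom-right corner of the cell extent. The function names are hypothetical, and treating "max row and max column of all <c> elements" as the used range is an assumption of this sketch; Excel's own definition may differ in detail.

```python
import re
import xml.etree.ElementTree as ET

MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
CELL_REF = re.compile(r"([A-Z]+)(\d+)")


def col_to_index(col):
    """Convert column letters to a 1-based index (A -> 1, K -> 11)."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def index_to_col(n):
    """Convert a 1-based column index back to letters (11 -> K)."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def summarize_rows(sheet_xml):
    """Count content rows vs. style-only rows and find the cell extent."""
    root = ET.fromstring(sheet_xml)
    content_rows = style_only_rows = 0
    max_row = max_col = 0
    for row in root.iter(f"{{{MAIN}}}row"):
        cells = row.findall(f"{{{MAIN}}}c")
        if not cells:
            continue
        # A cell has content if it holds a value (<v>) or inline string (<is>).
        has_value = any(
            c.find(f"{{{MAIN}}}v") is not None
            or c.find(f"{{{MAIN}}}is") is not None
            for c in cells
        )
        if has_value:
            content_rows += 1
        else:
            style_only_rows += 1
        for c in cells:
            m = CELL_REF.fullmatch(c.get("r") or "")
            if m:  # the r attribute is optional, so skip cells without it
                max_col = max(max_col, col_to_index(m.group(1)))
                max_row = max(max_row, int(m.group(2)))
    extent = f"{index_to_col(max_col)}{max_row}" if max_row else None
    return {"content_rows": content_rows,
            "style_only_rows": style_only_rows,
            "extent": extent}
```

For an .xlsx on disk, the sheet XML can be read with the standard zipfile module, for example from xl/worksheets/sheet1.xml (the usual name for the first sheet, though the exact path is file-dependent).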
Comment on attachment 81800 (another testcase): I moved the "BG8" file and related discussion (comments #5-#6-#7) to https://issues.apache.org/ooo/show_bug.cgi?id=123919 since it's unclear whether the root cause is the same. Let's keep this issue open for the "Cornell" file only.
Note that 3.3.0 opens the "Cornell" file (Clements Checklist) in <30 seconds, which is acceptable considering the size and very different from the 15+ minutes needed on 4.x. Regression.
Additional Info:
----------------
(a) Time for opening the document on Windows 7:
    (a1) Gnumeric 1.12.9: 5s
    (a2) LibO 4.2: 4s
    (a3) MS Excel Viewer: 4s
    (a4) Kingsoft: 4s
    (a5) Softmaker FreeOffice: 5s
(b) Already a problem with AOO 3.4.0; I killed the process after 4 minutes.