Summary: | Cloning autofilters leads to HUGE files | ||
---|---|---|---|
Product: | POI | Reporter: | Antti Koskimäki <antti.koskimaki> |
Component: | HSSF | Assignee: | POI Developers List <dev> |
Status: | RESOLVED FIXED | ||
Severity: | normal | ||
Priority: | P2 | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Hardware: | PC | ||
OS: | Windows XP | ||
Attachments: | excel and sourcecode to reproduce situation |
Description
Antti Koskimäki
2008-09-11 05:14:47 UTC
fixed in svn r707450 Junit added. ObjRecord was not reading sub-record ftLbsData properly. ftLbsData does not encode its length in the second ushort field. This field often has a value like 0x1FFEE. Due to another bug in RecordInputStream that allows buffer read-overruns, the ftLbsData was being interpreted as being nearly 8KB instead of 20 bytes. I re-ran your test (cloning 50 sheets from a 22KB input workbook). Here are the resulting file sizes: before fix: 13,101KB after fix: 588KB Your test spreadsheet had 30 (autofilter) combo-boxes, which get cloned 50 times. Assuming 8KB extra per combo-box, this would account for roughly 12MB extra allocation. I didn't attempt the same test (manually) with Excel, but the remaining difference from the 580KB that you got could be explained by POI's expansion of MULRK records. Preserving MULRK, MULBLANK and SHARREDFMLA records is on the to-do list but currently not a high priority. > I re-ran your test (cloning 50 sheets from a 22KB input workbook).
> Here are the resulting file sizes:
> before fix: 13,101KB
> after fix: 588KB
Thanks, verified that this is OK.
But, when I tried to stress-test the fix, I noticed that by adding more cloning-iterations the workbook got corrupted at some point (limit=83, to be exact :=). Workbook without auto-filters seems not to corrupt, not even with 1000 clones.
When I tried to open the workbook I got several "File error, data may have been lost" pop-ups in row, one for each (!) cloned sheet. Besides fatal, also very annoying :=)
Not sure how closely relates to this bug or fix, seems to re-produce with 3.2-FINAL too. But you mentioning buffer-overruns and me using the same test-case to reproduce, plus auto-filters having something to do with it, I decided to re-open this bug instead of reporting new one.
(In reply to comment #2) BTW - the buffer overrun problem in RecordInputStream (that helped keep this bug silent) was fixed in svn r707778 . It would be better if we close off this bugzilla to help track things more clearly. I am pretty sure that the original bug is fixed. The new bug you describe has different characteristics and the fix will probably be unrelated to svn r707450 . Please open a new bugzilla for your second bug. I suspect that this new bug will be of a lower priority since most POI users won't clone anywhere near 83 sheets. However, if you need a fix urgently, you can speed things up by providing a patch or at least identifying the exact cause of the bug. |