Summary: | Regression: Cannot remove block | ||
---|---|---|---|
Product: | POI | Reporter: | Trejkaz (pen name) <trejkaz> |
Component: | POIFS | Assignee: | POI Developers List <dev> |
Status: | RESOLVED FIXED | ||
Severity: | major | CC: | sackley |
Priority: | P2 | ||
Version: | 3.0-FINAL | ||
Target Milestone: | --- | ||
Hardware: | PC | ||
OS: | Windows Vista | ||
Attachments: |
Test files (zipped)
Proposed fix Proposed fix v2 |
Description
Trejkaz (pen name)
2008-04-28 16:02:33 UTC
Created attachment 21876 [details]
Test files (zipped)
Attaching test files in a zip (was too big to upload raw.)
Looks similar to bug 28239. Is this a POIFS block size problem? I saw that other bug, but I thought it might be a different issue as that bug has been around since 2004, yet this one only started in version 3.0.2. Did something about the way blocks were loaded change in 3.0.2? I've had a bit of a look myself by diffing the poifs package between 3.0.1 and 3.0.2, and the only change I can see which might cause the issue is: public RawDataBlock(final InputStream stream, int blockSize) + // IOUtils.readFully will always read the + // requested number of bytes, unless it hits + // an EOF + _eof = true; This is in the code which logs a warning instead of throwing an error, if the final block isn't a multiple of 512... I do recall that being fixed in 3.0.2 but it's possible the way it was fixed causes an error in some files. Created attachment 21882 [details]
Proposed fix
It is as I thought, removing that one line makes all our files parse correctly.
...I put it on 28239 too, purely by accident. I don't know if it fixes that bug yet as I haven't tested it.
Before and after my fix, TestRawDataBlock.testShortConstructor is failing. It seems to be expecting an IOException, which has been removed since 3.0.2. Seems someone "fixed" testEmptyConstructor by removing the fail() call but not testShortConstructor. But perhaps catching IOException in the unit test is also a bad idea? Created attachment 21883 [details]
Proposed fix v2
Here's one which fixes the failures in TestRawDataBlock and TestRawDataBlockList.
I don't like the idea of just hiding the EOF Maybe the right solution is that if we have a short last block, we detect this, then pad it out to the required 512 byte multiple? It isn't really "hiding" the EOF, it's simply moving it to where the file is actually EOF. Previously it would be: * 512 byte block * ... * 512 byte block * short block (claims to be EOF) Whatever reads it from the outside said "oh, EOF... so discard the entire block." Now it's: * 512 byte block * ... * 512 byte block * short block * EOF I'm not really sure what's wrong with representing it how it really is. But if you want to pad the file, go ahead... not padding it doesn't seem to have any drawback either since if it's a short block then the document entry for it won't result in reading past the end of the data which is actually there. Fixed in SVN trunk. We now have a difference between "stream EoF reached" and "I have data", so things work correctly now *** Bug 28239 has been marked as a duplicate of this bug. *** (In reply to comment #9) > Fixed in SVN trunk. svn r653945 added a unit test and improved the error message. This change made it into 3.1-beta2. svn r658285 completed the fix for this bug and was available in 3.1-final |