Summary: | Workbook not detected as encrypted if WRITEPROTECT record precedes FILEPASS record | ||
---|---|---|---|
Product: | POI | Reporter: | Trejkaz (pen name) <trejkaz> |
Component: | HSSF | Assignee: | POI Developers List <dev> |
Status: | RESOLVED FIXED | ||
Severity: | normal | Keywords: | PatchAvailable |
Priority: | P2 | ||
Version: | 3.7-FINAL | ||
Target Milestone: | --- | ||
Hardware: | PC | ||
OS: | All | ||
Attachments: |
Proposed patch
Sample file exhibiting the issue |
Any chance you could share one of the files with the records in the other order? In terms of where the records should live in the file, if you could look at the [MS-XLS] spec and see what it says that'd be great. Possibly much quicker is just to run the Microsofto Binary File Format validator against one of them, and see if that declares the files as valid or not? Looking on Page 162 of the June 8th 2011 Microsoft docs, it looks like WRITEPROTECT shouldn't come first BOF, FILEPASS and FILEINFO mustn't be encrypted, however WRITEPROTECT isn't in the list of "must not encrypt" so therefore should be encrypted. I believe that all the un-encrypted records should come first Is there anything in the docs that says that if a record is not documented as "MUST NOT be encrypted", then it MUST be encrypted? I am having trouble finding it. The way they have worded it seems pretty open to interpretation to me, and they haven't even used "SHOULD" in the area. I couldn't find anything specifically stating the order of the records either, except that in OpenOffice's documentation, they seem to lump WriteProtect, FilePass, WriteAccess and FileSharing into a "file protection block" at the front of the file, with no specific notes on which order they should occur in (although WriteProtect is documented first in the list, they don't specifically say whether the documented order matches the stored order for the file protection block. (The WriteProtect record itself appears to be completely empty - maybe someone assumed that since it didn't have any data in it, it didn't need to be encrypted as there was no point?) As for test data, I don't have any which I can share. I have been trying to create it, to no avail. I'd need to get my hands on the same version which generated it. The page before says that the whole stream must be encrypted, except for special records (which are documented in the list on p162) Created attachment 27531 [details]
Sample file exhibiting the issue
Good news - I have found some sample data. Our own functional tests failed when I put in the fix. I notice that the AppName is "Microsoft Excel" for this file as well. Given that it's created by a guy who used to work here and appears to be a file exclusively for testing (the email it was inside had a subject to that effect), I have a greater trust in this file than the other ones I had from a client.
Attaching the file.
By the way, what section is that text in, so that I can find it in the online version (which of course doesn't use page numbers)? Section 2.2.10 is what I have been reading: http://msdn.microsoft.com/en-us/library/dd905723(v=office.12).aspx It doesn't say it. And the section immediately before it, 2.2.9, doesn't have anything about encryption in it. I'm reading section 2.2.10. My copy states that the "Workbook Stream" must be encrypted, with a * note. The note a page later gives the list of the very few records that aren't to be encrypted |
Created attachment 27505 [details] Proposed patch I have about a dozen files here where WRITEPROTECT precedes FILEPASS. I can't seem to reproduce this myself even if I turn on write protect and encryption for the same workbook, but I am attaching a fix for the issue anyway. I'm also worried about the general nature of the FILEPASS record - what if it can occur literally _anywhere_? Is POI's approach of only looking a few records deep actually correct in general?