Bug 51832

Summary: Workbook not detected as encrypted if WRITEPROTECT record precedes FILEPASS record
Product: POI Reporter: Trejkaz (pen name) <trejkaz>
Component: HSSFAssignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: normal Keywords: PatchAvailable
Priority: P2    
Version: 3.7-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: Proposed patch
Sample file exhibiting the issue

Description Trejkaz (pen name) 2011-09-16 04:32:15 UTC
Created attachment 27505 [details]
Proposed patch

I have about a dozen files here where WRITEPROTECT precedes FILEPASS.  I can't seem to reproduce this myself even if I turn on write protect and encryption for the same workbook, but I am attaching a fix for the issue anyway.

I'm also worried about the general nature of the FILEPASS record - what if it can occur literally _anywhere_?  Is POI's approach of only looking a few records deep actually correct in general?
Comment 1 Nick Burch 2011-09-16 08:18:45 UTC
Any chance you could share one of the files with the records in the other order?

In terms of where the records should live in the file, if you could look at the [MS-XLS] spec and see what it says that'd be great. Possibly much quicker is just to run the Microsofto Binary File Format validator against one of them, and see if that declares the files as valid or not?
Comment 2 Nick Burch 2011-09-16 14:59:01 UTC
Looking on Page 162 of the June 8th 2011 Microsoft docs, it looks like WRITEPROTECT shouldn't come first

BOF, FILEPASS and FILEINFO mustn't be encrypted, however WRITEPROTECT isn't in the list of "must not encrypt" so therefore should be encrypted. I believe that all the un-encrypted records should come first
Comment 3 Trejkaz (pen name) 2011-09-17 08:43:06 UTC
Is there anything in the docs that says that if a record is not documented as "MUST NOT be encrypted", then it MUST be encrypted?  I am having trouble finding it.  The way they have worded it seems pretty open to interpretation to me, and they haven't even used "SHOULD" in the area.

I couldn't find anything specifically stating the order of the records either, except that in OpenOffice's documentation, they seem to lump WriteProtect, FilePass, WriteAccess and FileSharing into a "file protection block" at the front of the file, with no specific notes on which order they should occur in (although WriteProtect is documented first in the list, they don't specifically say whether the documented order matches the stored order for the file protection block.

(The WriteProtect record itself appears to be completely empty - maybe someone assumed that since it didn't have any data in it, it didn't need to be encrypted as there was no point?)

As for test data, I don't have any which I can share.  I have been trying to create it, to no avail.  I'd need to get my hands on the same version which generated it.
Comment 4 Nick Burch 2011-09-17 18:47:50 UTC
The page before says that the whole stream must be encrypted, except for special records (which are documented in the list on p162)
Comment 5 Trejkaz (pen name) 2011-09-19 01:54:15 UTC
Created attachment 27531 [details]
Sample file exhibiting the issue

Good news - I have found some sample data.  Our own functional tests failed when I put in the fix.  I notice that the AppName is "Microsoft Excel" for this file as well.  Given that it's created by a guy who used to work here and appears to be a file exclusively for testing (the email it was inside had a subject to that effect), I have a greater trust in this file than the other ones I had from a client.

Attaching the file.
Comment 6 Trejkaz (pen name) 2011-09-19 02:00:45 UTC
By the way, what section is that text in, so that I can find it in the online version (which of course doesn't use page numbers)?

Section 2.2.10 is what I have been reading:

http://msdn.microsoft.com/en-us/library/dd905723(v=office.12).aspx

It doesn't say it.  And the section immediately before it, 2.2.9, doesn't have anything about encryption in it.
Comment 7 Nick Burch 2011-09-19 10:08:42 UTC
I'm reading section 2.2.10. My copy states that the "Workbook Stream" must be encrypted, with a * note. The note a page later gives the list of the very few records that aren't to be encrypted
Comment 8 Nick Burch 2011-09-19 11:44:40 UTC
I figured a slightly simpler fix, so I've applied it in r1172575 (along with a test) rather than worrying any further about what should or shouldn't be present!