Bug 53475 - Support for more DOCX encryption versions
Alias: None
Product: POI
Classification: Unclassified
Component: POIFS
Version: 3.8-FINAL
Hardware: Macintosh All
Importance: P2 major
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2012-06-27 11:35 UTC by Jan Høydahl
Modified: 2013-11-26 23:46 UTC (History)

Encrypted word doc which crashes POI (33.00 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2012-06-27 11:35 UTC, Jan Høydahl
patch for ignore missing cspname element (23.22 KB, patch)
2013-11-03 20:43 UTC, Andreas Beeker
encrypted doc - AES-128 with 256 bit key (29.00 KB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
2013-11-06 23:45 UTC, Andreas Beeker
Patch for decrypting AES-192/256 (29.11 KB, patch)
2013-11-10 11:40 UTC, Andreas Beeker
JCE-check added to tests and AgileDecryptor (10.66 KB, patch)
2013-11-21 00:00 UTC, Andreas Beeker
patch for encryption support - Part 1 - refactor crypt code (24.93 KB, patch)
2013-11-24 11:34 UTC, Andreas Beeker

Description Jan Høydahl 2012-06-27 11:35:15 UTC
Created attachment 29002
Encrypted word doc which crashes POI


When parsing password-protected OOXML Word files, the EncryptionInfo class has explicit support for (versionMajor == 4 && versionMinor == 4 && encryptionFlags == 0x40), while all other versions are treated the same. For some encrypted DOCX documents this causes an exception:

java.lang.RuntimeException: Salt size != 16 !?
	at org.apache.poi.poifs.crypt.EncryptionVerifier.<init>(EncryptionVerifier.java:121)
	at org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:66)
	at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:211)
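
To illustrate the version check described above, here is a minimal sketch. It is not POI's actual code: the class, enum and method names are hypothetical, and the header layout (two little-endian 16-bit version fields followed by a 32-bit flags field) is an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical sketch; class, enum and method names are not POI's. */
final class EncryptionVersionSketch {
    /** The two code paths described in the report. */
    enum Handling { EXPLICIT_4_4, GENERIC }

    /** Only the 4.4 / 0x40 combination gets its own branch; everything else shares one. */
    static Handling classify(int versionMajor, int versionMinor, int encryptionFlags) {
        if (versionMajor == 4 && versionMinor == 4 && encryptionFlags == 0x40) {
            return Handling.EXPLICIT_4_4;
        }
        return Handling.GENERIC;
    }

    /** Assumed layout: uint16 major, uint16 minor, uint32 flags, all little-endian. */
    static Handling classify(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        int major = buf.getShort() & 0xFFFF;
        int minor = buf.getShort() & 0xFFFF;
        int flags = buf.getInt();
        return classify(major, minor, flags);
    }
}
```

Because every other version falls into the single generic branch, any header variant that the generic reader does not expect surfaces as a late runtime error rather than a clear "unsupported version" message.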

Download Apache Tika 1.1 (http://www.apache.org/dyn/closer.cgi/tika/tika-app-1.1.jar) and start it using 
  java -jar tika-app-1.1.jar password-is-solrcell.docx
which triggers the exception. NOTE: Tika does not yet have an option to pass in a password, but it crashes before we get to decryption.

We need to dig into the various versions that a doc can have and decide which encryption schemes to support. Here is a link to a page that explains the file formats and also provides a .NET program for decryption (I have not had the chance to test it on my example docx file though): http://www.lyquidity.com/devblog/?p=35
Comment 1 Andreas Beeker 2013-11-03 20:43:35 UTC
Created attachment 31004
patch for ignore missing cspname element

The patch works around a missing data element, namely the cspname element in the EncryptionHeader. Although MS-OFFCRYPTO doesn't say that it is optional, LibreOffice and Word Viewer can open the file.
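
A minimal sketch of the idea of tolerating a missing cspname, not the code from the patch: the helper name is hypothetical, and it assumes the name is stored as a null-terminated UTF-16LE string at the end of the header.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical sketch; not the code from the attached patch. */
final class CspNameSketch {
    /**
     * Reads a null-terminated UTF-16LE string from the remaining bytes.
     * If no bytes are left (the element is missing), an empty string is
     * returned instead of failing.
     */
    static String readOptionalCspName(ByteBuffer buf) {
        buf.order(ByteOrder.LITTLE_ENDIAN);
        StringBuilder sb = new StringBuilder();
        while (buf.remaining() >= 2) {
            char c = buf.getChar();
            if (c == 0) {
                break; // terminator reached
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

The point of the design is that the reader checks how many bytes remain before consuming the name, so a header that ends early is treated as "no name" rather than as corrupt data.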

Further encrypted test files with different encryption settings would be nice.
Comment 2 Andreas Beeker 2013-11-06 23:45:45 UTC
Created attachment 31021
encrypted doc - AES-128 with 256 bit key

This attachment is a Word 2010 encrypted .docx with customized encryption settings. I have only changed the key size from 128 to 256 bits, but it might be necessary to change other encryption settings too. Currently POI can't open this file.

LibreOffice 4.0 can't display the file either, but the MS Word Viewer can.

To test the file, simply use the test class from the cspname patch; the password is "pass".

To verify the (registry) settings, have a look in the EncryptionVerifier: the XML descriptor says 256 bits.

Just for further reference:
- To change the word encryption settings, you'll need to use the "office administration templates" (just google for your version)
- use the policy editor "gpedit.msc" and add the word adm file under user policies
- change the registry settings via the templates
- see also http://www.dslreports.com/forum/r20210979-Office-2007-Enabling-256bit-AES-encryption
Comment 3 Andreas Beeker 2013-11-06 23:58:21 UTC
I forgot to mention that Word 2010 has two options: a password for reading and one for edit/write protection. Although both passwords are the same for the test file, this might result in an additional decryption step ...
Comment 4 Nick Burch 2013-11-07 21:30:28 UTC
Thanks for this, applied (with minor test and comments tweaks) in r1539828.
Comment 5 Andreas Beeker 2013-11-10 11:40:03 UTC
Created attachment 31029
Patch for decrypting AES-192/256

AES always has a block size of 128 bits, so we need to take the key size of 128, 192 or 256 bits into account.
Part of the patch fixes a wrong usage of the bit sizes, i.e. the IV has to be calculated from the block size (128 bits), whereas the encrypted key needs to use the key size (e.g. 256 bits).
I'll try to provide a few more test files with other encryption settings; I haven't tested document encryption at all ...

[1] http://msdn.microsoft.com/en-us/library/dd924776(v=office.12).aspx
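
The block-size versus key-size distinction above can be sketched with the standard JCE classes. This is an illustration under stated assumptions, not the patch itself: the class and helper names are hypothetical, and the zero-padding in `fit` is a placeholder rather than the derivation the specification prescribes.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch; names and the fit() padding are illustrative only. */
final class AesSizesSketch {
    /** AES block size is fixed at 128 bits, whatever the key size. */
    static final int AES_BLOCK_BYTES = 16;

    /** Key length in bytes follows the key size: 16, 24 or 32. */
    static int keyBytes(int keyBits) {
        return keyBits / 8;
    }

    /** Placeholder: truncates or zero-pads material to the requested length. */
    static byte[] fit(byte[] material, int length) {
        return Arrays.copyOf(material, length);
    }

    /** Encrypts and decrypts one or more whole blocks; the IV uses the block size, the key uses the key size. */
    static byte[] roundTrip(byte[] keyMaterial, byte[] ivMaterial, int keyBits, byte[] blocks) {
        try {
            SecretKeySpec key = new SecretKeySpec(fit(keyMaterial, keyBytes(keyBits)), "AES");
            IvParameterSpec iv = new IvParameterSpec(fit(ivMaterial, AES_BLOCK_BYTES));

            Cipher enc = Cipher.getInstance("AES/CBC/NoPadding");
            enc.init(Cipher.ENCRYPT_MODE, key, iv);
            byte[] cipherText = enc.doFinal(blocks);

            Cipher dec = Cipher.getInstance("AES/CBC/NoPadding");
            dec.init(Cipher.DECRYPT_MODE, key, iv);
            return dec.doFinal(cipherText);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES round trip failed", e);
        }
    }
}
```

Sizing the IV by the key size instead of the block size goes unnoticed with AES-128, where both are 16 bytes, and only fails once a 192- or 256-bit key is used.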
Comment 6 Nick Burch 2013-11-12 11:38:29 UTC
Thanks for this, applied (with the odd minor tweak) in r1541009.

One thing I did notice is that the code is a little short on JavaDocs in places, and can be a bit short on comments too. If you have a few minutes, while you can still remember the code flow and meaning, it'd be great if you could do a patch to solve that too!
Comment 7 Andreas Beeker 2013-11-21 00:00:13 UTC
Created attachment 31061
JCE-check added to tests and AgileDecryptor

This patch contains the Assume check for whether the JCE restrictions are in place, which Dominik recommended.

With the JUnit 3 code the Assume didn't work, so I needed to convert it to JUnit 4 annotated code.

Furthermore, I've removed some obsolete lines, as the Biff8EncryptionKey is not needed with Agile decryption.

I'll add more javadocs when the encryption stuff is finished and also a comment about the JCE policies in the website docs.
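
For reference, a JCE restriction check of the kind mentioned above can be sketched with the standard `Cipher.getMaxAllowedKeyLength` call. This is a minimal sketch, not the code from the patch; the class and method names are hypothetical.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

/** Hypothetical sketch; not the check from the attached patch. */
final class JceCheckSketch {
    /**
     * Returns true if the installed JCE policy allows AES keys of the
     * given size. With restricted policy files the limit is typically
     * 128 bits, so 192- and 256-bit keys are rejected.
     */
    static boolean aesKeySizeAllowed(int keyBits) {
        try {
            return Cipher.getMaxAllowedKeyLength("AES") >= keyBits;
        } catch (NoSuchAlgorithmException e) {
            return false; // AES not available at all
        }
    }
}
```

In a JUnit 4 test this could be combined with `org.junit.Assume.assumeTrue(JceCheckSketch.aesKeySizeAllowed(256))`, so that the test is skipped rather than failed on a JVM with restricted policies.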
Comment 8 Nick Burch 2013-11-21 11:18:30 UTC
Thanks, latest patch (applied with a few minor tweaks to comments/error messages) in r1544121.
Comment 9 Andreas Beeker 2013-11-24 11:34:27 UTC
Created attachment 31073
patch for encryption support - Part 1 - refactor crypt code

This is part 1 (of presumably 4-5 parts).

As this is a bigger change, I'll post changes as soon as a given feature compiles and its tests pass reliably.

I plan the following parts:
- Part 1: refactor decryption code, so I can use it for encryption
- Part 2: xmlbeans support for encryption descriptor (see details at Part 2)
- Part 3: encryption classes
- Part 4: move en-/decryption code out of main-poi???

As part 2 will break the release, i.e. you'll need xmlbeans for the main-poi, you might want to wait until all parts are out. And of course I wouldn't mind a discussion of "xmlbeans vs. static xml strings in code".

Currently the patches are based on the trunk, so part X contains the changes of part X-1, ... I'll update the diffs once predecessor parts have been applied.
Comment 10 Nick Burch 2013-11-24 18:26:46 UTC
Anything that's to do with xmlbeans needs to live in the poi-ooxml jar, not the main poi one. Possibly that means we need to move some of the encrypt/decrypt code into the poi-ooxml jar

Also, it might make sense to open a new bug for this, rather than using this one, so it's easier to track the new features
Comment 11 Andreas Beeker 2013-11-26 23:46:49 UTC
(In reply to comment #10)
> Also, it might make sense to open a new bug for this, rather than using this
> one, so it's easier to track the new features

I've created the bug entry #55818 and will log my progress there ...