Apache OpenOffice (AOO) Bugzilla – Full Text Issue Listing |
Summary: | SHA1/MD5 in sal/rtl/source/digest.c give invalid results for some inputs (ODF interop issues) | ||
---|---|---|---|
Product: | General | Reporter: | JimF <jfoug> |
Component: | security | Assignee: | AOO security list <security> |
Status: | CONFIRMED --- | QA Contact: | |
Severity: | Major | ||
Priority: | P2 | CC: | damjan, jfoug, mseidel, oooforum, petko |
Version: | 3.3.0 or older (OOo) | Flags: | mseidel:
4.2.0_release_blocker?
|
Target Milestone: | --- | ||
Hardware: | All | ||
OS: | All | ||
Issue Type: | DEFECT | Latest Confirmation in: | 4.2.0-dev |
Developer Difficulty: | --- |
Description
JimF
2018-01-15 22:17:52 UTC
>The bug has been reported to Libre Office, (same source base), and is being
>actively handled there.
Well, maybe you could ask to the developer to publish his patch under AL v.2. We'll happy to commit it into AOO trunk.
I am not part of either the AOO group (in any way) or in the Libra office group. I simply found this bug while building other tools to process the ODF data files, doing data recovery. There were files created by both suites which were NOT correctly finding known passwords using FIPS compliant SHA1 algorithm. I simply investigated the raw source, found the problem, was fully able to recreate the problem, AND to recreate the broken values found in the file (thus matching the hash), by adding an extra 64 bytes of NULL data to the data being hashed. I simply brought the bug forward to both Apache and to Libre. So far, Libra has taken steps to correct their SHA1 and MD5. I believe they may have removed the MD5 code, since it was deprecated and unused. For the SHA1, they corrected the hash, BUT kept the older buggy code. Upon import of ODF file, if the correct hash did not correctly work for the user password, the buggy code was used. If that buggy code produced a correct match, then the ODF file was allowed to be properly opened, BUT when re-saved, the ODF file was done so using proper SHA1, so that the non-compliant saved file was not carried forward. NOTE, Libre actually HAD test vectors which they had commented out (for longer hashes), due to those test vectors failing. However, once the bug was fixed as I recommended, those test vectors worked properly. The test vectors were FIPS compliant, so with the buggy code, those test vectors were failing. Now they can be included due to proper functionality and the hashes produced all now match. How the Apache group chooses to deal with this bug is beyond the scope of my involvement. It certainly is not the worst issue ever. A ODF file saved by AOO which triggers the bug will be validated and opened by AOO with the buggy SHA1. But it will not be usable in any other suite. I am simply the messenger, finding and documenting the bug, and trying my best to document how to do the 'raw' fix, and also posting warnings about fixing this and then having users already saved files (in the non-compliant format) NOT being able to be opened properly using only the new correct and compliant code. Jim. (In reply to JimF from comment #2) > I am simply the messenger, finding and documenting the bug Well, AOO project is only composed of volunteers And a messenger can involved and become one of them. I have asked on LO dev List and received no answer since. https://lists.freedesktop.org/archives/libreoffice/2019-July/083191.html The discussion on OpenOffice dev List surfaced a nice alternative by using the code from APR. This is maybe more effort but is in general a more interesting direction to follow. https://apr.apache.org/ Confirming, this can be reproduced by fiddling with the unit test in sal/qa/rtl/digest/rtl_digest.cxx. Since 52 <= length(data)%64 <= 55 gives wrong hash results, and length(data) % 64 gives a value between 0 and 63, this is 64 possible values. The broken range of these values, 52-55, gives 55-52+1 = 4 possible values, and 4 out of 64 = 6.25% of all files written with the previous wrong hash, will be unreadable if this change is made. That's many files that we would reject, just because they were written with an older buggy version. We really should do something to check for the bad hash and open them anyway. But we also need to stop writing these invalid files - as soon as possible. I remember users complaining about OpenOffice not being able to open their password protected files anymore. Maybe these files were opened/saved with LO and then AOO couldn't verify the (new) correct hash. For the user this appears as "data loss". I agree that this should be fixed ASAP. In main/sal/rtl/source/digest.c these are the buggy functions: static void __rtl_digest_endMD5 (DigestContextMD5 *ctx) static void __rtl_digest_endSHA (DigestContextSHA *ctx) As per OpenGrok, the tree of their callers is as follows: __rtl_digest_endMD5() is called by: rtl_digest_getMD5() which is called by: 1. sal's digest unit tests in main/sal/qa/rtl/digest/rtl_digest.cxx. 2. main/vcl/source/gdi/pdfwriter_impl2.cxx PDFWriterImpl::computeEncryptionKey() PDFWriterImpl::computeODictionaryValue() PDFWriterImpl::computeUDictionaryValue() when writing PDFs, which are really an export-only format, so I don't think we need to patch anything here. 3. trunk/main/vcl/source/gdi/pdfwriter_impl.cxx PDFWriterImpl::computeDocumentIdentifier() PDFWriterImpl::emitTrailer() again, PDF is an export-only format, some files we previously wrote would be corrupt to others, but when this is fixed, they won't be. 4. main/sal/rtl/source/uuid.cxx rtl_createNamedUuid() which is called by a unit test in main/sal/qa/rtl/uuid/rtl_Uuid.cxx, but never called by any other code. 5. main/sdext/source/pdfimport/pdfparse/pdfentries.cxx PDFFile::decrypt() password_to_key() check_user_password() ok so maybe we import PDFs too... That means some encrypted PDFs currently fail to open, but will be fixed. Should we be opening corrupt encrypted PDFs that earlier versions wrote though? 6. main/tools/bootstrp/md5.cxx calc_md5_checksum() +-main() in main/tools/bootstrp/so_checksum.cxx, which builds the so_checksum binary, which checksums command line arguments and prints out their MD5, and never appears to be used during the build. 7. main/sdext/source/pdfimport/filterdet.cxx checkDocChecksum() +-getAdditionalStream() +-PDFDetector::detect() +-PDFIHybridAdaptor::filter() 8. main/sal/osl/unx/signal.c calc_md5_checksum() +-ReportCrash() 9. main/crashrep/source/win32/soreport.cpp calc_md5_checksum() which is unused, it's in an "#if 0" block. So far it looks to me like for MD5 we don't need compatibility with previously written files with badly calculated checksums. Let's see SHA next. No wait, there's another file of rtl_digest_getMD5() callers that I missed: rtl_digest_getMD5() +-1 __rtl_digest_MD5 | | struct Direct_Impl's m_get | +-1 rtl_digest_get() | +-1 main/oox/source/core/filterdetect.cxx | | only for SHA1 | +-2 main/sal/rtl/source/random.c | | uses RTL_RANDOM_DIGEST, irrelevant to us. | +-3 OConnectionWrapper::createUniqueId() | | uses SHA1, not MD5 | +-4 OfficePipeId::operator() | | really uses MD5, hashes installation path for IPC ID. | +-5 CreateMD5FromString() | +-1 OfficeIPCThread::EnableOfficeIPCThread() | Also hashes installation path for IPC ID. +-2 rtl_digest_MD5() | +-1 PDFWriterImpl::checkAndEnableStreamEncryption() | +-2 PDFWriterImpl::enableStringEncryption() | +-3 prepareIV() in main/svl/source/passwordcontainer/passwordcontainer.cxx | | +- PasswordContainer::DecodePasswords() | | +- PasswordContainer::EncodePasswords() | +-4 calc_md5_checksum() in main/crashrep/source/win32/soreport.cpp | +- WriteChecksumFile() | Used in the crash report. +-3 rtl_digest_initHMAC_MD5() | +-1 __rtl_digest_HMAC_MD5 | | | struct Direct_Impl's m_init | | +-1 rtl_digest_init() | | +- included in all the places seen for 1.1 | +-2 rtl_digest_HMAC_MD5() | only used in a test +-4 rtl_digest_getHMAC_MD5() +-1 rtl_digest_HMAC_MD5() | same as 3.2 +-2 __rtl_digest_HMAC_MD5 struct Direct_Impl's m_get same as 1 The only one of potential concern is the PasswordContainer, but that only appears to be used for storing website passwords. The MD5 hash is used for the initialization vector for encryption and decryption, so the changed MD5 function might mean users can't decrypt their stored (website?) passwords and have to re-enter them all. Others are of far lesser concern, eg. IIRC the office IPC ID is opaque anyway, PDF writing was already discussed, and crash reports are currently unused. Actually wait, we have more MD5 users: (struct) Digest_Impl __rtl_digest_MD5 +-2 rtl_digest_createMD5() +-1 all the places we saw already +-2 rtl_digest_create() with rtl_Digest_AlgorithmMD5 +-1 BinaryCodec_RCF() in main/oox/source/core/binarycodec.cxx +-2 MSCodec_Std97::MSCodec_Std97() | in main/filter/source/msfilter/mscodec.cxx +-3 GenerateStd97Key() in main/comphelper/source/misc/docpasswordhelper.cxx +-4 OfficePipeId::operator () () | already seen, office IPC ID +-5 CreateMD5FromString() already seen, office IPC ID 2.2.1, 2.2.2 and 2.2.3 deal with Microsoft Office documents, so it's not just ODF that's affected... Actually the Microsoft Office document code (GenerateStd97Key, BinaryCodec_RCF, etc.) all use rtl_digest_rawMD5() instead of rtl_digest_endMD5(), which doesn't have this bug. So it's only PasswordContainer, and I have patch that can read files written with the bad MD5 sum and convert them. Now onto __rtl_digest_endSHA(). __rtl_digest_endSHA() appears to be used by both SHA and SHA1 hash functions. Thankfully, other than in its own unit tests, SHA is completely unused in our code. SHA-1 on the other hand, is impossible, I am giving up and trying an automated tool to generate call graphs (anybody know any?), there is simply too many to draw and go through by hand. Here is what I have so far, finished for SHA but very incomplete for SHA-1: __rtl_digest_endSHA() | +-1 rtl_digest_getSHA() | +-1 __rtl_digest_SHA_0 | | +-1 rtl_digest_SHA() | | | only used in tests | | +-2 rtl_digest_createSHA() | | +-1 rtl_digest_create() with rtl_Digest_AlgorithmSHA | | only used in tests | +-2 rtl_digest_SHA() | see 1.1.1 | | +-2 rtl_digest_getSHA1() +-1 SHA1DigestContext::finalizeDigestAndDispose() | | in main/package/source/package/zipapi/sha1context.cxx | | | +-1 ZipOutputStream::closeEntry() | | | in main/package/source/package/zipapi/ZipOutputStream.cxx | | | deals with WRITING, and should always be writing the CORRECT SHA-1: | | +-1 ZipPackageFolder::saveChild() | | +-2 ZipPackage::WriteMimetypeMagicFile() | | +-3 ZipPackage::WriteManifest() | | +-4 ZipPackage::WriteContentTypes() | | | +-2 ZipFile::StaticGetDigestContextForChecksum() | +-1 ZipFile::StaticHasValidPassword() | | | used when READING, needs to deal with bad SHA-1... | | +-1 ZipFile::StaticGetDataFromRawStream() | | +-2 ZipFile::hasValidPassword() | | +-1 ZipFile::getInputStream() | | +-2 ZipFile::getDataStream() | | | +-2 ZipOutputStream::putNextEntry() | used when WRITING, and should always be writing the CORRECT SHA-1. | +-2 __rtl_digest_SHA_1 | | struct Direct_Impl's m_get | +-1 rtl_digest_get() | +-1 lclDeriveKey() in main/oox/source/core/filterdetect.cxx | | +-1 lclGenerateEncryptionKey() | | see 2.2.1.3 | +-2 lclCheckEncryptionData() | | +-1 PasswordVerifier::verifyEncryptionData() | | | +-1 FilterDetect::extractUnencryptedPackage() | | | | see 2.2.1.3.1.1.3 | | | +-2 DocPasswordHelper::requestAndVerifyDocPassword() | | | see 2.2.1.3.1.1 | | +-2 lclGenerateEncryptionKey() | | see 2.2.1.3 | +-3 lclGenerateEncryptionKey() | | +-1 PasswordVerifier::verifyPassword() | | +-1 DocPasswordHelper::requestAndVerifyDocPassword() | | | in main/comphelper/source/misc/docpasswordhelper.cxx | | +-1 ScfApiHelper::QueryEncryptionDataForMedium() | | | | in main/sc/source/filter/ftools/fapihelper.cxx | | | +-1 XclRoot::RequestEncryptionData() | | | | in main/sc/source/filter/excel/xlroot.cxx | | | +-1 XclImpDecryptHelper::ReadFilepass() | | | +-1 ImportExcel::Read() | | | used when READING .xls files | | | | | +-2 FilterBase::requestEncryptionData() | | | in oox/source/core/filterbase.cxx and used when READING | | +-3 FilterDetect::extractUnencryptedPackage() | | | used when READING ooxml | | +-4 SharedConfigData::requestEncryptionData() | | | in main/oox/source/dump/dumperbase.cxx | | | and used when READING ooxml | | +-5 CheckPasswd_Impl() | | | in main/sfx2/source/appl/appopen.cxx | | +-1 DocumentInserter::CreateMedium() | | | ... | | +-2 SfxObjectShell::LoadOwnFormat() | +-4 OConnectionWrapper::createUniqueId() | used as a key for connection pooling, doesn't persist to disk. | +-3 rtl_digest_SHA1() +-4 rtl_digest_initHMAC_SHA1() +-5 rtl_digest_getHMAC_SHA1() |