Apache OpenOffice (AOO) Bugzilla – Issue 128579
32-bit editengine regression causes a crash when opening PPT files
Last modified: 2024-01-03 15:41:07 UTC
Somewhere in the 32 commits between c7ace38fedbe61bc12c11cf4f428626429620f06 and 36a464621660bc9c55b210922d14dc3735a9077e there was a regression that caused opening PPT files to crash AOO. This regression was introduced as part of the fix for bug 57176. It is difficult to narrow down further, as not all the commits in that range compile. I first found this bug while investigating another PPT crash issue, bug 127783, but that one is Windows-only, while this one affects at least FreeBSD, and probably all systems. The sample file on bug 127783 can be used to reproduce the crash.
Adding keywords. Should be a 4.2.0 blocker. Since the crash seemed to involve memory corruption, maybe building main/editeng with address sanitation ("-fsanitize=address") will reveal where the memory corruption starts.
So I tried building and testing various commits in that range of 32 commits that caused this regression: Commit Builds? Works? 36a464621660bc9c55b210922d14dc3735a9077e 9ec87cd3b34edb40cfe3e776689fb643e6988567 8ee1071cd11ad3f8718884f8c00e3b804d0af3b1 2cdd1ab0954d1a9988859888d6f6959a67d65693 c2eaa082a297a157b60dd87990a41542135872c8 9d9bf804fbab7089311c26afdf81e1d8d28d853b 0f74efc0ec8e3d3afd0cecccb02fbb5c870cb9f2 5707f3c20a9e1a1c755e034eba60a84635e66cc1 FALSE 79496c0e51bec39643863c32709ffb17c141f62f 7a5aa0144b33f34e6026854c21acbf8d7489983a ced59291f7a8e2ed44ad73b26bcdef75827bc7b2 86af277237f2917ff5a5d13dc3647298c7d998f6 fbcf52f4d5facbb0884558ee718f6746df6bc62f 87936a440797b7a985fef974d60f4a48c39b855c 961599b3ba1767fc0f5ca79ba244c90113c3aa24 8f081814dede3b5e8825d4cd6e5c9c363b19f21e 67866f572d8a86e153be747862ec69d5c4477d82 FALSE d8fd122573a3a07dedb0b01271fbbe9e47077b7b FALSE 3974c0663f06bb0a4e51c69a5d49c61022b8fd5b 209925efc4daf4fbf9be8d0ac9cf3fe7fc360b24 1e6c8db2b969e0b438d749b8ed79f6a08fe75d2c 434f02cad7370eb311370dc8dcc01b377a85f9e8 FALSE bb99baf43803e8317e9f40ec033cd2f7a36b87e9 FALSE 597c409aade8823e17f55ff8711dc896ad293690 FALSE 20e487bda952ee879baf3ee086928fd08706c564 FALSE fffd1d5de18ed84b4db5b15a57540d804700e38c FALSE 3155dbc2d6387a8f19e0cf04cd12c3d4501eaef4 FALSE a38ef00207b4524282a09fe5cd78defe45caf613 FALSE 8dc46c982409f2e0e073dd55d6b2e4241173c266 TRUE FALSE 3228dc3342001440d11c7fb59e0ed697f17dc10f TRUE FALSE a890f778ee3da0f9a88d833ee3aa73bbd9d6406e TRUE FALSE 765ae79b138b5cc2543282200466b2ef4be7be8e TRUE TRUE Most don't build, but by some miracle, out of the first 4 that did, it was definitely commit a890f778ee3da0f9a88d833ee3aa73bbd9d6406e that caused this regression. Probably the changed return values from ParaPortionList::FindParagraph() (0xFFFF to SAL_MAX_UINT32) or FastGetPos() (USHRT_MAX to SAL_MAX_UINT32) are not checked properly by some caller, resulting in it not detecting the element does not exist, and then trying to access an invalid element (possibly NULL?) leads to the crash.
The bug has to do with how integer overflow behaves with different integer type sizes. Patching main/editeng/source/editeng/editdoc2.cxx to print out its parameters and the computer "nEnd" value shows that, when we have: static sal_uInt16 FastGetPos( const VoidPtr *pPtrArray, sal_uInt16 nPtrArrayLen, VoidPtr pPtr, sal_uInt16 &rLastPos ) this is printed: FastGetPos(0x899aca8d0, 1, 0x899c65710, 22) nEnd = 1 but when we have: static sal_uInt16 FastGetPos( const VoidPtr *pPtrArray, sal_uInt32 nPtrArrayLen, VoidPtr pPtr, sal_uInt16 &rLastPos ) this is printed instead: FastGetPos(0x8951ef290, 1, 0x8951d6bf0, 22) nEnd = 24 Why? Because in FastGetPos() we do this: ---snip--- static sal_uInt16 FastGetPos( const VoidPtr *pPtrArray, sal_uInt32 nPtrArrayLen, VoidPtr pPtr, sal_uInt16 &rLastPos ) { // Through certain filter code-paths we do a lot of appends, which in // turn call GetPos - creating some N^2 nightmares. If we have a // non-trivially large list, do a few checks from the end first. if( rLastPos > 16 ) { sal_uInt16 nEnd; if (rLastPos > nPtrArrayLen - 2) nEnd = nPtrArrayLen; else nEnd = rLastPos + 2; for( sal_uInt16 nIdx = rLastPos - 2; nIdx < nEnd; nIdx++ ) { if( pPtrArray[ nIdx ] == pPtr ) { rLastPos = nIdx; return nIdx; } } } // The world's lamest linear search from svarray ... for( sal_uInt16 nIdx = 0; nIdx < nPtrArrayLen; nIdx++ ) if (pPtrArray[ nIdx ] == pPtr ) return rLastPos = nIdx; return USHRT_MAX; } ---snip--- In both those tests, nPtrArrayLen was 1, but in the 16 bit nPtrArrayLen case, nEnd == 1, which means the "if (rLastPos > nPtrArrayLen - 2)" was true, whereas in the 32 bit nPtrArrayLen case, nEnd == 24, which means that "if" was false. Even in that "for" loop later, rLastPos - 2 could wrap around to the maximum integer value, if rLastPos is 0 or 1. And this was a problem even in the old code! That whole FastGetPos() function sucks and should be rewritten.
This patch seems to stop the crash when applied to the original regression commit (a890f778ee). Let's see whether it works on the latest trunk too. ---snip--- diff --git a/main/editeng/source/editeng/editdoc2.cxx b/main/editeng/source/editeng/editdoc2.cxx index 327872b474..604a658a7b 100644 --- a/main/editeng/source/editeng/editdoc2.cxx +++ b/main/editeng/source/editeng/editdoc2.cxx @@ -330,7 +330,7 @@ static sal_uInt32 FastGetPos( const VoidPtr *pPtrArray, sal_uInt32 nPtrArrayLen, if( rLastPos > 16 ) { sal_uInt32 nEnd; - if (rLastPos > nPtrArrayLen - 2) + if (rLastPos + 2 > nPtrArrayLen) nEnd = nPtrArrayLen; else nEnd = rLastPos + 2; ---snip---
It fixes the bug on trunk too. Patch committed in commit 294ee1400602b0a60dabfaa12034b85c1df06f8f. Resolving FIXED.
Cherry-picked for the AOO42X branch too, in commit 93440cfe8c5159735f8bce693fbafd7479717510.
So, in the end, this was another case of an "optimization" that was broken and ready to "explode"... We should remember to look for words like "optimize", "accelerate" etc. when we are trying to fix bugs. ;-) Great job!
Thank you!