Issue 128579 - 32-bit editengine regression causes a crash when opening PPT files
Summary: 32-bit editengine regression causes a crash when opening PPT files
Status: RESOLVED FIXED
Alias: None
Product: Impress
Classification: Application
Component: open-import (show other issues)
Version: 4.2.0-dev
Hardware: All All
: P5 (lowest) Normal (vote)
Target Milestone: 4.2.0
Assignee: AOO issues mailing list
QA Contact:
URL:
Keywords: crash, regression
Depends on:
Blocks:
 
Reported: 2023-10-05 16:20 UTC by damjan
Modified: 2024-01-03 15:41 UTC (History)
1 user (show)

See Also:
Issue Type: DEFECT
Latest Confirmation in: 4.2.0-dev
Developer Difficulty: ---
damjan: 4.2.0_release_blocker?


Attachments

Note You need to log in before you can comment on or make changes to this issue.
Description damjan 2023-10-05 16:20:07 UTC
Somewhere in the 32 commits between c7ace38fedbe61bc12c11cf4f428626429620f06 and 36a464621660bc9c55b210922d14dc3735a9077e there was a regression that caused opening PPT files to crash AOO.

This regression was introduced as part of the fix for bug 57176. It is difficult to narrow down further, as not all the commits in that range compile.

I first found this bug while investigating another PPT crash issue, bug 127783, but that one is Windows-only, while this one affects at least FreeBSD, and probably all systems.

The sample file on bug 127783 can be used to reproduce the crash.
Comment 1 damjan 2023-10-05 16:23:26 UTC
Adding keywords. Should be a 4.2.0 blocker.

Since the crash seemed to involve memory corruption, maybe building main/editeng  with address sanitation ("-fsanitize=address") will reveal where the memory corruption starts.
Comment 2 damjan 2023-12-31 18:05:40 UTC
So I tried building and testing various commits in that range of 32 commits that caused this regression:

Commit						Builds?	 Works?
36a464621660bc9c55b210922d14dc3735a9077e		
9ec87cd3b34edb40cfe3e776689fb643e6988567		
8ee1071cd11ad3f8718884f8c00e3b804d0af3b1		
2cdd1ab0954d1a9988859888d6f6959a67d65693		
c2eaa082a297a157b60dd87990a41542135872c8		
9d9bf804fbab7089311c26afdf81e1d8d28d853b		
0f74efc0ec8e3d3afd0cecccb02fbb5c870cb9f2		
5707f3c20a9e1a1c755e034eba60a84635e66cc1	FALSE	
79496c0e51bec39643863c32709ffb17c141f62f		
7a5aa0144b33f34e6026854c21acbf8d7489983a		
ced59291f7a8e2ed44ad73b26bcdef75827bc7b2		
86af277237f2917ff5a5d13dc3647298c7d998f6		
fbcf52f4d5facbb0884558ee718f6746df6bc62f		
87936a440797b7a985fef974d60f4a48c39b855c		
961599b3ba1767fc0f5ca79ba244c90113c3aa24		
8f081814dede3b5e8825d4cd6e5c9c363b19f21e		
67866f572d8a86e153be747862ec69d5c4477d82	FALSE	
d8fd122573a3a07dedb0b01271fbbe9e47077b7b	FALSE	
3974c0663f06bb0a4e51c69a5d49c61022b8fd5b		
209925efc4daf4fbf9be8d0ac9cf3fe7fc360b24		
1e6c8db2b969e0b438d749b8ed79f6a08fe75d2c		
434f02cad7370eb311370dc8dcc01b377a85f9e8	FALSE	
bb99baf43803e8317e9f40ec033cd2f7a36b87e9	FALSE	
597c409aade8823e17f55ff8711dc896ad293690	FALSE	
20e487bda952ee879baf3ee086928fd08706c564	FALSE	
fffd1d5de18ed84b4db5b15a57540d804700e38c	FALSE	
3155dbc2d6387a8f19e0cf04cd12c3d4501eaef4	FALSE	
a38ef00207b4524282a09fe5cd78defe45caf613	FALSE	
8dc46c982409f2e0e073dd55d6b2e4241173c266	TRUE	FALSE
3228dc3342001440d11c7fb59e0ed697f17dc10f	TRUE	FALSE
a890f778ee3da0f9a88d833ee3aa73bbd9d6406e	TRUE	FALSE
765ae79b138b5cc2543282200466b2ef4be7be8e	TRUE	TRUE


Most don't build, but by some miracle, out of the first 4 that did, it was definitely commit a890f778ee3da0f9a88d833ee3aa73bbd9d6406e that caused this regression.

Probably the changed return values from ParaPortionList::FindParagraph() (0xFFFF to SAL_MAX_UINT32) or FastGetPos() (USHRT_MAX to SAL_MAX_UINT32) are not checked properly by some caller, resulting in it not detecting the element does not exist, and then trying to access an invalid element (possibly NULL?) leads to the crash.
Comment 3 damjan 2024-01-03 04:05:28 UTC
The bug has to do with how integer overflow behaves with different integer type sizes.

Patching main/editeng/source/editeng/editdoc2.cxx to print out its parameters and the computer "nEnd" value shows that, when we have:

static sal_uInt16 FastGetPos( const VoidPtr *pPtrArray, sal_uInt16 nPtrArrayLen,
                          VoidPtr pPtr, sal_uInt16 &rLastPos )

this is printed:

FastGetPos(0x899aca8d0, 1, 0x899c65710, 22)
  nEnd = 1

but when we have:

static sal_uInt16 FastGetPos( const VoidPtr *pPtrArray, sal_uInt32 nPtrArrayLen,
                          VoidPtr pPtr, sal_uInt16 &rLastPos )

this is printed instead:

FastGetPos(0x8951ef290, 1, 0x8951d6bf0, 22)
  nEnd = 24

Why? Because in FastGetPos() we do this:

---snip---
static sal_uInt16 FastGetPos( const VoidPtr *pPtrArray, sal_uInt32 nPtrArrayLen,
                          VoidPtr pPtr, sal_uInt16 &rLastPos )
{
  // Through certain filter code-paths we do a lot of appends, which in
  // turn call GetPos - creating some N^2 nightmares. If we have a
  // non-trivially large list, do a few checks from the end first.
  if( rLastPos > 16 )
    {
      sal_uInt16 nEnd;
      if (rLastPos > nPtrArrayLen - 2)
        nEnd = nPtrArrayLen;
      else
        nEnd = rLastPos + 2;
     
      for( sal_uInt16 nIdx = rLastPos - 2; nIdx < nEnd; nIdx++ )
        {
          if( pPtrArray[ nIdx ] == pPtr )
            {
              rLastPos = nIdx;
              return nIdx;
            }
        }
    }
  // The world's lamest linear search from svarray ...
  for( sal_uInt16 nIdx = 0; nIdx < nPtrArrayLen; nIdx++ )
    if (pPtrArray[ nIdx ] == pPtr )
      return rLastPos = nIdx;
  return USHRT_MAX;
}
---snip---

In both those tests, nPtrArrayLen was 1, but in the 16 bit nPtrArrayLen case, nEnd == 1, which means the "if (rLastPos > nPtrArrayLen - 2)" was true, whereas in the 32 bit nPtrArrayLen case, nEnd == 24, which means that "if" was false.

Even in that "for" loop later, rLastPos - 2 could wrap around to the maximum integer value, if rLastPos is 0 or 1. And this was a problem even in the old code!

That whole FastGetPos() function sucks and should be rewritten.
Comment 4 damjan 2024-01-03 04:12:01 UTC
This patch seems to stop the crash when applied to the original regression commit (a890f778ee). Let's see whether it works on the latest trunk too.

---snip---
diff --git a/main/editeng/source/editeng/editdoc2.cxx b/main/editeng/source/editeng/editdoc2.cxx
index 327872b474..604a658a7b 100644
--- a/main/editeng/source/editeng/editdoc2.cxx
+++ b/main/editeng/source/editeng/editdoc2.cxx
@@ -330,7 +330,7 @@ static sal_uInt32 FastGetPos( const VoidPtr *pPtrArray, sal_uInt32 nPtrArrayLen,
   if( rLastPos > 16 )
     {
       sal_uInt32 nEnd;
-      if (rLastPos > nPtrArrayLen - 2)
+      if (rLastPos + 2 > nPtrArrayLen)
                nEnd = nPtrArrayLen;
       else
                nEnd = rLastPos + 2;
---snip---
Comment 5 damjan 2024-01-03 06:34:38 UTC
It fixes the bug on trunk too.

Patch committed in commit 294ee1400602b0a60dabfaa12034b85c1df06f8f. Resolving FIXED.
Comment 6 damjan 2024-01-03 06:44:04 UTC
Cherry-picked for the AOO42X branch too, in commit 93440cfe8c5159735f8bce693fbafd7479717510.
Comment 7 Arrigo Marchiori 2024-01-03 09:33:39 UTC
So, in the end, this was another case of an "optimization" that was broken and ready to "explode"...

We should remember to look for words like "optimize", "accelerate" etc. when we are trying to fix bugs. ;-)

Great job!
Comment 8 damjan 2024-01-03 15:41:07 UTC
Thank you!