Bug 41015 - method RichTextRun.getText() throws StringIndexOutOfBoundsException
Alias: None
Product: POI
Classification: Unclassified
Component: HSLF
Version: unspecified
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2006-11-22 05:38 UTC by Erez
Modified: 2006-11-28 07:36 UTC

ppt example (10.00 KB, application/vnd.ms-powerpoint)
2006-11-22 05:39 UTC, Erez
The patch with the fix (2.79 KB, patch)
2006-11-26 05:41 UTC, Yegor Kozlov
ppt to add to src\scratchpad\testcases\org\apache\poi\hslf\data (12.50 KB, application/vnd.ms-powerpoint)
2006-11-26 05:42 UTC, Yegor Kozlov
Modified test case (15.54 KB, application/octet-stream)
2006-11-26 05:43 UTC, Yegor Kozlov
Modified StyleTextPropAtom (22.91 KB, application/octet-stream)
2006-11-26 05:45 UTC, Yegor Kozlov

Description Erez 2006-11-22 05:38:54 UTC
When invoking the method
org.apache.poi.hslf.usermodel.RichTextRun.getText(), a
StringIndexOutOfBoundsException is thrown.

It seems that the length member of this instance has a very large value
(1572863), while the getText() method of the TextRun returns a shorter String.
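
To illustrate the symptom, here is a minimal sketch. It assumes that a rich text run is a (start, length) window onto its parent run's text and that the text is cut out with String.substring; the class and method names are hypothetical and this is not the actual POI source.

```java
public class RunSliceSketch {
    // Hypothetical stand-in for a rich text run: a (start, length) window
    // onto the text of the parent run.
    static String slice(String parentText, int start, int length) {
        // substring throws StringIndexOutOfBoundsException when the end
        // index is past the end of the parent text.
        return parentText.substring(start, start + length);
    }

    public static void main(String[] args) {
        String parentText = "Hello, slide";

        // A sane length works as expected.
        System.out.println(slice(parentText, 0, 5));

        // A length like the one reported here is far longer than the text.
        try {
            slice(parentText, 0, 1572863);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("caught: " + e.getClass().getSimpleName());
        }
    }
}
```

If the real code works along these lines, the exception is only a symptom, and the question is where the oversized length comes from.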
Comment 1 Erez 2006-11-22 05:39:26 UTC
Created attachment 19158 [details]
ppt example
Comment 2 Yegor Kozlov 2006-11-26 05:40:15 UTC

It looks like the definition of the potential paragraph properties in
StyleTextPropAtom was wrong.

I added the following property to the end:

   new TextProp(2, 0x200000, "para_unknown_7")

With this change everything works correctly.

I don't know what this property means yet. For now we just read it, and can
make sense of it later.
The patch is attached.

Regards, Yegor
Comment 3 Yegor Kozlov 2006-11-26 05:41:28 UTC
Created attachment 19172 [details]
The patch with the fix
Comment 4 Yegor Kozlov 2006-11-26 05:42:45 UTC
Created attachment 19173 [details]
ppt to add to src\scratchpad\testcases\org\apache\poi\hslf\data
Comment 5 Yegor Kozlov 2006-11-26 05:43:50 UTC
Created attachment 19174 [details]
Modified test case
Comment 6 Yegor Kozlov 2006-11-26 05:45:23 UTC
Created attachment 19175 [details]
Modified StyleTextPropAtom
Comment 7 Nick Burch 2006-11-28 07:36:16 UTC
The data format used in StyleTextPropAtom is so stupid and brittle I'm amazed we
haven't had one of these before...

If we don't know about all the different properties (especially the ones at the
end), we'll think we're done with one set of properties when there's still data
left for it (the format gives us no way to tell). We'll then try to treat the
next bit of data as the start of a new set of properties, even though it's
really the tail of the previous one. Thus, we end up with really silly values
for text length, because the length is read from near the start of each set :(
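
The failure mode described above can be sketched with a toy parser. The layout here is a simplified assumption, not the exact PowerPoint format: each set is an int32 character count, an int32 mask, and one 2-byte value per mask bit that is set. The class name, method names and sample values are all hypothetical and chosen purely for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

public class MaskedPropsSketch {
    // Reads the character count of every property set in the data.
    // knownMasks lists the mask bits this parser knows about; every
    // property value is assumed to be 2 bytes wide.
    static List<Integer> readCharCounts(byte[] data, int[] knownMasks) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        List<Integer> counts = new ArrayList<>();
        while (buf.remaining() >= 8) {
            counts.add(buf.getInt());   // text length comes first
            int mask = buf.getInt();
            for (int known : knownMasks) {
                if ((mask & known) != 0 && buf.remaining() >= 2) {
                    buf.getShort();     // consume the value of a known property
                }
            }
            // Values for mask bits we do not know about are never consumed,
            // so the next loop iteration starts in the middle of this set.
        }
        return counts;
    }

    // Two sets, each with mask 0x200001: one value for bit 0x1 and one
    // for bit 0x200000.
    static byte[] sample() {
        ByteBuffer buf = ByteBuffer.allocate(24).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(10).putInt(0x200001).putShort((short) 1).putShort((short) 0xFFFF);
        buf.putInt(20).putInt(0x200001).putShort((short) 2).putShort((short) 0xFFFF);
        return buf.array();
    }

    public static void main(String[] args) {
        // Parser that knows both bits: prints [10, 20]
        System.out.println(readCharCounts(sample(), new int[] {0x1, 0x200000}));
        // Parser that is missing 0x200000: prints [10, 1376255]
        System.out.println(readCharCounts(sample(), new int[] {0x1}));
    }
}
```

With the 0x200000 bit missing from the table, the unread 2-byte value is glued onto the front of the next character count, which gives the same kind of silly length as in this report.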

Good spot, cheers for the patch. I've committed it.