Trying to instantiate a SlideShow with the attached file throws the following exception:

java.lang.ArrayIndexOutOfBoundsException - 39
    at org.apache.poi.util.LittleEndian.getInt(LittleEndian.java:140)
    at org.apache.poi.hslf.record.StyleTextPropAtom.setParentTextSize(StyleTextPropAtom.java:259)
    at org.apache.poi.hslf.model.TextRun.<init>(TextRun.java:95)
    at org.apache.poi.hslf.model.TextRun.<init>(TextRun.java:72)
    at org.apache.poi.hslf.model.Sheet.findTextRuns(Sheet.java:136)
    at org.apache.poi.hslf.model.Slide.<init>(Slide.java:82)
    at org.apache.poi.hslf.usermodel.SlideShow.buildSlidesAndNotes(SlideShow.java:444)
    at org.apache.poi.hslf.usermodel.SlideShow.<init>(SlideShow.java:122)
Created attachment 19307: MS PowerPoint file
This attached MS PowerPoint presentation caused the exception.
The bug has the same origin as Bug 40143. Regards, Yegor
I think this problem has now been fixed, thanks to Yegor's new understanding of the ordering of TextProps in StyleTextPropAtom. I can open your test PowerPoint document without any exceptions, so I'm hoping this is now closed. If you still get problems, can you re-open with a new problem file?
Got this Exception (through Tika) (POI r1175705):

java.lang.ArrayIndexOutOfBoundsException: 16
    at org.apache.poi.hslf.usermodel.SlideShow.buildSlidesAndNotes(SlideShow.java:405)
    at org.apache.poi.hslf.usermodel.SlideShow.<init>(SlideShow.java:109)
(In reply to comment #4)
> Got this Exception (through Tika) (POI r1175705) :
>
> java.lang.ArrayIndexOutOfBoundsException: 16
>     at org.apache.poi.hslf.usermodel.SlideShow.buildSlidesAndNotes(SlideShow.java:405)
>     at org.apache.poi.hslf.usermodel.SlideShow.<init>(SlideShow.java:109)

Any chance you could post a file that shows the problem?
Created attachment 27635: Throws ArrayIndexOutOfBoundsException in SlideShow.buildSlidesAndNotes()
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 4002
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 4002
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 4002
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 4002
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 4002
Found a TextHeaderAtom not followed by a TextBytesAtom or TextCharsAtom: Followed by 4002
Caused by: java.lang.ArrayIndexOutOfBoundsException: 36
    at org.apache.poi.util.LittleEndian.getInt(LittleEndian.java:95)
    at org.apache.poi.hslf.record.StyleTextPropAtom.setParentTextSize(StyleTextPropAtom.java:319)
    at org.apache.poi.hslf.model.TextRun.<init>(TextRun.java:100)
    at org.apache.poi.hslf.model.TextRun.<init>(TextRun.java:77)
    at org.apache.poi.hslf.model.Sheet.findTextRuns(Sheet.java:175)
    at org.apache.poi.hslf.model.Sheet.findTextRuns(Sheet.java:132)
    at org.apache.poi.hslf.model.Slide.<init>(Slide.java:70)
    at org.apache.poi.hslf.usermodel.SlideShow.buildSlidesAndNotes(SlideShow.java:411)
    at org.apache.poi.hslf.usermodel.SlideShow.<init>(SlideShow.java:109)
    at com.open.common.ppt.ReportUtility.writePPTNew(ReportUtility.java:1689)
    at com.auchan.crs.service.report.templatemgt.ReportAjaxService.ajaxCreateReportNew(ReportAjaxService.java:716)
    at com.auchan.crs.view.action.report.templatemgt.ReportAjaxAction.ajaxCreateReportNew(ReportAjaxAction.java:789)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at com.open.framework.view.DispatchAction.dispatchMethod(DispatchAction.java:264)
    ... 26 more
I can't believe this bug has lived for 4 years. :) POI HSLF 3.8 dev
(In reply to comment #5)
> (In reply to comment #4)
> > Got this Exception (through Tika) (POI r1175705) :
> >
> > java.lang.ArrayIndexOutOfBoundsException: 16
> >     at org.apache.poi.hslf.usermodel.SlideShow.buildSlidesAndNotes(SlideShow.java:405)
> >     at org.apache.poi.hslf.usermodel.SlideShow.<init>(SlideShow.java:109)
>
> Any chance you could post a file that shows the problem?

I also have a file that gives similar errors in POI 3.8. I can send it, but not onto a public forum, and I can't clean it up because if I make any changes to it, it no longer gives the error! Is there a way to send it to just a few people, and not have it be available to everyone?

java.lang.ArrayIndexOutOfBoundsException: 60
    at org.apache.poi.util.LittleEndian.getInt(LittleEndian.java:163)
    at org.apache.poi.hslf.record.StyleTextPropAtom.setParentTextSize(StyleTextPropAtom.java:319)
    at org.apache.poi.hslf.model.TextRun.<init>(TextRun.java:100)
    at org.apache.poi.hslf.model.TextRun.<init>(TextRun.java:77)
    at org.apache.poi.hslf.model.Sheet.findTextRuns(Sheet.java:170)
    at org.apache.poi.hslf.model.Sheet.findTextRuns(Sheet.java:131)
    at org.apache.poi.hslf.model.Slide.<init>(Slide.java:70)
    at org.apache.poi.hslf.usermodel.SlideShow.buildSlidesAndNotes(SlideShow.java:411)
    at org.apache.poi.hslf.usermodel.SlideShow.<init>(SlideShow.java:109)
Do the problem files pass the Binary File Format Validator? <http://poi.apache.org/faq.html#faq-N10109> And do you know how the files were generated?
(In reply to comment #10)
> Do the problem files pass the Binary File Format Validator?
> <http://poi.apache.org/faq.html#faq-N10109> And do you know how the files were
> generated?

Thanks for the validator link; the file DOES pass the validator. I can't tell how it was generated, but I suspect it came from an old version of PowerPoint.
Any chance that you could use the tools in org.apache.poi.hslf.dev to produce a hex dump of your StyleTextPropAtoms? Ideally just the problematic one (you might need to fetch all the records, and iterate through setting the text size until you find the one that breaks).

HSLF seems to have an idea of the number of parts that make up the style, and it's running off the end partway through reading a style, so we need the hex dump to work out why.
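To illustrate what "running off the end" means here, the following is a minimal sketch of a bounds-checked 4-byte little-endian read. It is not POI's LittleEndian code; the class name LittleEndianSketch and its getInt method are hypothetical and only show why a read that needs 4 bytes fails when fewer remain in the record.

```java
// Hypothetical sketch, not POI's implementation: a little-endian int read
// that reports a clear error instead of ArrayIndexOutOfBoundsException.
final class LittleEndianSketch {

    private LittleEndianSketch() {}

    // Reads 4 bytes starting at offset, least significant byte first.
    public static int getInt(byte[] data, int offset) {
        if (offset < 0 || offset > data.length - 4) {
            throw new IllegalArgumentException(
                "Need 4 bytes at offset " + offset
                + " but the buffer holds only " + data.length + " bytes");
        }
        return (data[offset] & 0xFF)
             | (data[offset + 1] & 0xFF) << 8
             | (data[offset + 2] & 0xFF) << 16
             | (data[offset + 3] & 0xFF) << 24;
    }
}
```

With a check like this, the failing offset and the actual record length show up directly in the error message, which may help when matching it against a hex dump of the record.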
(In reply to comment #12)
> Any chance that you could use the tools in org.apache.poi.hslf.dev to produce a
> hex dump of your StyleTextPropAtoms? Ideally just the problematic one (you
> might need to fetch all the records, and iterate through setting the text size
> until you find the one that breaks)
>
> HSLF seems to have an idea on the number of parts that make up the style, and
> it's running off the end part way through reading a style, so we need to get
> the hex dump to work out why

I may be able to figure that out, though I would need more help. And I suspect it would then involve a lot of back and forth after that.

Nick, can I send you the file instead?
(In reply to comment #13)
> I may be able to figure that out, though I would need more help. And I suspect
> it would then involve a lot of back and forth after that.
>
> Nick, can I send you the file instead?

I'm a bit busy at work at the moment (paying customers and all that!), so I won't have a chance to look at it soon, sorry. One of the other POI developers may do though.

Another option is just to run the problem code in a debugger, and use that to grab the header and rawContents of the StyleTextPropAtom that blows up.
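Once the bytes have been pulled out in a debugger, they still need to be turned into something that can be pasted into a comment. The helper below is a small stand-alone sketch for that; HexDumpSketch and hexDump are hypothetical names, and it uses only the JDK. POI also ships its own HexDump utility class in org.apache.poi.util, which would likely be the more natural choice inside a POI checkout.

```java
// Hypothetical helper: formats a byte array as offset-prefixed hex rows,
// 16 bytes per row, so it can be pasted into a bug comment.
final class HexDumpSketch {

    private HexDumpSketch() {}

    public static String hexDump(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int row = 0; row < data.length; row += 16) {
            // Offset of the first byte in this row
            sb.append(String.format("%08X:", row));
            int end = Math.min(row + 16, data.length);
            for (int i = row; i < end; i++) {
                // Mask to avoid sign extension of negative bytes
                sb.append(String.format(" %02X", data[i] & 0xFF));
            }
            sb.append('\n');
        }
        return sb.toString();
    }
}
```

Dumping the 8-byte record header and the raw contents separately keeps the offsets in the dump aligned with the index reported in the exception.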
(In reply to comment #12)
> I'm a bit busy at work at the moment (paying customers and all that!), so I
> won't have a chance to look at it soon, sorry. One of the other POI developers
> may do though.

Thanks, Nick. I will see if any other developer comments, and will send the file to them. In the meantime, I will run it through a debugger and see what I can find.
Fixed in SVN r1553760. The problem was/is that there are note references in the SlideListWithText(Notes) with no corresponding note record, i.e. the references point beyond the last record. The fix simply puts null records in the notesRecords list, so user code that handles such malformed files has to check for null notes. It might be nicer to generate default note entries, but as this seems to be a rare legacy problem, I've skipped the implementation for that. Apart from that, rewriting the files works.
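The idea of the fix, and the null check it pushes onto callers, can be sketched roughly as follows. This is not the committed POI code: NullNotesSketch, resolveNotes and countUsable are hypothetical names, and plain strings stand in for the real note records.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the approach: a reference that points beyond the
// last record yields a null entry instead of an out-of-bounds access.
final class NullNotesSketch {

    private NullNotesSketch() {}

    // Maps each note reference (an index into records) to its record,
    // or to null when no corresponding record exists.
    public static List<String> resolveNotes(int[] noteRefs, String[] records) {
        List<String> notes = new ArrayList<>();
        for (int ref : noteRefs) {
            boolean inRange = ref >= 0 && ref < records.length;
            notes.add(inRange ? records[ref] : null);
        }
        return notes;
    }

    // Caller side: skip the null placeholders left by malformed files.
    public static int countUsable(List<String> notes) {
        int usable = 0;
        for (String note : notes) {
            if (note != null) {
                usable++;
            }
        }
        return usable;
    }
}
```

Keeping a null placeholder preserves the position of every reference in the list, at the cost of requiring a null check wherever notes are read.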