Bug 58822

Summary: Error is printed on stderr when parsing some ppt files
Product: POI
Reporter: Jiri Banszel <jiri.banszel>
Component: HSLF
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: normal    
Priority: P2    
Version: 3.13-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   
Bug Depends on: 58829    
Bug Blocks:    

Description Jiri Banszel 2016-01-08 08:01:40 UTC
We encountered this problem when using Tika 1.11, which embeds POI 3.13. The error below is logged to stderr when parsing some ppt files. Unfortunately, I cannot attach such a file because they are confidential. Although the error seems to be harmless, it occurs so often that the application log is flooded with these exceptions, and we had to create a patch for it.

How we patched it: in class org.apache.poi.hslf.record.TxMasterStyleAtom, line 72, an exception is caught and printed to stderr ("e.printStackTrace();"). We replaced that call with "POILogFactory.getLogger(TxMasterStyleAtom.class).log(POILogger.WARN, "Exception when reading available styles", e);".
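For illustration, the pattern behind the patch can be sketched in a self-contained form. This is not the POI source: the class StyleReaderSketch and the methods readShort and tryReadStyles are hypothetical stand-ins, and java.util.logging is used in place of POILogFactory/POILogger so that the sketch compiles without POI on the classpath. It only shows the general idea of replacing e.printStackTrace() with a logger call that carries the same message and exception.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for TxMasterStyleAtom; not the actual POI class.
final class StyleReaderSketch {
    // java.util.logging is an assumption here; the real patch uses POILogFactory/POILogger.
    private static final Logger LOG = Logger.getLogger(StyleReaderSketch.class.getName());

    // Reads a little-endian short; throws ArrayIndexOutOfBoundsException
    // when the offset lies past the end of the buffer, as in the trace below.
    static short readShort(byte[] data, int offset) {
        return (short) ((data[offset] & 0xFF) | ((data[offset + 1] & 0xFF) << 8));
    }

    // Returns true if the value could be read, false if reading failed.
    static boolean tryReadStyles(byte[] data, int offset) {
        try {
            readShort(data, offset);
            return true;
        } catch (Exception e) {
            // Before the patch: e.printStackTrace();
            // After the patch: hand the exception to the logging framework instead.
            LOG.log(Level.WARNING, "Exception when reading available styles", e);
            return false;
        }
    }
}
```

Calling StyleReaderSketch.tryReadStyles(new byte[110], 110) takes the catch branch and emits one WARNING record. Note that java.util.logging's default configuration still writes WARNING records to the console; the difference is that the output can now be filtered or redirected through the logging configuration rather than being hard-wired to stderr.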

java.lang.ArrayIndexOutOfBoundsException: 110
        at org.apache.poi.util.LittleEndian.getShort(LittleEndian.java:224)
        at org.apache.poi.hslf.model.textproperties.TabStopPropCollection.parseProperty(TabStopPropCollection.java:100)
        at org.apache.poi.hslf.model.textproperties.TextPropCollection.buildTextPropList(TextPropCollection.java:224)
        at org.apache.poi.hslf.record.TxMasterStyleAtom.init(TxMasterStyleAtom.java:157)
        at org.apache.poi.hslf.record.TxMasterStyleAtom.<init>(TxMasterStyleAtom.java:67)
        at sun.reflect.GeneratedConstructorAccessor498.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
        at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
        at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
        at org.apache.poi.hslf.record.Environment.<init>(Environment.java:54)
        at sun.reflect.GeneratedConstructorAccessor690.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
        at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
        at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
        at org.apache.poi.hslf.record.Document.<init>(Document.java:122)
        at sun.reflect.GeneratedConstructorAccessor688.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
        at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
        at org.apache.poi.hslf.record.Record.buildRecordAtOffset(Record.java:103)
        at org.apache.poi.hslf.usermodel.HSLFSlideShowImpl.read(HSLFSlideShowImpl.java:286)
        at org.apache.poi.hslf.usermodel.HSLFSlideShowImpl.buildRecords(HSLFSlideShowImpl.java:267)
        at org.apache.poi.hslf.usermodel.HSLFSlideShowImpl.<init>(HSLFSlideShowImpl.java:178)
        at org.apache.poi.hslf.usermodel.HSLFSlideShow.<init>(HSLFSlideShow.java:171)
        at org.apache.tika.parser.microsoft.HSLFExtractor.parse(HSLFExtractor.java:61)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:149)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:117)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
        at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72)
        at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:102)
        at org.apache.tika.parser.pkg.PackageParser.parseEntry(PackageParser.java:219)
        at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:182)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:136)
Comment 1 Tim Allison 2016-01-08 17:08:15 UTC
Thank you for raising this. We should probably fix this in the handful of other places where we have a printStackTrace...