Bug 64197 - AssertionError thrown when processing embedded EMF in doc file
Status: NEW
Alias: None
Product: POI
Classification: Unclassified
Component: POIFS
Version: 4.1.1-FINAL
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
Reported: 2020-03-04 16:30 UTC by Raman Gupta
Modified: 2020-03-04 16:31 UTC

Attachments
Same file with EMF that throws error (951.11 KB, application/gzip)
2020-03-04 16:31 UTC, Raman Gupta

Description Raman Gupta 2020-03-04 16:30:45 UTC
When JVM assertions are enabled via the `-ea` flag, the attached file throws an AssertionError during parsing. When running via Tika 1.23, the stack trace is:

Exception in thread "main" java.lang.AssertionError
        at org.apache.poi.hemf.record.emfplus.HemfPlusRecordIterator._next(HemfPlusRecordIterator.java:84)
        at org.apache.poi.hemf.record.emfplus.HemfPlusRecordIterator.next(HemfPlusRecordIterator.java:55)
        at org.apache.poi.hemf.record.emfplus.HemfPlusRecordIterator.next(HemfPlusRecordIterator.java:26)
        at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
        at org.apache.poi.hemf.record.emf.HemfComment$EmfCommentDataPlus.init(HemfComment.java:292)
        at org.apache.poi.hemf.record.emf.HemfComment$EmfCommentDataIterator._next(HemfComment.java:216)
        at org.apache.poi.hemf.record.emf.HemfComment$EmfCommentDataIterator.<init>(HemfComment.java:155)
        at org.apache.poi.hemf.record.emf.HemfComment$EmfComment.init(HemfComment.java:110)
        at org.apache.poi.hemf.record.emf.HemfRecordIterator._next(HemfRecordIterator.java:76)
        at org.apache.poi.hemf.record.emf.HemfRecordIterator.next(HemfRecordIterator.java:48)
        at org.apache.poi.hemf.record.emf.HemfRecordIterator.next(HemfRecordIterator.java:27)
        at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
        at org.apache.poi.hemf.usermodel.HemfPicture.getRecords(HemfPicture.java:78)
        at org.apache.poi.hemf.usermodel.HemfPicture.iterator(HemfPicture.java:91)
        at org.apache.tika.parser.microsoft.EMFParser.parse(EMFParser.java:80)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
        at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72)
        at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:104)
        at org.apache.tika.extractor.EmbeddedDocumentUtil.parseEmbedded(EmbeddedDocumentUtil.java:220)
        at org.apache.tika.parser.microsoft.AbstractPOIFSExtractor.handleEmbeddedResource(AbstractPOIFSExtractor.java:133)
        at org.apache.tika.parser.microsoft.AbstractPOIFSExtractor.handleEmbeddedResource(AbstractPOIFSExtractor.java:107)
        at org.apache.tika.parser.microsoft.AbstractPOIFSExtractor.handleEmbeddedResource(AbstractPOIFSExtractor.java:100)
        at org.apache.tika.parser.microsoft.WordExtractor.handlePictureCharacterRun(WordExtractor.java:561)
        at org.apache.tika.parser.microsoft.WordExtractor.handleParagraph(WordExtractor.java:365)
        at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:187)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:175)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:131)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
        at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:209)
        at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:496)
        at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:149)
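
For context on why the error only appears with `-ea`: Java `assert` statements are skipped entirely unless assertions are enabled for the class, so the same input parses without this error on a default JVM. The following is a minimal sketch of that behavior using only the standard library. The class name `AssertionStatusDemo` and the method `checkRecordSize` are hypothetical stand-ins for illustration; they are not POI code, and the invariant shown is not claimed to be the one checked at HemfPlusRecordIterator.java:84.

```java
public class AssertionStatusDemo {

    /** Returns true only if assertions are enabled for this class (e.g. the JVM was started with -ea). */
    static boolean assertionsEnabled() {
        boolean enabled = false;
        // The assignment inside the assert is evaluated only when assertions are on.
        assert enabled = true;
        return enabled;
    }

    /**
     * Hypothetical invariant check, standing in for an assert inside a parser.
     * With assertions disabled the mismatch goes unnoticed; with -ea it raises AssertionError.
     */
    static String checkRecordSize(long declaredSize, long bytesRead) {
        try {
            assert declaredSize == bytesRead
                : "declared " + declaredSize + " but read " + bytesRead;
            return "no error";
        } catch (AssertionError e) {
            return "AssertionError: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + assertionsEnabled());
        System.out.println(checkRecordSize(16, 12));
    }
}
```

Running this with `java AssertionStatusDemo` and again with `java -ea AssertionStatusDemo` shows the two behaviors side by side, which mirrors why this report is only reproducible when Tika is launched with `-ea`.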
Comment 1 Raman Gupta 2020-03-04 16:31:18 UTC
Created attachment 37061
Same file with EMF that throws error