Bug 64132

Summary: Regression in handling UnknownEscherRecord when getting pictures in .doc files
Product: POI
Reporter: Tim Allison <tallison>
Component: HWPF
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: regression    
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Bug Depends on: 64036    
Bug Blocks:    
Attachments: Triggering file

Description Tim Allison 2020-02-10 18:02:57 UTC
Created attachment 36999
Triggering file

In our recent regression tests, six files threw new exceptions with the stack trace below.

I'm not sure of the best way to handle this.

java.lang.ClassCastException: class org.apache.poi.ddf.UnknownEscherRecord cannot be cast to class org.apache.poi.ddf.EscherBlipRecord (org.apache.poi.ddf.UnknownEscherRecord and org.apache.poi.ddf.EscherBlipRecord are in unnamed module of loader 'app')
	at org.apache.poi.ddf.EscherBSERecord.fillFields(EscherBSERecord.java:100)
	at org.apache.poi.hwpf.model.PICFAndOfficeArtData.<init>(PICFAndOfficeArtData.java:78)
	at org.apache.poi.hwpf.usermodel.Picture.<init>(Picture.java:112)
	at org.apache.poi.hwpf.model.PicturesTable.extractPicture(PicturesTable.java:162)
	at org.apache.poi.hwpf.model.PicturesTable.getAllPictures(PicturesTable.java:233)
	at org.apache.tika.parser.microsoft.WordExtractor$PicturesSource.<init>(WordExtractor.java:654)
	at org.apache.tika.parser.microsoft.WordExtractor$PicturesSource.<init>(WordExtractor.java:644)
	at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:173)
	at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:175)
	at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:131)
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)

Comment 1 Dominik Stadler 2021-01-03 21:32:18 UTC
Running "git bisect" identifies change r1873187 as being related to this regression.

Comment 2 Dominik Stadler 2021-01-04 05:51:53 UTC
Fixed via r1885092; these documents should now be parsed as they were before.
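
For readers hitting a similar ClassCastException, here is a minimal sketch of one common way to guard such a cast. It is not the change made in r1885092, which is not shown in this report. The classes below are hypothetical stand-ins that only mirror the names in the stack trace (EscherRecord, EscherBlipRecord, UnknownEscherRecord); they are not the real org.apache.poi.ddf types, and the helper names asBlipOrNull and collectPictures are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class BlipCastSketch {
    // Hypothetical stand-ins for the record types named in the stack trace.
    // They are NOT the real org.apache.poi.ddf classes.
    static class EscherRecord {}

    static class EscherBlipRecord extends EscherRecord {
        final byte[] pictureData;

        EscherBlipRecord(byte[] pictureData) {
            this.pictureData = pictureData;
        }
    }

    static class UnknownEscherRecord extends EscherRecord {}

    /** Returns the record as a blip, or null if it is any other record type. */
    static EscherBlipRecord asBlipOrNull(EscherRecord record) {
        // An unconditional cast here is what raises ClassCastException
        // when the record turns out to be an UnknownEscherRecord.
        return (record instanceof EscherBlipRecord) ? (EscherBlipRecord) record : null;
    }

    /** Collects picture payloads, skipping records that are not blips. */
    static List<byte[]> collectPictures(List<EscherRecord> records) {
        List<byte[]> pictures = new ArrayList<>();
        for (EscherRecord record : records) {
            EscherBlipRecord blip = asBlipOrNull(record);
            if (blip != null) {
                pictures.add(blip.pictureData);
            }
        }
        return pictures;
    }

    public static void main(String[] args) {
        List<EscherRecord> records = new ArrayList<>();
        records.add(new EscherBlipRecord(new byte[] {1, 2, 3}));
        records.add(new UnknownEscherRecord());
        // Only the blip record contributes a picture.
        System.out.println(collectPictures(records).size());
    }
}
```

Whether skipping the unknown record is the right behavior depends on the caller; a parser like the one in the stack trace might instead want to log the record or surface it in some other form.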