Created attachment 35141
triggering file

I need to figure out if this is a POIFS bug or a parseSummaries bug. This is triggered by a corrupted file.

At this location:

    at org.apache.poi.util.IOUtils.copy(IOUtils.java:296)
    at org.apache.poi.util.IOUtils.peekFirstNBytes(IOUtils.java:64)
    at org.apache.poi.hpsf.PropertySet.isPropertySetStream(PropertySet.java:393)
    at org.apache.poi.hpsf.PropertySet.<init>(PropertySet.java:191)
    at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaryEntryIfExists(SummaryExtractor.java:83)
    at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaries(SummaryExtractor.java:73)

    while ((count = inp.read(buff)) != -1) {
        if (count > 0) {
            out.write(buff, 0, count);
        }
    }

On the first iteration, the pos in inp is 0, but then the pos goes negative on each iteration, and this loop iterates for a very long time.

The source file that I corrupted is: testEXCEL_embeddedPDF_windows.xls
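For illustration only (this is not the change made in POI, and BoundedCopy, maxBytes and the endless stream are hypothetical names, not POI API): a minimal sketch of the same copy loop with an upper bound on the bytes read, so that a stream which never reports end of stream fails fast instead of iterating for a very long time.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class BoundedCopy {
    /**
     * Copies inp to out, but throws once more than maxBytes have been read.
     * maxBytes is an assumed, caller-chosen limit.
     */
    static long copy(InputStream inp, OutputStream out, long maxBytes) throws IOException {
        byte[] buff = new byte[4096];
        long total = 0;
        int count;
        while ((count = inp.read(buff)) != -1) {
            if (count > 0) {
                total += count;
                if (total > maxBytes) {
                    throw new IOException("Read more than " + maxBytes
                            + " bytes; the stream may be corrupt");
                }
                out.write(buff, 0, count);
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // A well-behaved stream is copied completely.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(new byte[] {1, 2, 3}), out, 10);
        System.out.println("copied " + copied + " bytes");

        // Hypothetical stand-in for a corrupt stream: it never returns -1.
        InputStream endless = new InputStream() {
            @Override
            public int read() {
                return 0;
            }
        };
        try {
            copy(endless, new ByteArrayOutputStream(), 10_000);
        } catch (IOException e) {
            System.out.println("stopped: " + e.getMessage());
        }
    }
}
```

The limit only caps the symptom. It does not explain why the position in the underlying stream goes negative, which is still the thing to track down.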
How can we reproduce this with POI alone? How is the document opened in Tika?
Dominik, I'm sorry for never responding. Yes, it looks like I could reproduce this in pure POI. Fixed in r1801989.