Bug 61300 - Very slow processing on corrupted file
Summary: Very slow processing on corrupted file
Status: NEW
Alias: None
Product: POI
Classification: Unclassified
Component: POIFS
Version: 3.17-dev
Hardware: PC All
Importance: P2 minor
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-07-14 12:37 UTC by Tim Allison
Modified: 2017-07-21 20:32 UTC
CC List: 0 users



Attachments
triggering file (60.50 KB, application/x-ole-storage)
2017-07-14 12:37 UTC, Tim Allison

Description Tim Allison 2017-07-14 12:37:05 UTC
Created attachment 35141: triggering file

I need to figure out whether this is a POIFS bug or a parseSummaries bug. It is triggered by a corrupted file.

Processing gets stuck at this location, in the IOUtils.copy loop shown after the stack trace:
	  at org.apache.poi.util.IOUtils.copy(IOUtils.java:296)
	  at org.apache.poi.util.IOUtils.peekFirstNBytes(IOUtils.java:64)
	  at org.apache.poi.hpsf.PropertySet.isPropertySetStream(PropertySet.java:393)
	  at org.apache.poi.hpsf.PropertySet.<init>(PropertySet.java:191)
	  at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaryEntryIfExists(SummaryExtractor.java:83)
	  at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaries(SummaryExtractor.java:73)

        while((count = inp.read(buff)) != -1) {
            if(count > 0) {
                out.write(buff, 0, count);
            }
        }

On the first iteration, the position (pos) in inp is 0, but it then goes negative on each subsequent iteration, and the loop keeps iterating for a very long time.
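
As a possible mitigation while the root cause is tracked down, the copy loop could be given an upper bound on the number of bytes it will accept. The following is only a minimal sketch under that assumption, not POI's actual code or fix: the class name BoundedCopy and the maxBytes parameter are hypothetical, and it assumes the runaway stream keeps returning data rather than returning 0 forever.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of POI: same loop as above, plus a byte limit.
final class BoundedCopy {
    private BoundedCopy() {}

    /**
     * Copies inp to out, failing fast once more than maxBytes have been read.
     * Returns the number of bytes copied.
     */
    static long copy(InputStream inp, OutputStream out, long maxBytes) throws IOException {
        byte[] buff = new byte[4096];
        long total = 0;
        int count;
        while ((count = inp.read(buff)) != -1) {
            if (count > 0) {
                total += count;
                // A corrupted stream that never reaches EOF trips this check
                // instead of looping for a very long time.
                if (total > maxBytes) {
                    throw new IOException("Stream exceeded limit of " + maxBytes + " bytes");
                }
                out.write(buff, 0, count);
            }
        }
        return total;
    }
}
```

A caller would pick maxBytes from something it already trusts, for example the size of the containing file, since a stream inside the file should not be larger than the file itself. This only caps the symptom; why pos goes negative still needs to be investigated.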

The triggering file is a corrupted copy of testEXCEL_embeddedPDF_windows.xls.