Bug 51678

Summary: Extracting text from Bug51524.zip is slow
Product: POI
Reporter: Antoni Mylka <antoni.mylka>
Component: HWPF
Assignee: POI Developers List <dev>
Status: VERIFIED FIXED
Severity: normal
Priority: P2
Version: 3.8-dev
Target Milestone: ---
Hardware: PC
OS: All

Description Antoni Mylka 2011-08-18 13:18:11 UTC
The fix for bug 51524 solved the problem of the slow constructor; it now takes 2 seconds on my machine. Extracting any text from that document is still very slow, though:

HWPFDocument d = HWPFTestDataSamples.openSampleFileFromArchive( "Bug51524.zip" );
WordExtractor e = new WordExtractor(d);
e.getText();

It seems to spend 99.99% of its time in org.apache.poi.hwpf.usermodel.Range.findRange(). I don't know whether anything can be done about it.
Comment 1 Sergey Vladimirov 2011-08-18 14:29:57 UTC
4 seconds in trunk now (including constructor)
Comment 2 Antoni Mylka 2011-08-18 15:11:25 UTC
You're fast. I had a 90%-working binary search implementation myself, after 4 hours. Gotta seriously brush up on my TopCoder skills.

Thanks very much anyway.
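
For anyone reading this later, here is a minimal sketch of the kind of binary search discussed above. It is not the change that went into trunk, and every name in it (RangeSearchSketch, Span, firstSpanEndingAfter, findOverlapping) is hypothetical. It assumes the character runs are kept sorted by start offset and do not overlap, so the first run touching a given offset can be found in O(log n) instead of by scanning from the beginning on every lookup.

```java
import java.util.List;

/** Hypothetical sketch only; this is not the actual POI implementation. */
final class RangeSearchSketch {

    /** A half-open run of characters [start, end); a stand-in for a text run. */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Returns the index of the first span whose end is greater than offset,
     * or spans.size() if there is none. Assumes spans are sorted by start
     * and do not overlap, which means the end offsets are sorted as well.
     */
    static int firstSpanEndingAfter(List<Span> spans, int offset) {
        int lo = 0;
        int hi = spans.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1; // unsigned shift avoids int overflow
            if (spans.get(mid).end <= offset) {
                lo = mid + 1; // span lies entirely before offset
            } else {
                hi = mid; // span may be the answer, keep it in range
            }
        }
        return lo;
    }

    /**
     * Returns {firstIndex, lastIndexExclusive} of the spans overlapping
     * [start, end). The first index comes from the binary search; the last
     * one is found by walking forward over the overlapping spans only.
     */
    static int[] findOverlapping(List<Span> spans, int start, int end) {
        int first = firstSpanEndingAfter(spans, start);
        int last = first;
        while (last < spans.size() && spans.get(last).start < end) {
            last++;
        }
        return new int[] { first, last };
    }
}
```

With spans [0,5), [5,10) and [10,20), a query for [5,12) yields indexes 1 to 3 (exclusive), and a query starting at 20 yields an empty result. The forward walk costs time proportional to the number of matching spans; a second binary search on the start offsets could replace it if queries routinely cover many spans.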