Bug 46182

Summary: PowerPointExtractor immediately throws OutOfMemoryError
Product: POI
Reporter: Charlie Hubbard <charlie.hubbard>
Component: HSLF
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: normal    
Priority: P1    
Version: 3.2-dev   
Target Milestone: ---   
Hardware: PC   
OS: All   
URL: ftp://www.workgroupsolutions.com/pub/charlie/Presentation-Spani.ppt

Description Charlie Hubbard 2008-11-10 18:44:17 UTC
Extracting the text from a PowerPoint presentation throws an OutOfMemoryError immediately.  Increasing the heap size did not help: a 2 GB heap fails because the VM can't allocate that much space, and any smaller heap still ends in an OutOfMemoryError.

Here is the code being executed:

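import java.io.FileInputStream;

import org.apache.poi.hslf.extractor.PowerPointExtractor;

public class Test {
    public static void main(String[] args) throws Exception{
        try {
            // Open the presentation and pull out all of its text.
            PowerPointExtractor _extractor = new PowerPointExtractor(new FileInputStream("Presentation - Spani#113D94.ppt"));
            String _text = _extractor.getText();
            System.out.print(_text);
        } catch( Throwable e ) {
            // Catch Throwable so the OutOfMemoryError (an Error, not an Exception) is reported too.
            e.printStackTrace();
            System.out.println( Runtime.getRuntime().freeMemory() + " free out of " + Runtime.getRuntime().totalMemory() );
        }
    }
}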

Comment 1 Charlie Hubbard 2008-11-10 18:54:30 UTC
I would expect this file to parse correctly: the PowerPoint presentation is only 5 MB, so even 20 MB of RAM should be enough to parse it without a problem.  This file won't parse even if you give it 1 GB!  If it can't be parsed, I would expect a more predictable exception to be thrown rather than an OutOfMemoryError.
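
To illustrate what "a more predictable exception" could look like, here is a minimal sketch. It assumes the failure comes from a parser trusting an oversized length field in the file, which this report does not confirm. BoundedRecordReader, readRecord and MAX_RECORD_LENGTH are hypothetical names for illustration only and are not part of POI.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of POI: reads one length-prefixed record
// and rejects implausible lengths before allocating any memory.
class BoundedRecordReader {
    // Assumed upper bound for a single record; chosen arbitrarily for this sketch.
    static final int MAX_RECORD_LENGTH = 16 * 1024 * 1024;

    public static byte[] readRecord(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int declared = data.readInt(); // 4-byte big-endian length prefix
        if (declared < 0 || declared > MAX_RECORD_LENGTH) {
            // Fail with a checked exception instead of attempting a huge allocation.
            throw new IOException("Implausible record length: " + declared);
        }
        byte[] body = new byte[declared];
        data.readFully(body); // throws EOFException if the stream is truncated
        return body;
    }

    public static void main(String[] args) {
        // A corrupt header that declares a record of Integer.MAX_VALUE bytes.
        byte[] corrupt = {0x7f, (byte) 0xff, (byte) 0xff, (byte) 0xff, 1, 2, 3};
        try {
            readRecord(new ByteArrayInputStream(corrupt));
        } catch (IOException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

With a check like this, a caller can catch IOException and report the file as unreadable rather than having the whole VM run out of heap.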

Comment 2 Yegor Kozlov 2008-11-11 02:07:36 UTC
Fixed in r713009

Yegor