Extracting the text from a PowerPoint presentation throws an OutOfMemoryError immediately. Increasing the heap size did not help: a 2 GB VM fails to start because it cannot allocate that much space, and any smaller heap still ends in an OutOfMemoryError. This is the code being executed:

    import java.io.FileInputStream;

    import org.apache.poi.hslf.extractor.PowerPointExtractor;

    public class Test {
        public static void main(String[] args) throws Exception {
            try {
                // Open the problem file and pull out all of its text.
                PowerPointExtractor _extractor = new PowerPointExtractor(
                        new FileInputStream("Presentation - Spani#113D94.ppt"));
                String _text = _extractor.getText();
                System.out.print(_text);
            } catch (Throwable e) {
                // Throwable (not Exception) so that OutOfMemoryError is caught too.
                e.printStackTrace();
                System.out.println(Runtime.getRuntime().freeMemory() + " free out of "
                        + Runtime.getRuntime().totalMemory());
            }
        }
    }
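When checking whether a heap increase actually took effect, it may help to print `Runtime.maxMemory()` as well, since `totalMemory()` only reports what the JVM has reserved so far. The following is a minimal sketch; the class name `HeapReport` and its `describe` method are hypothetical and not part of the original report or of POI.

```java
// Hypothetical diagnostic helper; not part of the original report or of Apache POI.
final class HeapReport {
    private HeapReport() {}

    // Formats the JVM heap ceiling, the currently reserved heap, and the free part of it.
    static String describe(Runtime rt) {
        long mb = 1024L * 1024L;
        return "max=" + (rt.maxMemory() / mb) + "MB"
                + ", total=" + (rt.totalMemory() / mb) + "MB"
                + ", free=" + (rt.freeMemory() / mb) + "MB";
    }

    public static void main(String[] args) {
        System.out.println(describe(Runtime.getRuntime()));
    }
}
```

Running this with the same heap option as the failing test (for example `java -Xmx1g HeapReport`) should show whether the requested limit is really in force.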
The expectation is that the file parses correctly: the presentation is only 5 MB, so even 20 MB of RAM should be enough to parse it. Instead, the file fails to parse even with a 1 GB heap. If the file cannot be parsed, I would expect a more predictable exception to be thrown rather than an OutOfMemoryError.
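One common way a parser can fail predictably is to validate a declared record length before allocating a buffer for it. The sketch below illustrates that idea on a made-up format (a 4-byte big-endian length followed by the payload). It is an assumption for illustration only: `BoundedRecordReader` and `maxLength` are hypothetical names, the format is not the PowerPoint record layout, and this is not necessarily the cause of this bug or how it was fixed.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for a toy length-prefixed format; not Apache POI code.
final class BoundedRecordReader {
    private BoundedRecordReader() {}

    // Reads a 4-byte big-endian length, then that many payload bytes.
    // A negative or oversized length is rejected before any buffer is allocated.
    static byte[] readRecord(InputStream in, int maxLength) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int declared = data.readInt();
        if (declared < 0 || declared > maxLength) {
            throw new IOException("Corrupt record: declared length " + declared
                    + " is outside the allowed range 0.." + maxLength);
        }
        byte[] payload = new byte[declared];
        data.readFully(payload); // throws EOFException if the stream is truncated
        return payload;
    }

    public static void main(String[] args) {
        // Header claims about 2 GB of payload, but the stream holds only the header.
        byte[] corrupt = {0x7f, (byte) 0xff, (byte) 0xff, (byte) 0xff};
        try {
            readRecord(new ByteArrayInputStream(corrupt), 16 * 1024 * 1024);
        } catch (IOException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

With a check like this, a corrupt length field surfaces as an IOException that the caller can handle, instead of an allocation attempt that exhausts the heap.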
Fixed in r713009.

Yegor