Bug 47008 - NullPointerException when calling getText() from TextExtractor
Summary: NullPointerException when calling getText() from TextExtractor
Status: RESOLVED INVALID
Alias: None
Product: POI
Classification: Unclassified
Component: POI Overall
Version: 3.5-dev
Hardware: PC Windows Vista
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-04-09 06:48 UTC by Tiago
Modified: 2009-06-19 07:23 UTC

Attachments
The ppt that contains a shape (throws NullPointerException) (101.00 KB, application/vnd.ms-powerpoint)
2009-04-09 06:48 UTC, Tiago
The ppt that doesn't contain a shape (all OK) (99.50 KB, application/vnd.ms-powerpoint)
2009-04-09 06:49 UTC, Tiago
The pptx that contains a shape and runs OK (33.53 KB, application/vnd.openxmlformats-officedocument.presentationml.presentation)
2009-04-09 06:50 UTC, Tiago
Test case (520 bytes, text/x-java-source)
2009-04-09 07:08 UTC, Tiago

Description Tiago 2009-04-09 06:48:02 UTC
Created attachment 23471 [details]
The ppt that contains a shape (throws NullPointerException)

This problem happens when extracting text from a .ppt file.

The problem appears to lie in extracting text from shapes: I ran one simple test with a .ppt containing a single shape with text (test1) and another with a .ppt that has the same text but no shape (test2).

In test1 we get a NullPointerException (at org.apache.poi.hslf.model.SimpleShape.getClientRecords(SimpleShape.java:322)), and in test2 we get the desired text.


In addition, when the test1 presentation is saved as a .pptx file and a third test is run against that file, we get the desired result.
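
A possible explanation for the .pptx result is that the two files are different container formats and so are presumably parsed by different code: a .ppt is an OLE2 binary file (the stack trace above is in org.apache.poi.hslf), while a .pptx is a ZIP-based OOXML package. The sketch below only illustrates how the two formats can be told apart by their leading bytes. It uses the standard library alone, and the class and method names (PresentationFormatSniffer, sniff) are hypothetical, not part of POI.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class PresentationFormatSniffer {
    // Signature found at the start of every OLE2 compound document (.ppt, .doc, .xls).
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    // Local file header signature found at the start of a ZIP archive (.pptx, .docx, .xlsx).
    private static final int[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    /** Returns "OLE2", "ZIP" or "UNKNOWN" depending on the first bytes of the stream. */
    public static String sniff(InputStream in) throws IOException {
        byte[] header = new byte[8];
        int read = 0;
        // A single read() may return fewer bytes than requested, so loop until full or EOF.
        while (read < header.length) {
            int n = in.read(header, read, header.length - read);
            if (n < 0) {
                break;
            }
            read += n;
        }
        if (startsWith(header, read, OLE2_MAGIC)) {
            return "OLE2";
        }
        if (startsWith(header, read, ZIP_MAGIC)) {
            return "ZIP";
        }
        return "UNKNOWN";
    }

    private static boolean startsWith(byte[] data, int length, int[] magic) {
        if (length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            // Mask to compare as unsigned values, since Java bytes are signed.
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java PresentationFormatSniffer <file> [<file> ...]
        for (String path : args) {
            try (InputStream in = new FileInputStream(path)) {
                System.out.println(path + ": " + sniff(in));
            }
        }
    }
}
```

Running this against the attached files should report the failing .ppt as OLE2 and the working .pptx as ZIP, which would be consistent with the shape-handling problem being specific to the binary-format code path.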
Comment 1 Tiago 2009-04-09 06:49:49 UTC
Created attachment 23472 [details]
The ppt that doesn't contain a shape (all OK)
Comment 2 Tiago 2009-04-09 06:50:41 UTC
Created attachment 23473 [details]
The pptx that contains a shape and runs OK
Comment 3 Tiago 2009-04-09 07:08:44 UTC
Created attachment 23474 [details]
Test case

Test case: pointing out the .ppt that throws NullPointerException

Referenced libraries are:

poi-3.5-beta5-20090219.jar
poi-contrib-3.5-beta5-20090219.jar
poi-ooxml-3.5-beta5-20090219.jar
poi-scratchpad-3.5-beta5-20090219.jar
log4j.jar
dom4j.jar
ooxml-schemas-1.0.jar
Comment 4 Yegor Kozlov 2009-06-19 07:23:42 UTC
The problem is not reproducible with current trunk. The test case successfully runs against all the attached files.

Yegor