Bug 45563 - poi-3.5-beta1-20080718.jar - content from the slide notes of a 2003 ppt document is not extracted.
Status: RESOLVED WORKSFORME
Alias: None
Product: POI
Classification: Unclassified
Component: POI Overall
Version: unspecified
Hardware: PC Windows Server 2003
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Duplicates: 45546
Depends on:
Blocks:
 
Reported: 2008-08-05 07:46 UTC by xtrim
Modified: 2008-08-05 15:53 UTC (History)
CC List: 0 users

Attachments
Contains JUnit test class and documents used for testing. (253.15 KB, application/x-zip-compressed)
2008-08-05 07:46 UTC, xtrim

Description xtrim 2008-08-05 07:46:10 UTC
Created attachment 22386
Contains JUnit test class and documents used for testing.

Text in the slide notes of a PowerPoint 2003 document is not extracted.
The attachment contains the JUnit test class and the documents used for testing.
We expected the extracted text to include the words "testdoc" and "test phrase".

Notes on the attached documents:

- The document "SlideNote.ppt" contains the words "testdoc" and "test phrase" in the notes inserted at the end of the slides.

- "TestUnitPoi35Filter.java" is the JUnit class.
Comment 1 Nick Burch 2008-08-05 15:48:17 UTC
Notes text is not extracted by default. You need to set a flag to request it, or call one of the overloaded getText methods:

http://poi.apache.org/apidocs/org/apache/poi/hslf/extractor/PowerPointExtractor.html#setNotesByDefault(boolean)
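
For illustration, a minimal sketch of both approaches against the POI 3.x HSLF API from the Javadoc linked above. It assumes a POI 3.x jar is on the classpath. The class name SlideNotesExtraction and the default path are placeholders, and "SlideNote.ppt" refers to the attached test document. This is not the code from the attached JUnit class.

```java
import java.io.IOException;

import org.apache.poi.hslf.extractor.PowerPointExtractor;

// Hypothetical example class; not part of POI or of the attached test.
public class SlideNotesExtraction {

    public static void main(String[] args) throws IOException {
        // Assumption: the presentation path is passed as the first argument,
        // falling back to the attached test document's file name.
        String path = args.length > 0 ? args[0] : "SlideNote.ppt";

        PowerPointExtractor extractor = new PowerPointExtractor(path);

        // Approach 1: set the flag so the no-argument getText()
        // also returns the notes text.
        extractor.setNotesByDefault(true);
        String slidesAndNotes = extractor.getText();

        // Approach 2: use the overloaded getText(getSlideText, getNoteText)
        // to request only the notes text, regardless of the flag.
        String notesOnly = extractor.getText(false, true);

        System.out.println("Slides and notes:");
        System.out.println(slidesAndNotes);
        System.out.println("Notes only:");
        System.out.println(notesOnly);

        // Mirror the expectation from the report: both phrases should
        // appear once notes extraction has been requested.
        System.out.println("contains \"testdoc\": " + notesOnly.contains("testdoc"));
        System.out.println("contains \"test phrase\": " + notesOnly.contains("test phrase"));
    }
}
```

Either approach alone should be enough. The flag is convenient when the extractor is handed to code that only calls getText(), while the overload makes the choice explicit at the call site.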
Comment 2 Nick Burch 2008-08-05 15:53:06 UTC
*** Bug 45546 has been marked as a duplicate of this bug. ***