Bug 41071 - Will not extract text from Powerpoint TextBoxes
Summary: Will not extract text from Powerpoint TextBoxes
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: POI Overall
Version: 3.0-dev
Hardware: Other other
Importance: P1 critical with 7 votes
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-11-29 04:57 UTC by Bj
Modified: 2008-04-16 00:24 UTC



Attachments
Powerpoint file containing a TextBox. POI is not able to extract its content (through Nutch). (16.00 KB, application/vnd.ms-powerpoint)
2006-11-29 05:27 UTC, Bj

Description Bj 2006-11-29 04:57:18 UTC
I use POI through Nutch (a search engine).

For some PowerPoint documents, POI will not extract content from TextBox instances.
This leaves me with no content to index for search.

I included an example document that demonstrates this behavior.
Comment 1 Bj 2006-11-29 05:27:34 UTC
Created attachment 19198
Powerpoint file containing a TextBox. POI is not able to extract its content (through Nutch).

No content is extracted from this file, whereas I
Comment 2 Yegor Kozlov 2008-04-16 00:24:06 UTC
Fixed in POI 3.0.3

Yegor