Bug 41071

Summary: Will not extract text from Powerpoint TextBoxes
Product: POI
Reporter: Bj <bjorn.wang>
Component: POI Overall
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: critical    
Priority: P1    
Version: 3.0-dev   
Target Milestone: ---   
Hardware: Other   
OS: other   
Attachments: Powerpoint file containing a TextBox. POI is not able to extract its content (through Nutch).

Description Bj 2006-11-29 04:57:18 UTC
I use POI through Nutch (search engine).

For some Powerpoint documents, POI will not extract content from TextBox instances.
This leaves me with no content to index for search.

I included an example document that demonstrates this behavior.
Comment 1 Bj 2006-11-29 05:27:34 UTC
Created attachment 19198
Powerpoint file containing a TextBox. POI is not able to extract its content (through Nutch).

No content is extracted from this file, whereas I
Comment 2 Yegor Kozlov 2008-04-16 00:24:06 UTC
Fixed in POI 3.0.3

Yegor