45541 – Content from the header and footer of an Office 2007 pptx document is not extracted.

Status:	RESOLVED WORKSFORME

Alias:	None

Product:	POI
Classification:	Unclassified
Component:	XSLF
Version:	unspecified
Hardware:	PC Windows Server 2003

Importance:	P2 normal
Target Milestone:	---
Assignee:	POI Developers List

URL:
Keywords:

Depends on:
Blocks:

Reported:	2008-08-04 08:42 UTC by xtrim
Modified:	2015-09-28 19:46 UTC
CC List:	0 users

Attachments
Contains JUnit test class and documents used for testing. (564.61 KB, application/x-zip-compressed) 2008-08-04 08:43 UTC, xtrim

Description xtrim 2008-08-04 08:42:29 UTC

The text contained in the header and footer of a PowerPoint 2007 document is not extracted.
The attachment contains the JUnit test class and the documents used for testing.
We expected the word "testdoc" to be extracted.

Notes on the attached documents:

- The document "Header_1.pptx" contains the word "testdoc" in the header.

- The document "Footer_1.pptx" contains the word "testdoc" in the footer.


"TestUnitPoi35Filter.java" is the JUnit test class.

Comment 1 xtrim 2008-08-04 08:43:29 UTC

Created attachment 22365 [details]
Contains JUnit test class and documents used for testing.

The attachment is a ZIP file.

Comment 2 Dominik Stadler 2015-09-28 19:45:08 UTC

At least in the latest version, 3.13, this works fine if you request the notes and master data through the extended getText() overloads that take boolean parameters. You will need to cast the extractor to XSLFPowerPointExtractor for these overloads to be available.

I will add a verifying unit test after SVN stops choking on me...

Comment 3 Dominik Stadler 2015-09-28 19:46:06 UTC

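For example, use something like

       text = ((XSLFPowerPointExtractor)extr).getText(false, true);

to get the notes data, and

       text = ((XSLFPowerPointExtractor)extr).getText(false, false, true);

to get the data from the master slide. The boolean arguments select, in order, the slide text, the notes text and (in the three-argument form) the master text.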
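
If it is unclear which of these flags a given document needs, it may help to check which part of the .pptx package actually holds the expected word. The following is only a rough sketch of such a check. It does not use POI at all, just java.util.zip, on the basis that a .pptx file is a ZIP package of XML parts. The class name PptxPartScan and the method partsContaining are made up for this illustration. It does a plain substring match on the raw XML, so it will miss a word that PowerPoint has split across several text runs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of POI: lists the XML parts of a .pptx
// package whose raw content contains a given word.
public class PptxPartScan {

    static List<String> partsContaining(byte[] pptxBytes, String word) throws IOException {
        List<String> hits = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(pptxBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Only XML parts can carry text; skip media, directories, etc.
                if (entry.isDirectory() || !entry.getName().endsWith(".xml")) {
                    continue;
                }
                // Read the current entry fully into memory.
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                // Assumes UTF-8 encoded parts; plain substring match on the raw XML.
                String xml = new String(buf.toByteArray(), StandardCharsets.UTF_8);
                if (xml.contains(word)) {
                    hits.add(entry.getName());
                }
            }
        }
        return hits;
    }

    // Usage: java PptxPartScan Footer_1.pptx testdoc
    public static void main(String[] args) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        for (String name : partsContaining(bytes, args[1])) {
            System.out.println(name);
        }
    }
}
```

If the word only shows up in a part whose name points at a master or a notes slide rather than a regular slide, that would suggest the plain getText() call is not enough and one of the calls above is needed.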