Bug 45552 - Content from a document (docx, xlsx, or pptx) linked to a 2007 pptx document is not extracted.
Summary: Content from a document (docx, xlsx, or pptx) linked to a 2007 pptx document ...
Status: RESOLVED WONTFIX
Alias: None
Product: POI
Classification: Unclassified
Component: POI Overall
Version: unspecified
Hardware: PC Windows Server 2003
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-08-05 05:33 UTC by xtrim
Modified: 2016-04-10 11:43 UTC
CC List: 0 users



Attachments
Contains JUnit test class and documents used for testing. (662.07 KB, application/x-zip-compressed)
2008-08-05 05:33 UTC, xtrim

Description xtrim 2008-08-05 05:33:43 UTC
Created attachment 22375
Contains JUnit test class and documents used for testing.

Text contained in a document that is linked from a PowerPoint 2007 (pptx) document is not extracted.
The attachment contains the JUnit test class and the documents used for testing.
We expected the word "testdoc" to be extracted.

Notes on the attached documents:

- "ContentLinkedObject_word.pptx" links to a docx document that contains the word "testdoc".
- "ContentLinkedObject_excel.pptx" links to an xlsx document that contains the word "testdoc".
- "ContentLinkedObject_ppt.pptx" links to a pptx document that contains the word "testdoc".

"TestUnitPoi35Filter.java" is the JUnit test class.
Comment 1 Dominik Stadler 2016-04-10 11:43:00 UTC
As far as I can see with LibreOffice, these are actually hyperlinks to local files, not embedded documents, so I don't think POI should try to extract text from them by default. At most this would be an advanced option to enable, but even then it is likely better done in user space: iterate over the shapes, check whether they have hyperlinks, and then try to open the documents those hyperlinks point to.

For now, I don't think we will work on this in POI itself unless someone proposes a patch together with proper unit-test coverage.
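
A minimal sketch of the user-space approach, under stated assumptions: instead of walking shapes through POI's slide API, it uses only the JDK to read the slide relationship parts (ppt/slides/_rels/*.rels) inside the pptx package and collects every relationship marked TargetMode="External". That is a swapped-in technique, not what POI does internally. The class name LinkedTargets and the method externalTargets are hypothetical names chosen for illustration, and whether the attached test files store their links this way has not been verified here.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: lists external link targets referenced by the slides of a pptx.
public class LinkedTargets {
    // Namespace of OPC relationship parts (*.rels).
    static final String RELS_NS =
            "http://schemas.openxmlformats.org/package/2006/relationships";

    public static List<String> externalTargets(InputStream pptx) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Refuse DOCTYPE declarations so untrusted files cannot trigger XXE.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        List<String> targets = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(pptx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                // Only the per-slide relationship parts are of interest here.
                if (!name.startsWith("ppt/slides/_rels/") || !name.endsWith(".rels")) {
                    continue;
                }
                // Buffer the entry so the XML parser cannot close the zip stream.
                byte[] xml = zip.readAllBytes();
                NodeList rels = builder.parse(new ByteArrayInputStream(xml))
                        .getElementsByTagNameNS(RELS_NS, "Relationship");
                for (int i = 0; i < rels.getLength(); i++) {
                    Element rel = (Element) rels.item(i);
                    // External targets point outside the package (files, URLs).
                    if ("External".equals(rel.getAttribute("TargetMode"))) {
                        targets.add(rel.getAttribute("Target"));
                    }
                }
            }
        }
        return targets;
    }
}
```

Each returned target could then be resolved against the location of the presentation and, if it names a docx, xlsx, or pptx file that the caller is allowed to read, passed to the matching text extractor. Keeping that decision in the calling code avoids having the library open arbitrary local files by default.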