Bug 56335 - Could not extract text from embedded SmartArt graphic
Summary: Could not extract text from embedded SmartArt graphic
Status: NEEDINFO
Alias: None
Product: POI
Classification: Unclassified
Component: HSLF
Version: 3.10-FINAL
Hardware: PC All
Importance: P2 normal with 2 votes
Target Milestone: ---
Assignee: POI Developers List
Reported: 2014-04-01 09:29 UTC by Christian Czech
Modified: 2014-06-11 13:52 UTC

Attachments
error file (431.00 KB, application/vnd.ms-powerpoint)
2014-04-01 09:29 UTC, Christian Czech
Description Christian Czech 2014-04-01 09:29:02 UTC
Created attachment 31463
error file

Hi,

How can I extract SmartArt graphics/objects from a ppt file?

Regards

Christian
Comment 1 Michael Schiffer 2014-06-11 13:14:08 UTC
Hello,
I am, so to speak, Christian's successor in this matter, which he raised in April this year. We don't know whether Christian received any answer from you, but the problem still exists, so let me ask again: how can we extract SmartArt graphics/objects from a ppt file?
Thanks in advance
Michael
Comment 2 Nick Burch 2014-06-11 13:52:16 UTC
I don't know enough about SmartArt to know the answer, but I can tell you the approach you'll need.

What you'll probably need to do is look at the Microsoft Binary File Format documentation, and see what kinds of records / resources SmartArt gets stored in. Next, try the hslf.dev tools to see if POI already has read support for them. (If it's Escher / DDF based, we probably do; if not, we may not.) If read support is missing, add code for the required additional records.
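
To get a first impression of which record types a file contains, something like the following could help. It is only a rough sketch and not POI code: it assumes the raw bytes of the "PowerPoint Document" stream have already been pulled out of the OLE2 container (for example via POIFS), and it relies on the common 8-byte record header used by PPT and Escher records (4-bit version, 12-bit instance, 16-bit type, 32-bit length, little-endian, version 0xF marking a container). The class name `RecordWalker` and its methods are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of POI: tallies record types in a raw record stream.
class RecordWalker {

    // Walks the records in data[offset, offset + length) and counts each record type,
    // descending into container records (version nibble 0xF).
    static void walk(byte[] data, int offset, int length, Map<Integer, Integer> counts) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int pos = offset;
        int end = offset + length;
        while (pos + 8 <= end) {
            int verInst = buf.getShort(pos) & 0xFFFF;       // version (low 4 bits) + instance
            int type = buf.getShort(pos + 2) & 0xFFFF;      // record type
            long len = buf.getInt(pos + 4) & 0xFFFFFFFFL;   // payload length in bytes
            if (len > end - pos - 8) {
                break; // truncated data, or these bytes are not a record header
            }
            counts.merge(type, 1, Integer::sum);
            if ((verInst & 0x000F) == 0x000F) {
                walk(data, pos + 8, (int) len, counts);     // container: children follow the header
            }
            pos += 8 + (int) len;
        }
    }

    // Convenience wrapper: record type -> number of occurrences, sorted by type.
    static Map<Integer, Integer> tally(byte[] stream) {
        Map<Integer, Integer> counts = new TreeMap<>();
        walk(stream, 0, stream.length, counts);
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: args[0] is a file holding the already-extracted stream bytes.
        byte[] stream = Files.readAllBytes(Paths.get(args[0]));
        for (Map.Entry<Integer, Integer> e : tally(stream).entrySet()) {
            System.out.printf("type 0x%04X x %d%n", e.getKey(), e.getValue());
        }
    }
}
```

Comparing the tally for a presentation with SmartArt against one without it should point to the record types worth looking up in the format documentation.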

Finally, once the appropriate records are identified and being read, write some high-level usermodel code to access them and return the interesting information, then submit a patch + unit tests here so we can roll it into POI!