Bug 56335

Summary: Could not extract text from embedded SmartArt graphic
Product: POI
Reporter: Christian Czech <c.czech>
Component: HSLF
Assignee: POI Developers List <dev>
Status: NEEDINFO ---
Severity: normal
CC: m.schiffer
Priority: P2    
Version: 3.10-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: error file

Description Christian Czech 2014-04-01 09:29:02 UTC
Created attachment 31463
error file


How can I extract SmartArt graphics/objects from a PPT file?


Comment 1 Michael Schiffer 2014-06-11 13:14:08 UTC
I am, so to speak, Christian's successor on the matter he raised with you in April of this year. We don't know whether Christian received an answer, but the problem still exists, so let me ask again: how can we extract SmartArt graphics/objects from a PPT file?
Thanks in advance.


Comment 2 Nick Burch 2014-06-11 13:52:16 UTC
I don't know enough about SmartArt to give you the answer, but I can tell you the approach you'll need to take.

What you'll probably need to do is look at the Microsoft Binary File Format documentation and see what kinds of records / resources SmartArt gets stored in. Next, try the hslf.dev tools to see whether POI already has read support for them. (If it's Escher / DDF based, we probably do; if not, we may not.) If read support is missing, add code for the required additional records.

Finally, once the appropriate records are identified and being read, write some high-level usermodel code to access them and return the interesting information, then submit a patch + unit tests here so we can roll it into POI!
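
As a rough illustration of the record-inspection step described above, here is a minimal sketch of walking a PowerPoint-style record stream by hand. It is not POI API: the class and method names (RecordWalker, walk, sample) are hypothetical, the sample bytes are synthetic, and the record type values in them are arbitrary placeholders. It assumes the 8-byte little-endian record header used by the binary PowerPoint and Escher formats (4-bit version, 12-bit instance, 16-bit type, 32-bit length, with version 0xF marking a container), and it assumes the bytes of the "PowerPoint Document" stream have already been pulled out of the OLE2 file.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of POI.
final class RecordWalker {
    static final int HEADER_SIZE = 8;

    /** Lists the records found in data[start, end), one line per record. */
    static void walk(byte[] data, int start, int end, int depth, List<String> out) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int pos = start;
        while (pos + HEADER_SIZE <= end) {
            int verAndInstance = buf.getShort(pos) & 0xFFFF;
            int recVer = verAndInstance & 0x0F;       // low 4 bits
            int recInstance = verAndInstance >>> 4;   // high 12 bits
            int recType = buf.getShort(pos + 2) & 0xFFFF;
            long recLen = buf.getInt(pos + 4) & 0xFFFFFFFFL;
            int bodyStart = pos + HEADER_SIZE;

            if (recLen > end - bodyStart) {
                // Length runs past the enclosing range: stop rather than guess.
                out.add(indent(depth) + "truncated record at offset " + pos);
                return;
            }
            boolean container = recVer == 0x0F;
            out.add(String.format("%stype=0x%04X instance=0x%03X len=%d%s",
                    indent(depth), recType, recInstance, recLen,
                    container ? " (container)" : ""));
            if (container) {
                // A container's body is itself a sequence of records.
                walk(data, bodyStart, bodyStart + (int) recLen, depth + 1, out);
            }
            pos = bodyStart + (int) recLen;
        }
    }

    private static String indent(int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        return sb.toString();
    }

    /** Synthetic data: one container holding one 4-byte atom. */
    static byte[] sample() {
        return new byte[] {
            0x0F, 0x00, (byte) 0xE8, 0x03, 0x0C, 0x00, 0x00, 0x00, // container, len 12
            0x10, 0x00, (byte) 0xA8, 0x0F, 0x04, 0x00, 0x00, 0x00, // atom, instance 1, len 4
            'T', 'e', 's', 't'                                     // atom body
        };
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        byte[] data = sample();
        walk(data, 0, data.length, 0, lines);
        for (String line : lines) {
            System.out.println(line);
        }
    }
}
```

Running a listing like this over a file with and without SmartArt, and comparing the record types that appear, is one way to narrow down which records need read support. The dumper classes in POI's hslf.dev package do the same kind of job against real files and are the better starting point in practice.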