Created attachment 22384: contains the JUnit test class and the document used for testing. Text contained in a SmartArt object inserted or created in a Word 2007 document is not extracted. The attachment holds the JUnit test class, "TestUnitPoi35Filter.java", and the test document, "classic_TextInSmartArt.docx", whose SmartArt objects contain the words "List1", "process2", "Cycle1", "Pyramid1", and "relationship3". We expected these words to be extracted.
I'm not sure if we want to be going that far down into graphics objects by default. If you'd like to submit a patch to extract the text, along with a flag to toggle the behaviour on/off, I'll happily apply it to svn :)
(In reply to comment #1)
> I'm not sure if we want to be going that far down into graphics objects by
> default.
> If you'd like to submit a patch to extract the text, along with a flag to
> toggle the behaviour on/off, I'll happily apply it to svn :)

Hi,

Thanks for your comment. I don't think SmartArt objects are really graphics objects: they are formatted objects that let the user enter text. SmartArt objects were called "diagrams" in Office 2003, and text inserted in a diagram in an Office 2003 Word document is extracted properly. Can we hope this text will be extracted in the future?

Regards,
Bénédicte
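
For anyone looking into a patch, here is a minimal sketch of where the SmartArt text appears to live. It is not POI API and not the attached test class: the class name SmartArtTextSketch and the method extract are hypothetical, and it assumes Word stores the diagram data in parts named word/diagrams/data*.xml, with the visible text in DrawingML <a:t> elements. A real fix would presumably follow the package relationships instead of matching part names.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of POI: reads a .docx as a plain zip archive
// and collects the text runs found in the SmartArt diagram data parts.
public class SmartArtTextSketch {

    // DrawingML main namespace, the one the "a:" prefix is bound to.
    static final String DRAWINGML_NS =
            "http://schemas.openxmlformats.org/drawingml/2006/main";

    public static List<String> extract(InputStream docx) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);

        List<String> texts = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                // Assumption: diagram data parts are named word/diagrams/data*.xml.
                if (!name.startsWith("word/diagrams/data") || !name.endsWith(".xml")) {
                    continue;
                }
                // Buffer the entry so the XML parser cannot close the zip stream.
                byte[] xml = zip.readAllBytes();
                Document doc = factory.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(xml));
                // Every <a:t> element holds one run of text.
                NodeList runs = doc.getElementsByTagNameNS(DRAWINGML_NS, "t");
                for (int i = 0; i < runs.getLength(); i++) {
                    texts.add(runs.item(i).getTextContent());
                }
            }
        }
        return texts;
    }
}
```

Calling SmartArtTextSketch.extract(new FileInputStream("classic_TextInSmartArt.docx")) should, if the assumptions above hold for that file, return the words the test expects. A flag to switch this on or off, as requested in comment #1, could simply guard the call.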