Summary: | [PATCH] Support for getting OLE object data from slide show | ||
---|---|---|---|
Product: | POI | Reporter: | Trejkaz (pen name) <trejkaz> |
Component: | HSLF | Assignee: | POI Developers List <dev> |
Status: | RESOLVED FIXED | ||
Severity: | enhancement | Keywords: | PatchAvailable |
Priority: | P2 | ||
Version: | 3.0-dev | ||
Target Milestone: | --- | ||
Hardware: | Other | ||
OS: | other | ||
Attachments: |
Proposed patch
ole2-embedding-2003.ppt TestOleEmbedding.java Reviewed patch |
Description
Trejkaz (pen name)
2007-08-29 21:38:22 UTC
Created attachment 20735 [details]
Proposed patch
Created attachment 20736 [details]
ole2-embedding-2003.ppt
Attaching my test file.
Created attachment 20737 [details]
TestOleEmbedding.java
Attaching a simple unit test.
Created attachment 20748 [details]
Reviewed patch
Various fixes to the patch, spotted through code review by a second person.
Patch applied. Thanks for it. However, I will close this bug only when I see unit tests for the low-level record classes:
src/scratchpad/src/org/apache/poi/hslf/record/ExOleObjStg.java
src/scratchpad/src/org/apache/poi/hslf/record/ExOleObjAtom.java
src/scratchpad/src/org/apache/poi/hslf/record/ExEmbedAtom.java
src/scratchpad/src/org/apache/poi/hslf/record/ExEmbed.java
A minimal unit test should take reference data from a ppt file and verify the getters/setters against it. See the unit tests in src/scratchpad/testcases/org/apache/poi/hslf/record and follow the pattern.
Ideas for further development:
(1) It should be possible to access the OLE object properties contained in the ExEmbed container. Did you figure out how to link ExOleObjStg and the corresponding ExEmbed? My guess is that the order of ExEmbed in Document.ExObjList corresponds to the order of ExOleObjStg records. That is, for the 1st ExOleObjStg we take the 1st Document.ExObjList.ExEmbed, etc. Any thoughts?
(2) I think OLE shapes should be instances of OLEObjectShape extends SimpleShape. User code may look something like this:
    Shape[] shape = slide.getShapes();
    for (int i = 0; i < shape.length; i++) {
        if (shape[i] instanceof OLEObjectShape) {
            OLEObjectShape obj = (OLEObjectShape) shape[i];
            ObjectData data = obj.getObjectData();
            // should be able to access object properties
            String clipboardName = obj.getClipboardName();
            if (clipboardName.equals("Microsoft Office Excel Worksheet")) {
                // do something with the data
            }
        }
    }
(3) Can we construct a workbook or a presentation given the binary data retrieved from ObjectData.getData()?
Regards,
Yegor
I'll see if I can get some time to make unit tests for those. I'm not sure if I actually implemented setters, so in theory I would only need to check that the getters return the expected values. The only problem is that I'm currently stuck working on something more important (I do all this stuff during office hours, so I can't easily choose to work on unit testing something we don't use ourselves, since we only actually ended up using the method to get all the objects...)
As for the other suggestions...
(1) I'm pretty sure you're right about the IDs, since the IDs are small numbers and there is no other obvious way for them to be referenced. The OLE properties are inside the storage itself, so any API that goes as far as to allow access to them would need to load the whole filesystem from there.
(2) Hmm... might not be a bad idea. I was just following the code done for pictures for this stuff, but in the event where the same object is embedded twice (I'm sure it must be possible since they're referenced by ID) this would allow us to see where they are and eventually render them perhaps. Although with regard to rendering, what I have found is that in the document there is also an EMF snapshot of the OLE object embedded as an ordinary picture. So it may be that it already renders properly if anyone has written a renderer...
(3) Yep. Passing that InputStream straight into a POIFSFileSystem results in a working filesystem, which can then be passed into whatever constructor is needed (although what we're doing is writing it to a temporary location first, so that we can potentially read from it multiple times without having to re-get the input stream.) However, what I have noticed is that in some cases, saving the InputStream to a file doesn't allow the file to be opened in the actual Office application, even if POI's classes have no problems accessing the contents.
> I'm currently stuck working on something more important
No rush. Just put it in your TODO list.
Yegor
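Editor's note on point (3) above: the reply mentions writing the object's stream to a temporary location so it can be read more than once. A minimal sketch of that idea follows, using only the JDK. The class and method names (OleSpool, spool, looksLikeOle2) are made up for illustration and are not part of POI or of the patch. The signature check relies on the standard eight-byte OLE2 compound-document header; it only confirms that the spooled data starts like an OLE2 file and says nothing about whether an Office application will open it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

public class OleSpool {
    // Signature found at the start of every OLE2 compound document.
    static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Hypothetical helper: copy a one-shot stream to a temp file so that
    // the data can be opened again as often as needed.
    static Path spool(InputStream in) {
        try {
            Path tmp = Files.createTempFile("ole-object-", ".bin");
            tmp.toFile().deleteOnExit();
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: true if the file begins with the OLE2 signature.
    static boolean looksLikeOle2(Path file) {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = new byte[OLE2_MAGIC.length];
            int n = 0;
            while (n < head.length) {
                int r = in.read(head, n, head.length - n);
                if (r < 0) {
                    break; // file is shorter than the signature
                }
                n += r;
            }
            return n == head.length && Arrays.equals(head, OLE2_MAGIC);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the stream returned by ObjectData.getData().
        byte[] fake = Arrays.copyOf(OLE2_MAGIC, 512);
        Path tmp = spool(new ByteArrayInputStream(fake));
        // The temp file can be opened repeatedly, unlike the original stream.
        System.out.println(looksLikeOle2(tmp));
        System.out.println(looksLikeOle2(tmp));
    }
}
```

In real use, the path returned by spool would be opened with Files.newInputStream each time a fresh stream is needed, for example once per consumer.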
Finally I can resolve it. I implemented OLEShape which extends Picture and can be used to retrieve the OLE data and some basic properties (progID, short and full names). The usage is something like this:
    Shape[] shape = slide.getShapes();
    for (int i = 0; i < shape.length; i++) {
        if (shape[i] instanceof OLEShape) {
            OLEShape ole = (OLEShape) shape[i];
            ObjectData data = ole.getObjectData();
            String name = ole.getInstanceName();
            if ("Worksheet".equals(name)) {
                HSSFWorkbook wb = new HSSFWorkbook(data.getData());
            } else if ("Document".equals(name)) {
                HWPFDocument doc = new HWPFDocument(data.getData());
            }
        }
    }
Regards,
Yegor |
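Editor's note: the if/else chain on the instance name in the usage above could also be written as a lookup table once more object types need handling. The sketch below shows only that dispatch pattern with JDK classes. OleDispatch, HANDLERS and handle are hypothetical names, the handlers take a byte array as a stand-in for the object data, and nothing here calls POI. The two keys are the instance names quoted in the comment above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class OleDispatch {
    // Hypothetical handlers keyed by instance name. In real code each one
    // would build the matching POI document from the object's data.
    static final Map<String, Function<byte[], String>> HANDLERS = new LinkedHashMap<>();
    static {
        HANDLERS.put("Worksheet", data -> "spreadsheet, " + data.length + " bytes");
        HANDLERS.put("Document", data -> "text document, " + data.length + " bytes");
    }

    // Looks up the handler for the given name; unknown or null names fall
    // through to a default instead of throwing.
    static String handle(String instanceName, byte[] data) {
        Function<byte[], String> handler = HANDLERS.get(instanceName);
        return handler == null ? "unhandled: " + instanceName : handler.apply(data);
    }

    public static void main(String[] args) {
        System.out.println(handle("Worksheet", new byte[3]));
        System.out.println(handle("Chart", new byte[0]));
    }
}
```

A table like this keeps the per-type logic in one place, at the cost of an extra indirection compared with the original two-branch check.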