Created attachment 31341 [details]
Visio .vsd file

Exception in thread "main" java.lang.RuntimeException: TODO
    at org.apache.poi.hdgf.pointers.PointerFactory.createPointer(PointerFactory.java:45)
    at org.apache.poi.hdgf.HDGFDiagram.<init>(HDGFDiagram.java:99)
    at org.apache.poi.hdgf.HDGFDiagram.<init>(HDGFDiagram.java:60)

The attached Visio .vsd file causes a RuntimeException. The file can be viewed successfully in IE version 8.
Looks like you have an older v5 file, but HDGF currently only has v6 pointer support. Are you interested in helping to add support for this? It looks like vsdump has support for v5 pointers, and while we can't copy their code (vsdump is mostly GPL, plus it's in C!), we can use the output of vsdump to help debug and investigate the file and identify what's needed.
I don't mind having a go, but I am not sure my skills are up to it. What do you suggest? What would be the plan?
Created attachment 31343 [details] Slide1 vsd file format details - slide 1
Created attachment 31344 [details] Slide 2 vsd file format details - slide 2
The attachments are .vsd file format details (slides 1 & 2). Also, the Python tool ole-toy seems to be the best bet for analyzing any OLE file format, including .vsd; it was written to help reverse engineer .vsd for LibreOffice. See: http://libregraphicsworld.org/blog/entry/initial-support-for-visio-files-lands-to-libreoffice
You'll probably want to use POIFSDump and POIFSViewer to see the raw data in the pointers stream, get bits out to play with, etc.

Next up, try using vsdump to parse out the pointers from the test file.

In TestPointerFactory you'll see some examples of the raw bytes of some pointers, along with what they mean. We'll want to identify some pointer bytes and what they correspond to, then write some more unit test bits like that.

Finally, we'll want to add logic to the PointerFactory to decode them.

One other thing: both LibreOffice and vsdump are under incompatible licenses, so we can't take code from either of them. We can use them to debug, to analyse, to test, but not to borrow from!
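To make the "identify some pointer bytes, then decode them" step concrete, here is a rough stand-alone sketch using only the JDK. Everything in it is an assumption for illustration: the class name PointerSketch is hypothetical and not part of POI, the 18-byte little-endian layout (type, address, offset, length, format) is my understanding of how v6 pointers are laid out and should be checked against PointerFactory, and the sample bytes are invented rather than taken from the attached file. The v5 layout is exactly what still needs working out, so it is not guessed at here.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical stand-alone sketch; not part of POI. */
class PointerSketch {
    /** Assumed size of one v6-style pointer record, in bytes. */
    static final int V6_SIZE = 18;

    final int type;
    final long address;
    final long offset;
    final long length;
    final short format;

    PointerSketch(int type, long address, long offset, long length, short format) {
        this.type = type;
        this.address = address;
        this.offset = offset;
        this.length = length;
        this.format = format;
    }

    /**
     * Decodes one pointer starting at 'start', assuming the layout:
     * 4-byte type, 4-byte address, 4-byte offset, 4-byte length,
     * 2-byte format, all little-endian.
     */
    static PointerSketch decodeV6(byte[] data, int start) {
        if (start < 0 || data.length - start < V6_SIZE) {
            throw new IllegalArgumentException("Need " + V6_SIZE + " bytes from " + start);
        }
        ByteBuffer buf = ByteBuffer.wrap(data, start, V6_SIZE).order(ByteOrder.LITTLE_ENDIAN);
        int type = buf.getInt();
        long address = buf.getInt() & 0xFFFFFFFFL; // treat as unsigned 32-bit
        long offset = buf.getInt() & 0xFFFFFFFFL;
        long length = buf.getInt() & 0xFFFFFFFFL;
        short format = buf.getShort();
        return new PointerSketch(type, address, offset, length, format);
    }

    /** Formats 'count' bytes as hex, handy for pasting into a unit test. */
    static String toHex(byte[] data, int start, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = start; i < start + count; i++) {
            if (i > start) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Invented sample bytes, NOT taken from the attached .vsd file.
        byte[] raw = {
            0x14, 0x00, 0x00, 0x00, // type
            0x00, 0x02, 0x00, 0x00, // address
            0x36, 0x00, 0x00, 0x00, // offset
            0x10, 0x00, 0x00, 0x00, // length
            0x50, 0x00              // format
        };
        PointerSketch p = decodeV6(raw, 0);
        System.out.println(toHex(raw, 0, V6_SIZE));
        System.out.println("type=" + p.type + " address=" + p.address
                + " offset=" + p.offset + " length=" + p.length + " format=" + p.format);
    }
}
```

The idea would be to do the same for v5: dump the real bytes from the pointers stream, compare them with what vsdump reports for the same pointers, and only then write the layout down as a test plus the matching PointerFactory logic.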
The libvisio library has been changed to MPL v2 (on 31-01-2014). See http://cgit.freedesktop.org/libreoffice/contrib/libvisio. I believe that MPL v2 is compatible with the Apache Licence and hence we can use the code. Is this true?
We can depend on an MPLv2-licensed library, but we can't borrow code from one.