Created attachment 28741 The patch to fix encoding problems. A small patch to fix some encoding problems with older Visio documents and handle errors more robustly. I've signed an ICLA for a Derby patch previously.
I finally had time to review your patch; thanks for your patience. The patch looks good, but it needs some work. POI is compatible with JDK 1.5, but your patch isn't: the Chunk class constructs strings with String(byte[],int,int,java.nio.charset.Charset), which was introduced in JDK 1.6. The old code used StringUtil.getFromUnicodeLE(contents, startsAt, strLen), which assumed that the encoding is always UTF-16LE. I see that in your patch the encoding is either ASCII or UTF-16LE depending on the chunk type. Can you write some unit tests that show this is really the case? Regards, Yegor
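For illustration, one JDK 1.5-compatible way to decode a chunk string as either ASCII or UTF-16LE is to pass the charset by name, since String(byte[],int,int,String) predates the Charset overload. This is only a sketch of the idea, not the code from the attached patch: the class name ChunkStringSketch, the decode method, and the boolean utf16 flag are hypothetical stand-ins for however the Chunk class actually picks the encoding.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical helper for illustration only; not part of POI or the patch.
final class ChunkStringSketch {

    // Decodes byteLen bytes starting at startsAt, as UTF-16LE or US-ASCII.
    // The utf16 flag is an assumed stand-in for the chunk-type check.
    static String decode(byte[] contents, int startsAt, int byteLen, boolean utf16) {
        String charsetName = utf16 ? "UTF-16LE" : "US-ASCII";
        try {
            // The charset-name overload is available on JDK 1.5,
            // unlike the java.nio.charset.Charset overload (JDK 1.6).
            return new String(contents, startsAt, byteLen, charsetName);
        } catch (UnsupportedEncodingException e) {
            // Both charsets are required on every Java platform,
            // so this should not happen in practice.
            throw new RuntimeException("Required charset missing: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        byte[] ascii = {'V', 'i', 's', 'i', 'o'};
        byte[] utf16le = {'V', 0, 'i', 0, 's', 0, 'i', 0, 'o', 0};
        System.out.println(decode(ascii, 0, ascii.length, false));
        System.out.println(decode(utf16le, 0, utf16le.length, true));
    }
}
```

Note that the length here is counted in bytes, so a UTF-16LE string of n characters in the Basic Multilingual Plane occupies 2n bytes.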
Created attachment 29111 Patch to fix 1.5 support and add tests. I think my changes in the latest patch should fix the JDK 1.5 support. I've also added two unit tests and a sample file. Cheers, Luke
Patch applied in r1365638. Thanks, Yegor