Bug 53205

Summary: [PATCH] Fix some parsing errors and encoding issues
Product: POI
Reporter: Luke Quinane <luke>
Component: HDGF
Assignee: POI Developers List <dev>
Severity: normal    
Priority: P2    
Version: 3.9-dev   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: The patch to fix encoding problems.
Patch to fix 1.5 support and add tests.

Description Luke Quinane 2012-05-08 07:11:58 UTC
Created attachment 28741
The patch to fix encoding problems.

A small patch to fix some encoding problems with older Visio documents and handle errors more robustly.

I've signed an ICLA for a Derby patch previously.
Comment 1 Yegor Kozlov 2012-07-22 11:12:32 UTC
I finally had time to review your patch; thanks for your patience.

The patch looks good, but needs some work.

POI is compatible with JDK 1.5, but your patch isn't. The Chunk class constructs strings with String(byte[],int,int,java.nio.charset.Charset), which was introduced in JDK 1.6. The old code used StringUtil.getFromUnicodeLE(contents, startsAt, strLen), which assumed that the encoding is always UTF-16LE. I see that in your patch the encoding is either ASCII or UTF-16LE, depending on the chunk type. Can you write some unit tests that show this is really the case?
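
For illustration only (this is not the code from either attachment): one JDK 1.5-compatible way to decode with a per-chunk encoding is the older String(byte[],int,int,String) constructor, which takes a charset name instead of a Charset object. The class name ChunkStringDecoding, the method decode and the boolean utf16 flag below are hypothetical; how the patch actually picks the encoding from the chunk type is not shown in this report.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical helper, not part of POI: decodes a slice of a chunk's bytes
// without using any API newer than JDK 1.5.
final class ChunkStringDecoding {

    // utf16 == true  -> decode as UTF-16LE (what the old code always assumed)
    // utf16 == false -> decode as US-ASCII
    static String decode(byte[] contents, int startsAt, int length, boolean utf16) {
        String charsetName = utf16 ? "UTF-16LE" : "US-ASCII";
        try {
            // String(byte[], int, int, String) predates JDK 1.5,
            // unlike the Charset-based overload added in JDK 1.6.
            return new String(contents, startsAt, length, charsetName);
        } catch (UnsupportedEncodingException e) {
            // Both charsets are required on every Java platform,
            // so this branch should be unreachable in practice.
            throw new IllegalStateException("Missing charset " + charsetName);
        }
    }

    public static void main(String[] args) {
        byte[] ascii = { 'H', 'i' };
        byte[] utf16le = { 'H', 0, 'i', 0 };
        System.out.println(decode(ascii, 0, ascii.length, false));
        System.out.println(decode(utf16le, 0, utf16le.length, true));
    }
}
```

Both calls in main decode to the same two-character string, which is the kind of check a unit test for each chunk encoding could make.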

Comment 2 Luke Quinane 2012-07-25 05:34:42 UTC
Created attachment 29111
Patch to fix 1.5 support and add tests.

I think my changes in the latest patch should fix the JDK 1.5 support. I've also added two unit tests and a sample file.


Comment 3 Yegor Kozlov 2012-07-25 15:45:57 UTC
Patch applied in r1365638