Bug 53205 - [PATCH] Fix some parsing errors and encoding issues
Summary: [PATCH] Fix some parsing errors and encoding issues
Alias: None
Product: POI
Classification: Unclassified
Component: HDGF
Version: 3.9-dev
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
Depends on:
Reported: 2012-05-08 07:11 UTC by Luke Quinane
Modified: 2012-07-25 15:45 UTC
CC List: 0 users

The patch to fix encoding problems. (9.11 KB, application/octet-stream)
2012-05-08 07:11 UTC, Luke Quinane
Patch to fix 1.5 support and add tests. (31.73 KB, application/octet-stream)
2012-07-25 05:34 UTC, Luke Quinane

Description Luke Quinane 2012-05-08 07:11:58 UTC
Created attachment 28741
The patch to fix encoding problems.

A small patch to fix some encoding problems with older Visio documents and handle errors more robustly.

I've signed an ICLA for a Derby patch previously.
Comment 1 Yegor Kozlov 2012-07-22 11:12:32 UTC
Finally I had time to review your patch, thanks for your patience.

The patch looks good, but needs some work.

POI is compatible with JDK 1.5, but your patch isn't. The Chunk class constructs strings with String(byte[],int,int,java.nio.charset.Charset), which was introduced in JDK 1.6. The old code used StringUtil.getFromUnicodeLE(contents, startsAt, strLen), which assumed that the encoding is always UTF-16LE. I see that in your patch the encoding is either ASCII or UTF-16LE, depending on the chunk type. Can you write some unit tests that show this is really the case?
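
For illustration only, here is a minimal sketch of one way to decode such bytes on JDK 1.5. It is not the code from either attachment. The class ChunkStringDecoder and its decode method are hypothetical names, and the length parameter is assumed to be a byte count. The sketch relies on String(byte[],int,int,String), which takes a charset name and predates JDK 1.5, instead of the Charset overload from JDK 1.6.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical helper for illustration; not part of POI or of the attached patch.
class ChunkStringDecoder {

    // Decodes byteLen bytes of contents, starting at startsAt, using the named charset.
    // charsetName would be "US-ASCII" or "UTF-16LE", chosen by the caller per chunk type.
    static String decode(byte[] contents, int startsAt, int byteLen, String charsetName) {
        try {
            // This overload takes a charset name rather than a Charset object,
            // so it compiles and runs on JDK 1.5.
            return new String(contents, startsAt, byteLen, charsetName);
        } catch (UnsupportedEncodingException e) {
            // US-ASCII and UTF-16LE are required on every Java platform,
            // so this branch should be unreachable for those two names.
            throw new IllegalStateException("Missing required charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        byte[] ascii = { 'H', 'i' };
        byte[] utf16le = { 'H', 0, 'i', 0 };
        System.out.println(decode(ascii, 0, ascii.length, "US-ASCII"));
        System.out.println(decode(utf16le, 0, utf16le.length, "UTF-16LE"));
    }
}
```

Passing the charset name as a string moves the failure mode to a checked UnsupportedEncodingException, which the sketch wraps because both names used here are guaranteed to be available.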

Comment 2 Luke Quinane 2012-07-25 05:34:42 UTC
Created attachment 29111
Patch to fix 1.5 support and add tests.

I think my changes in the latest patch should fix the JDK 1.5 support. I've also added two unit tests and a sample file.


Comment 3 Yegor Kozlov 2012-07-25 15:45:57 UTC
Patch applied in r1365638