Bug 52320 - Conversion to html : Problem with œ
Status: RESOLVED INVALID
Alias: None
Product: POI
Classification: Unclassified
Component: HWPF
Version: 3.8-dev
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: POI Developers List
 
Reported: 2011-12-12 10:06 UTC by Benoit MAGGI
Modified: 2011-12-16 13:09 UTC



Attachments
Eclipse project with an example of the œ problem with poi WordToHtmlConverter (11.11 KB, application/x-zip-compressed)
2011-12-12 10:06 UTC, Benoit MAGGI

Description Benoit MAGGI 2011-12-12 10:06:27 UTC
Created attachment 28065
Eclipse project with an example of the œ problem with POI WordToHtmlConverter

I have a problem with WordToHtmlConverter (poi-3.8-beta4-20110826.jar): the character œ is replaced by ? in the converted output.

The .doc input is:

Cœur 

The HTML output is:

C?ur 

œ is the ligature described at http://en.wikipedia.org/wiki/%C5%92

The attached Eclipse project reproduces the bug.

To run it, add two libraries to the project (in the lib directory):
  - poi-3.8-beta4-20110826.jar
  - poi-scratchpad-3.8-beta4-20110826.jar
Comment 1 Yegor Kozlov 2011-12-16 12:42:22 UTC
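The problem is in the example code rather than in POI: String.getBytes() with no argument encodes with the platform default charset, which on your machine apparently cannot represent œ, so it is written as ?.

In Convertor.java change line 45 from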

return result.getBytes();

to

return result.getBytes("UTF-8");

and you will be good.
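
For illustration, here is a minimal sketch of the difference, independent of POI. The class EncodingDemo and its two helper methods are hypothetical names, not part of the attached project. ISO-8859-1 stands in for a platform default charset that has no œ (the actual default on the reporter's machine is not stated), and StandardCharsets assumes Java 7 or later; on older JVMs getBytes("UTF-8") as above does the same job.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the attached Eclipse project.
final class EncodingDemo {

    // Encodes with a charset that cannot represent œ (stand-in for an
    // unsuitable platform default). Unmappable characters become '?'.
    public static byte[] toLatin1(String html) {
        return html.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Encodes with an explicit UTF-8 charset, which can represent œ.
    public static byte[] toUtf8(String html) {
        return html.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Cœur", written with a Unicode escape so the source file
        // encoding does not affect the demo.
        String html = "C\u0153ur";

        // Decoding the Latin-1 bytes again shows the data loss.
        String lossy = new String(toLatin1(html), StandardCharsets.ISO_8859_1);
        System.out.println(lossy); // prints C?ur

        // The UTF-8 bytes round-trip back to the original string.
        String intact = new String(toUtf8(html), StandardCharsets.UTF_8);
        System.out.println(intact.equals(html)); // prints true
    }
}
```

Whoever reads the bytes back (a browser, a file viewer) must also decode them as UTF-8, for example through a matching charset declaration in the HTML.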

Yegor
Comment 2 Benoit MAGGI 2011-12-16 13:09:38 UTC
Thx. Shame on me for this.