Bug 52320

Summary: Conversion to html : Problem with œ
Product: POI
Reporter: Benoit MAGGI <benoit.maggi>
Component: HWPF
Assignee: POI Developers List <dev>
Status: RESOLVED INVALID    
Severity: normal    
Priority: P2    
Version: 3.8-dev   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   
Attachments: Eclipse project with an example of the œ problem with poi WordToHtmlConverter

Description Benoit MAGGI 2011-12-12 10:06:27 UTC
Created attachment 28065 [details]
Eclipse project with an example of the œ problem with poi WordToHtmlConverter

I have a problem with WordToHtmlConverter (poi-3.8-beta4-20110826.jar):

The doc input is:

Cœur

The HTML output is:

C?ur

The œ character is described at http://en.wikipedia.org/wiki/%C5%92


The attached Eclipse project contains an example of the bug.

You have to add two libraries to the project (in the lib directory):
  - poi-3.8-beta4-20110826.jar
  - poi-scratchpad-3.8-beta4-20110826.jar
Comment 1 Yegor Kozlov 2011-12-16 12:42:22 UTC
In Convertor.java change line 45 from

return result.getBytes();

to

return result.getBytes("UTF-8");

and you will be good.

Yegor
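
A minimal sketch of why the suggested change helps, under the assumption that the platform default charset in effect could not represent œ (the report does not say which charset that was). ISO-8859-1 stands in for that default here, and the class and method names are hypothetical, not taken from the attached project. The no-argument getBytes() uses the platform default charset and silently replaces unmappable characters, typically with '?', whereas naming UTF-8 explicitly keeps the character intact.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the attached Eclipse project.
public class EncodingDemo {

    // Stand-in for result.getBytes() on a platform whose default charset
    // lacks the œ character (assumption: ISO-8859-1 is used for illustration).
    static byte[] encodeLatin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Same effect as result.getBytes("UTF-8"), without the checked exception.
    static byte[] encodeUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String html = "C\u0153ur"; // "Cœur", with œ written as U+0153

        byte[] lossy = encodeLatin1(html);
        byte[] utf8 = encodeUtf8(html);

        // The unmappable character has been replaced during encoding.
        System.out.println(new String(lossy, StandardCharsets.ISO_8859_1)); // prints C?ur

        // UTF-8 round-trips the original string; œ takes two bytes.
        System.out.println(utf8.length); // prints 5
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(html)); // prints true
    }
}
```

The bytes must also be read back as UTF-8 by whatever consumes them (for example a browser honouring the charset declared for the page), otherwise the character can still be displayed incorrectly.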
Comment 2 Benoit MAGGI 2011-12-16 13:09:38 UTC
Thx. Shame on me for this.