Bug 24570

Summary: POI does not correctly extract properties containing Chinese characters
Product: POI
Reporter: Yanjun Liao <Yanjun.Liao>
Component: HPSF
Assignee: POI Developers List <dev>
Status: CLOSED FIXED    
Severity: normal    
Priority: P3    
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: All   
Attachments: A file of properties in chinese characters

Description Yanjun Liao 2003-11-10 17:40:42 UTC
I am using the jakarta-poi-1.10.0-dev version. I also tried 2.0-pre3, and it seems to
have the same problem. I have a file whose properties are in Chinese characters. POI
can extract the properties (DocumentSummaryInformation and SummaryInformation),
but the characters are all garbled after extraction. I believe the TypeReader class
does not handle the encoding correctly. My OS is an English
operating system, but that should not be a reason for POI to fail:
I have a similar program, written in C++, that extracts the properties
correctly. BTW, I tried Japanese characters and got the same
problem. I will attach the file. Hopefully this problem can be addressed
soon, because it is on the critical path of my project now. Thanks.
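
To illustrate the symptom described above, here is a minimal sketch of how multi-byte text gets garbled when its bytes are decoded with a single-byte charset. This is not POI's TypeReader code; the class name MojibakeDemo is hypothetical, and UTF-8 is used only as a stand-in, since the report does not state which encoding the attached file actually uses.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of POI.
class MojibakeDemo {

    /** Decodes raw string bytes using the given charset. */
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        String title = "\u4e2d\u6587"; // two Chinese characters
        // Assumption: UTF-8 stands in for the document's real encoding.
        byte[] raw = title.getBytes(StandardCharsets.UTF_8);

        // Wrong charset: every byte becomes its own character.
        String garbled = decode(raw, StandardCharsets.ISO_8859_1);
        // Right charset: the original text comes back.
        String correct = decode(raw, StandardCharsets.UTF_8);

        System.out.println("garbled length: " + garbled.length()); // 6
        System.out.println("correct length: " + correct.length()); // 2
        System.out.println("round trip ok: " + correct.equals(title));
    }
}
```

The bytes themselves are intact in both cases; only the choice of charset at decode time differs, which would explain why a different extractor can read the same file correctly.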
Comment 1 Yanjun Liao 2003-11-10 17:41:34 UTC
Created attachment 9024
A file of properties in chinese characters
Comment 2 Rainer Klute 2003-11-10 21:49:05 UTC
I am willing to work on this. However, I'd need some general information about
code pages. If someone has a hint (link), please let me know!
Comment 3 Rainer Klute 2003-12-02 17:50:19 UTC
I just added codepage support to the CVS HEAD. Your sample document looks okay
to me (which doesn't necessarily mean anything). Please get the HEAD from the
CVS repository and cross check!
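
As a rough illustration of what codepage support involves in general, the sketch below maps a Windows codepage number (as stored in a property set's codepage property) to a Java charset name and decodes string bytes with it. This is not the code committed to CVS HEAD; the class CodepageDecoder, its method names, and the small selection of codepages are assumptions made for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the actual POI implementation.
class CodepageDecoder {

    /** Maps a Windows codepage number to a Java charset name. */
    static String charsetNameFor(int codepage) {
        switch (codepage) {
            case 932:   return "windows-31j";  // Japanese
            case 936:   return "GBK";          // Simplified Chinese
            case 950:   return "Big5";         // Traditional Chinese
            case 1200:  return "UTF-16LE";     // Unicode
            case 1252:  return "windows-1252"; // Western European
            case 65001: return "UTF-8";
            default:
                throw new IllegalArgumentException(
                        "Unsupported codepage: " + codepage);
        }
    }

    /** Decodes raw string bytes according to the given codepage. */
    static String decode(byte[] raw, int codepage) {
        return new String(raw, Charset.forName(charsetNameFor(codepage)));
    }

    public static void main(String[] args) {
        String title = "\u4e2d\u6587";
        byte[] raw = title.getBytes(StandardCharsets.UTF_16LE);
        // Codepage 1200 selects UTF-16LE, so the text round-trips.
        System.out.println(decode(raw, 1200).equals(title)); // true
    }
}
```

Note that charsets such as GBK and windows-31j are not among the charsets every Java runtime is required to provide, so a real implementation would need to handle the case where Charset.forName fails.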
Comment 4 Yanjun Liao 2004-02-02 16:48:57 UTC
Hi, I just downloaded poi-bin-2.0-final-20040126.zip and
poi-bin-2.0-RC2-20040102.zip. Neither of them seems to solve the problem. Has your
fix gone into these two releases yet? If not, how can I get the fix without using
CVS? I am behind a company firewall. Thanks a lot.
Comment 5 Rainer Klute 2004-02-02 20:56:10 UTC
Sorry, but the codepage support is in the CVS HEAD only, not in the 2.0 release.
However, I don't know what you could do to get it through a firewall.
Comment 6 Yanjun Liao 2004-02-03 19:31:17 UTC
Hi, I got the snapshot from http://cvs.apache.org/snapshots/jakarta-poi/. It 
works fine. Thanks.