Created attachment 24981 [details] apr-iconv-1.2.1-cp932-patch.txt
The CP932 conversion table in apr-iconv-1.2.1 differs from the one used by Windows. My patch (apr-iconv-1.2.1-cp932-patch.txt) corrects the problem. My other patch (apr-iconv-1.2.1-cp932-patch2.txt) also corrects the problem, and adds some conversions for compatibility with Java and glibc. The cp932_roundtrip.html file in cp932_roundtrip.tgz describes which conversions differ from Windows'. (It also describes the conversion tables of other libraries and languages.)
Created attachment 24982 [details] apr-iconv-1.2.1-cp932-patch2.txt
Created attachment 24983 [details] cp932_roundtrip.tar.bz2
cp932_roundtrip.tgz was too large for this Bugzilla, so I recompressed it with bzip2 and renamed it.
Is there anyone who is interested in this problem?
This seems like work that should be incorporated into apr-iconv, but it is confusing because the extensions to cp932 are not mentioned in either Microsoft's or Unicode.org's descriptions of cp932. Microsoft has a long history of calling different character sets by the same name, though. Is there an unambiguous name for this extension of the old cp932?
What does "the extensions to cp932" mean? (apr-iconv-1.2.1-cp932-patch.txt or apr-iconv-1.2.1-cp932-patch2.txt?) Does "the old cp932" mean the current implementation in apr-iconv?
----
apr-iconv-1.2.1-cp932-patch.txt is the same as the Windows table.

The Unicode.org CP932-to-Unicode table is http://unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT . It is almost the same as the Windows table. (0x80 and the characters in the private use area appear only in the Windows table.)

The Unicode.org Unicode-to-CP932 table is the reverse of CP932.TXT. However, the Unicode-to-CP932 mappings of some characters are ambiguous, because more than one CP932 code maps to the same Unicode character. To remove the ambiguity, consider http://support.microsoft.com/default.aspx?scid=kb;en-us;Q170559 . After removing the ambiguity, we get a Unicode-to-CP932 table. It is almost the same as the Windows table. (U+0080 and the characters in the private use area appear only in the Windows table.)
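
To illustrate why reversing a CP932-to-Unicode table is ambiguous, here is a minimal Python sketch. It is not part of either patch and does not touch apr-iconv. It assumes that Python's built-in "cp932" codec is a reasonable stand-in for the Windows table, which may not hold in every detail (for example 0x80 and the private use area). All function names are made up for this example. It decodes every candidate CP932 code, groups the codes by the Unicode character they produce, and reports the characters that can be reached from more than one code, together with the code the codec picks when encoding.

```python
from collections import defaultdict

# Assumption: Python's "cp932" codec stands in for the Windows table.
CODEC = "cp932"


def decode_single(raw):
    """Return the one character that raw decodes to, or None if unmapped."""
    try:
        text = raw.decode(CODEC)
    except UnicodeDecodeError:
        return None
    return text if len(text) == 1 else None


def candidate_codes():
    """Yield every single-byte code and every plausible double-byte code."""
    for b in range(0x100):
        yield bytes([b])
    lead_bytes = list(range(0x81, 0xA0)) + list(range(0xE0, 0xFD))
    for lead in lead_bytes:
        for trail in range(0x40, 0xFD):
            if trail != 0x7F:  # 0x7F is never a trail byte
                yield bytes([lead, trail])


def build_reverse_table():
    """Map each Unicode character to the list of CP932 codes that decode to it."""
    reverse = defaultdict(list)
    for raw in candidate_codes():
        ch = decode_single(raw)
        if ch is not None:
            reverse[ch].append(raw)
    return reverse


def ambiguous_entries(reverse):
    """Keep only the characters that have more than one CP932 code."""
    return {ch: codes for ch, codes in reverse.items() if len(codes) > 1}


def preferred_code(ch):
    """Return the code the codec chooses when encoding ch, or None."""
    try:
        return ch.encode(CODEC)
    except UnicodeEncodeError:
        return None


if __name__ == "__main__":
    ambiguous = ambiguous_entries(build_reverse_table())
    print(len(ambiguous), "characters have more than one CP932 code")
    for ch, codes in sorted(ambiguous.items())[:5]:
        chosen = preferred_code(ch)
        print("U+%04X" % ord(ch),
              [c.hex() for c in codes],
              "->", chosen.hex() if chosen else None)
```

Each listed character is one where the reverse table has to choose a single code. Comparing the choice made by a given library against the rules in the Microsoft article is one way to check whether its Unicode-to-CP932 direction matches Windows.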