Summary: | IBM-1047 EBCDIC codepage not supported? | ||
---|---|---|---|
Product: | Xerces-J | Reporter: | bauman |
Component: | SAX | Assignee: | Xerces-J Developers Mailing List <xerces-j-dev> |
Status: | NEW --- | ||
Severity: | blocker | ||
Priority: | P3 | ||
Version: | 1.4.4 | ||
Target Milestone: | --- | ||
Hardware: | Other | ||
OS: | other |
Description
bauman
2002-05-23 21:54:12 UTC
I would guess that the main reason IBM-1047 is not supported in Xerces is that the codepage is not mentioned in the IANA character set list. Since there is no standard interoperable name that one could use for documents in that encoding, it is not listed in the encoding name table. Without a registered standard name, it is difficult to envision all XML processors being able to process such a document, even if they supported the codepage.

It looks like IBM-1047 has made it to the IANA list. http://www.iana.org/assignments/character-sets contains the following:

Name: IBM1047 [Robrigado]
MIBenum: 2102
Source: IBM1047 (EBCDIC Latin 1/Open Systems) http://www-1.ibm.com/servers/eserver/iseries/software/globalization/pdf/cp01047z.pdf
Alias: IBM-1047

Support for this encoding is in Xerces-J2.

Yes, I got that. My manager and I petitioned the IANA people for it and they added it in.
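
For anyone who wants to check whether the names registered above resolve on their own JVM, here is a minimal sketch. It only uses the JDK's `java.nio.charset.Charset` class and says nothing about Xerces' own encoding name table. The class name `Ibm1047Check` and the helper `isSupported` are made up for illustration, and whether `IBM1047` or the `IBM-1047` alias is available depends on the JVM and its installed charsets, so no particular output is claimed.

```java
import java.nio.charset.Charset;

public class Ibm1047Check {
    // Hypothetical helper: true if this JVM has a charset registered under the given name.
    // Illegal names (and null) raise IllegalArgumentException subclasses; treat them as unsupported.
    static boolean isSupported(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The IANA name and alias quoted in this report, plus UTF-8 as a baseline.
        for (String name : new String[] {"IBM1047", "IBM-1047", "UTF-8"}) {
            System.out.println(name + " supported: " + isSupported(name));
        }

        // If the JVM knows the codepage, decode the EBCDIC bytes that the XML 1.0
        // specification lists for the start of an XML declaration ("<?xm").
        if (isSupported("IBM1047")) {
            byte[] ebcdic = {(byte) 0x4C, (byte) 0x6F, (byte) 0xA7, (byte) 0x94};
            System.out.println(new String(ebcdic, Charset.forName("IBM1047")));
        }
    }
}
```

A document in this encoding would still need to name it in its XML declaration, using the registered name or alias, for a parser to pick the right decoder.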