Bug 9368 - IBM-1047 EBCDIC codepage not supported?
Summary: IBM-1047 EBCDIC codepage not supported?
Status: NEW
Alias: None
Product: Xerces-J
Classification: Unclassified
Component: SAX
Version: 1.4.4
Hardware: Other other
Importance: P3 blocker
Target Milestone: ---
Assignee: Xerces-J Developers Mailing List
Depends on:
Reported: 2002-05-23 21:54 UTC by bauman
Modified: 2004-11-16 19:05 UTC


Description bauman 2002-05-23 21:54:12 UTC
There doesn't seem to be a supported encoding for the Latin-1 EBCDIC codepage 
for OS/390 (IBM-1047) in Xerces-J.  Xerces-C allows "ibm-1047-s390", but why not 
a supported encoding for Xerces-J?  I would use "ebcdic-cp-us", but there are 
some basic differences in the codepages (like the "[" and "]" characters) that 
make this a showstopper.
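For anyone who wants to see the codepage difference for themselves, here is a minimal sketch in plain Java (not Xerces code). It assumes the JRE exposes the two codepages under the charset names "IBM037" (the IANA name that "ebcdic-cp-us" is an alias of) and "IBM1047"; whether both are installed depends on the JRE, so the helper returns null when one is missing. The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;

public class EbcdicBracketCheck {

    // Encodes text in the named charset and returns the bytes as upper-case hex.
    // Returns null if this JRE does not provide the charset (assumption: the
    // names "IBM037" and "IBM1047" are the ones the JRE registers).
    static String hex(String text, String charsetName) {
        if (!Charset.isSupported(charsetName)) {
            return null;
        }
        byte[] bytes = text.getBytes(Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "A" is included as a control: letters sit at the same code point in
        // both codepages, while the square brackets do not.
        String[] samples = {"[", "]", "A"};
        for (String s : samples) {
            System.out.println("'" + s + "'"
                    + "  IBM037=" + hex(s, "IBM037")
                    + "  IBM1047=" + hex(s, "IBM1047"));
        }
    }
}
```

If both charsets are available, the bracket rows should show different byte values in the two columns, which is why a document written in IBM-1047 cannot simply be labelled "ebcdic-cp-us".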
Comment 1 Glenn Marcy 2002-05-23 22:45:34 UTC
I would guess that the main reason that IBM-1047 is not supported in Xerces
is that there is no mention of that codepage in the IANA character set list.
Since there is no standard interoperable name that one could use for documents 
with that encoding, it is not listed in the encoding name table.  Without a
registered standard name, it is difficult to envision that all XML processors
could process such a document, even if they supported the codepage.
Comment 2 Matthew Sykes 2003-06-02 21:26:02 UTC
It looks like IBM-1047 has made it to the IANA list.

http://www.iana.org/assignments/character-sets contains the following:

Name: IBM1047                                                [Robrigado]
MIBenum: 2102
Source: IBM1047 (EBCDIC Latin 1/Open Systems)
Alias: IBM-1047
Comment 3 Michael Glavassevich 2003-06-02 21:36:12 UTC
Support for this encoding is in Xerces-J2.
Comment 4 bauman 2003-06-09 17:57:21 UTC
Yes, I got that.  My manager and I petitioned the IANA people for it and they 
added it in.