I get an exception when parsing the following simple XML:

------------begin of xml--------------
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <!-- Value of tag 'child1' is a character 0x96 -->
  <child1>–</child1>
</root>
------------end of xml--------------

Here is the exception, using the DOMParser with Xalan 1.2.2 and Xerces 1.4.4:

------------begin of the exception--------------
org.xml.sax.SAXParseException: The element type "child1" must be terminated by the matching end-tag "</child1>".
    at org.apache.xerces.framework.XMLParser.reportError(XMLParser.java:1016)
    at org.apache.xerces.framework.XMLDocumentScanner.reportFatalXMLError(XMLDocumentScanner.java:634)
    at org.apache.xerces.framework.XMLDocumentScanner.abortMarkup(XMLDocumentScanner.java:683)
    at org.apache.xerces.framework.XMLDocumentScanner$ContentDispatcher.dispatch(XMLDocumentScanner.java:1187)
    at org.apache.xerces.framework.XMLDocumentScanner.parseSome(XMLDocumentScanner.java:380)
    at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:908)
    at TestParser.parseToNode(TestParser.java:57)
    at TestParser.run(TestParser.java:26)
    at TestParser.main(TestParser.java:18)
------------end of the exception--------------

Here is my Java code to parse the above XML data:

    public static Node parseToNode(InputStream stream) throws Exception {
        DOMParser parser = new DOMParser();
        parser.setFeature("http://xml.org/sax/features/validation", true);
        parser.parse(new InputSource(stream));
        Document dom = parser.getDocument();
        Node node = dom.getFirstChild();
        return node;
    }

(Just happened to be looking at old bugs) This is actually a Xerces question (or perhaps a user error/illegal XML char question?) not a Xalan one.
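
If the file really contains the single raw byte 0x96 while declaring encoding="UTF-8" (an assumption, since the report does not show the bytes on disk), then the input is not valid UTF-8, which would fit the "user error/illegal XML char" reading. The sketch below is not the reporter's code: it uses the JDK's JAXP parser rather than Xerces 1.4.4, and the class and method names (EncodingCheck, isValidUtf8, parseChild1, sampleBytes) are made up for illustration. It checks the bytes against a strict UTF-8 decoder, then decodes them as windows-1252 (where 0x96 is the en dash, U+2013) and hands the parser a Reader so that the encoding declaration is not used for decoding.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class EncodingCheck {

    // Returns true only if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Decodes the bytes with the given charset and parses the resulting
    // character stream; returns the text content of <child1>.
    static String parseChild1(byte[] xml, Charset charset) throws Exception {
        Reader reader = new InputStreamReader(new ByteArrayInputStream(xml), charset);
        Document dom = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(reader));
        return dom.getElementsByTagName("child1").item(0).getTextContent();
    }

    // Builds the document from the report with a raw 0x96 byte inside <child1>
    // (assumed to be what the original file contained).
    static byte[] sampleBytes() {
        byte[] head = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root><child1>"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] tail = "</child1></root>".getBytes(StandardCharsets.US_ASCII);
        byte[] all = new byte[head.length + 1 + tail.length];
        System.arraycopy(head, 0, all, 0, head.length);
        all[head.length] = (byte) 0x96;
        System.arraycopy(tail, 0, all, head.length + 1, tail.length);
        return all;
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = sampleBytes();
        System.out.println("valid UTF-8: " + isValidUtf8(xml));
        String text = parseChild1(xml, Charset.forName("windows-1252"));
        System.out.println("child1 code point: U+"
                + Integer.toHexString(text.charAt(0)).toUpperCase());
    }
}
```

If that assumption holds, the cleaner fix is probably on the data side: either save the file as real UTF-8 or change the declaration to match the bytes actually written, so the parser can be given the InputStream directly as in the original parseToNode.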