Bug 10511 - Ant not running with Xerces2J delivered with Ant
Summary: Ant not running with Xerces2J delivered with Ant
Alias: None
Product: Ant
Classification: Unclassified
Component: Build Process
Version: 1.5
Hardware: PC All
Importance: P3 normal
Target Milestone: ---
Assignee: Ant Notifications List
Keywords: Xerces2
Depends on:
Reported: 2002-07-05 14:54 UTC by Holger von Thomsen
Modified: 2008-02-22 12:18 UTC


Description Holger von Thomsen 2002-07-05 14:54:14 UTC

When I try to run Ant with Xerces2J, which has shipped with the Ant
distributions since Ant 1.5, I get an
'invalid byte 2 of 3-byte UTF-8 sequence'
error in my build.xml.

With the old xerces.jar, everything works fine.

The Java VM is JDK 1.3.1_02-b02 from BEA WebLogic 6.1.
Comment 1 Holger von Thomsen 2002-07-06 09:40:02 UTC

I found out that I had some 'ß' characters in comments in my build.xml file.
After removing them, everything seems to work fine.

This bug seems to be related to Bug #9551, assigned to the Xalan-dev mailing
list.

According to the output of 'build -debug', this seems to be a problem within
the UTF8Reader class of Xerces2J.
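
To locate characters like the 'ß' mentioned above, it can help to find the first byte in a file that is not valid UTF-8. The following is a minimal sketch using the standard java.nio.charset decoder, not Xerces itself; the class name Utf8Check and the method name firstMalformedOffset are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Ant or Xerces.
class Utf8Check {
    /** Returns the offset of the first byte sequence that is not valid UTF-8, or -1 if all bytes decode. */
    static int firstMalformedOffset(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(data);
        // UTF-8 never produces more chars than it consumes bytes.
        CharBuffer out = CharBuffer.allocate(data.length);
        CoderResult result = decoder.decode(in, out, true);
        // On an error, the input position is left at the start of the bad sequence.
        return result.isError() ? in.position() : -1;
    }

    public static void main(String[] args) {
        // An XML comment containing 'ß' (U+00DF), encoded as ISO-8859-1.
        byte[] latin1 = "<!-- Stra\u00dfe -->".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("first invalid UTF-8 byte at offset " + firstMalformedOffset(latin1));
    }
}
```

For a real build file, the bytes could be read with java.nio.file.Files.readAllBytes and passed to the same method.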
Comment 2 Holger von Thomsen 2002-07-19 12:35:47 UTC

I finally found the main error in my build.xml: a missing
<?xml version="1.0" encoding=".."?> line.

The error seems to be that Xerces2-J has problems detecting the encoding when
that line is missing.

Please close bug.
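
The effect of the missing line can be reproduced outside Ant. The sketch below assumes the JAXP parser bundled with a current JDK rather than the Xerces2J build shipped with Ant 1.5, so the exact error text may differ; the class EncodingDeclarationDemo and its helpers are hypothetical names. It parses the same ISO-8859-1 bytes once without and once with an encoding declaration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Ant.
class EncodingDeclarationDemo {
    /** Encodes the text as ISO-8859-1, the encoding the build file was actually saved in. */
    static byte[] latin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    /** Returns true if the default JAXP parser accepts the bytes, false on a parse or decoding error. */
    static boolean parses(byte[] xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xml));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String body = "<project name=\"demo\"><!-- Stra\u00dfe --></project>";
        String declaration = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>";
        System.out.println("without declaration: " + parses(latin1(body)));
        System.out.println("with declaration:    " + parses(latin1(declaration + body)));
    }
}
```

Without the declaration the parser has to treat the bytes as UTF-8, and the ISO-8859-1 byte for 'ß' is not a valid UTF-8 sequence in that position.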
Comment 3 Steve Loughran 2002-07-19 16:33:14 UTC
What encoding do you have to put at the top of the build file? UTF-8? Or
something like ISO-8859-1? If the encoding is anything other than UTF-8, then it
isn't a Xerces bug; it is the XML spec that says to ignore the locale and assume
UTF-8 when no encoding is declared.

Comment 4 Holger von Thomsen 2002-07-19 16:37:54 UTC

It seems to be ISO-8859-1. I think I will convert the build.xml to UTF-8 as soon
as possible.
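
One possible way to do such a conversion is to decode the bytes as ISO-8859-1 and re-encode them as UTF-8. This is only a sketch under the assumption that the file really is ISO-8859-1 throughout; Latin1ToUtf8 and toUtf8 are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Ant.
class Latin1ToUtf8 {
    /** Re-encodes ISO-8859-1 bytes as UTF-8. Every byte value is a valid ISO-8859-1 character, so decoding cannot fail. */
    static byte[] toUtf8(byte[] latin1Bytes) {
        String text = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] converted = toUtf8("Stra\u00dfe".getBytes(StandardCharsets.ISO_8859_1));
        // 'ß' takes one byte in ISO-8859-1 and two bytes in UTF-8.
        System.out.println("UTF-8 length: " + converted.length);
    }
}
```

For a file on disk, the bytes could be read and written with java.nio.file.Files.readAllBytes and Files.write. If the file carries an encoding declaration, that declaration has to be changed to match the new encoding.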

But the fact that the <?xml version="1.0"?> line was missing from the XML file
altogether is against the XML spec too, isn't it? Or do I misunderstand this?
Comment 5 Steve Loughran 2002-07-19 17:52:25 UTC
The <?xml ?> declaration is optional in Ant, though it is how XML
parsers distinguish UTF-16 from UTF-8 encodings (if the first few bytes look
like the UTF-16 representation of <?, then it is UTF-16; otherwise UTF-8).
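
The check described above can be sketched roughly as follows. This is a simplified illustration based on the non-normative autodetection notes in the XML 1.0 recommendation (Appendix F), not the code Xerces uses; it also looks at byte order marks, ignores UCS-4 and EBCDIC, and the names EncodingSniffer and guessEncoding are hypothetical.

```java
// Hypothetical helper, not part of Ant or Xerces.
class EncodingSniffer {
    /** Guesses UTF-16BE, UTF-16LE or UTF-8 from the first bytes of an XML document. */
    static String guessEncoding(byte[] b) {
        if (b.length >= 2) {
            int b0 = b[0] & 0xFF, b1 = b[1] & 0xFF;
            if (b0 == 0xFE && b1 == 0xFF) return "UTF-16BE"; // byte order mark
            if (b0 == 0xFF && b1 == 0xFE) return "UTF-16LE"; // byte order mark
        }
        if (b.length >= 4) {
            int b0 = b[0] & 0xFF, b1 = b[1] & 0xFF, b2 = b[2] & 0xFF, b3 = b[3] & 0xFF;
            // "<?" as two 16-bit code units, without a byte order mark
            if (b0 == 0x00 && b1 == 0x3C && b2 == 0x00 && b3 == 0x3F) return "UTF-16BE";
            if (b0 == 0x3C && b1 == 0x00 && b2 == 0x3F && b3 == 0x00) return "UTF-16LE";
        }
        // Anything else is treated as UTF-8 until an encoding declaration says otherwise.
        return "UTF-8";
    }

    public static void main(String[] args) {
        byte[] plain = {'<', 'p', 'r', 'o', 'j', 'e', 'c', 't', '>'};
        System.out.println(guessEncoding(plain));
    }
}
```

A file saved as ISO-8859-1 without a declaration falls through to the UTF-8 default, which is why its non-ASCII bytes are then rejected.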

If you want to use anything other than UTF-8 or UTF-16 in any XML file, you must
include a header that states the encoding. This may seem inconvenient, but what
it does do is guarantee the same behavior, regardless of who is parsing your XML