XSLT transformation does not work if I run ANT from a directory whose path contains non-ASCII characters. There is more than one problem.

The first problem is in Xalan itself; see the related Xalan bug: http://issues.apache.org/jira/browse/XALANJ-2000

If I patch Xalan with the patch from that bug, I can include one XSLT stylesheet from another, but then the next problem appears: ENTITY declarations are not resolved correctly. To reproduce, create a directory with non-ASCII characters in its name and transform an XML file like this one through ANT (using the fixed Xalan):

====================================
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE workspace [
  <!ENTITY globaldefinitions SYSTEM "globaldef.xml">
  <!ENTITY globaldefinitions_bc SYSTEM "globaldef_bc.xml">
  <!ENTITY installationtypes SYSTEM "insttypes.xml">
]>
<workspace>
  &globaldefinitions;
  &globaldefinitions_bc;
  &installationtypes;
  ....
====================================

You then receive an error like this one:

====================================
C:\systinet\divný_link\etc\bin>xslt tools.xsl tools.xml bla
JAVA_HOME=D:\soft\dev\java
Buildfile: D:\bin\xslt.xml

xsl:
     [xslt] Processing C:\systinet\divnø_link\etc\bin\tools.xml to C:\systinet\divnø_link\etc\bin\bla
     [xslt] Loading stylesheet C:\systinet\divnø_link\etc\bin\tools.xsl
     [xslt] : Fatal Error! java.net.MalformedURLException: no protocol: globaldef.xml Cause: java.net.MalformedURLException: no protocol: globaldef.xml
     [xslt] Failed to process C:\systinet\divnø_link\etc\bin\tools.xml
====================================

If I use the fixed Xalan directly, everything seems to be OK. For example, I tried a call like this one:

java -verbose -Xbootclasspath/p:xalan.jar org.apache.xalan.xslt.Process -XSL tools.xsl -IN tools.xml -OUT pokus

That call ends correctly. It seems that the problem is somewhere in ANT's implementation of EntityResolver. Please fix it. Thanks.

Verbose output will follow as a bug attachment.
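For readers unfamiliar with the message in the log: "no protocol: globaldef.xml" is what java.net.URL reports when it is handed a relative system identifier with no base URL to resolve it against. The sketch below only illustrates that JDK behaviour; it does not show where in Ant, Xalan, or the parser the base is lost. The class name, the helper tryResolve, and the percent-encoded base URL are made up for the illustration.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class RelativeSystemIdDemo {

    // Hypothetical helper: resolves systemId against base (which may be null)
    // and returns either the resolved URL or the exception message.
    @SuppressWarnings("deprecation") // URL constructors are deprecated in recent JDKs
    static String tryResolve(String base, String systemId) {
        try {
            URL context = (base == null) ? null : new URL(base);
            return new URL(context, systemId).toString();
        } catch (MalformedURLException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // No usable base: the relative name cannot be turned into a URL.
        System.out.println(tryResolve(null, "globaldef.xml"));

        // Assumed, already percent-encoded base URL for the document in the report.
        String base = "file:/C:/systinet/divn%C3%BD_link/etc/bin/tools.xml";
        System.out.println(tryResolve(base, "globaldef.xml"));
    }
}
```

The first call reproduces the message from the log; the second shows that the same relative name resolves once a well-formed base URL is available.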
Created attachment 16869 [details] Errors from ANT

Error output from ANT when it tries to transform an XML file that references other XML files by relative path, run from a directory with non-ASCII characters in its path (e.g. "C:\systinet\divný_link\etc\bin").
One additional note: if I use the same, unchanged files from a directory whose path contains only standard ASCII characters, the problem disappears.
Hello,

I am not sure yet what the problem is, but I have recently been working on a similar issue, for which a small fix was made in SVN in the class AntClassLoader. Someone wrote on the user list that Ant does not run at all anyway if ANT_HOME contains non-ASCII characters.

Cheers,
Antoine
Created attachment 16874 [details] file containing elements to reproduce problem (tar.gz)

I found out that this problem is not a bug in Ant. I need to investigate further. At this stage, I think that it is a bug in the version of Xerces which ships in JDK 1.5 (and maybe 1.4). AFAIK javax.xml.parsers.DocumentBuilder#parse(File) and javax.xml.parsers.DocumentBuilder#parse(InputSource) are unable to resolve file entities when the entities are located in directories containing non-ASCII characters.

To be precise, I did my tests under JDK 1.5.0_04-b05 on Windows 2000.

The next verifications to be done are:
1) Has this problem been reported/solved with the Apache XML project?
2) Has this problem been reported/solved with Sun?
The SVN version of Xerces-Java 2 still has the same bug. I searched JIRA for issues that could be related:

http://issues.apache.org/jira/browse/XERCESJ-996 getBaseURI returns incorrect value for EntityReferences

http://issues.apache.org/jira/browse/XERCESJ-802 Filenames with spaces in versions later than 2.0.2 (incl).
Hello Radim,

I think that part of the problem might lie in the class javax.xml.parsers.DocumentBuilder, which is part of the JDK. The method DocumentBuilder#parse does this:

    String uri = "file:" + f.getAbsolutePath();
    if (File.separatorChar == '\\') {
        uri = uri.replace('\\', '/');
    }

Instead, it should do this:

    String uri = f.toURI().toASCIIString();

At least, this change makes it possible to parse a document containing an entity include when, as in your case, both the main XML document and the entity include are located in a path containing non-ASCII characters, such as c:/änt.

I have filed a bug with Sun describing what I have found. I am closing this report as INVALID. Reopen this report only if you can prove that Ant is faulty.
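To make the difference between the two URI constructions quoted above concrete, here is a small sketch that puts them side by side. The class and method names are made up for the comparison, and the relative path is only an example; the two method bodies follow the snippets in the comment.

```java
import java.io.File;

public class FileUriDemo {

    // The construction quoted from DocumentBuilder#parse: "file:" plus the raw
    // path, so non-ASCII characters stay unescaped in the system identifier.
    static String naiveUri(File f) {
        String uri = "file:" + f.getAbsolutePath();
        if (File.separatorChar == '\\') {
            uri = uri.replace('\\', '/');
        }
        return uri;
    }

    // The suggested construction: File#toURI followed by URI#toASCIIString,
    // which percent-encodes non-ASCII characters as UTF-8 octets.
    static String encodedUri(File f) {
        return f.toURI().toASCIIString();
    }

    public static void main(String[] args) {
        // "\u00e4nt" is "änt"; the file does not need to exist for this comparison.
        File f = new File("\u00e4nt", "tools.xml");
        System.out.println(naiveUri(f));   // path still contains the raw "ä"
        System.out.println(encodedUri(f)); // "ä" appears as %C3%A4
    }
}
```

The second form contains only ASCII characters, which is what makes it usable as a base for resolving relative entity references.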
Actually, the fix is here: http://issues.apache.org/bugzilla/show_bug.cgi?id=34913

This message shows the changes made by Michael Glavassovich: http://mail-archives.apache.org/mod_mbox/xml-commons-cvs/200506.mbox/%3c20050603182857.52401.qmail@minotaur.apache.org%3e

A workaround for your problem might be to compile all the code in the file tools/xml-commons-external-src.zip in the Xerces repository and put the resulting jar file on the bootclasspath. The real solution is to get Sun to deliver a new JDK with all these fixes.
There is a Sun bug report concerning this issue: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6341770

I have tested this problem against the HEAD revision of Xerces, and it does not work there either, so I will try to propose a patch.
I changed my mind concerning this issue and decided to fix it in Ant by changing FileUtils.toURI(). I am copying the way Xerces encodes the user dir in org.apache.xerces.impl.XMLEntityManager. FileUtils.toURI() will now encode non-ASCII characters the same way as java.io.File.toURI().toASCIIString().
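As a rough illustration of the kind of encoding described here (not the actual FileUtils.toURI() or XMLEntityManager code), the sketch below percent-encodes the UTF-8 bytes of every character outside a small safe set. The class name, the method encodePath, and the choice of safe characters are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;

public class PathEncodingSketch {

    // Punctuation left unescaped in this sketch; an assumed set, chosen so that
    // path separators and a Windows drive colon survive unchanged.
    private static final String SAFE = "/-_.!~*'():";

    // Hypothetical helper: percent-encodes a forward-slash path byte by byte.
    static String encodePath(String path) {
        StringBuilder sb = new StringBuilder();
        for (byte b : path.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            char c = (char) v;
            boolean plain = v < 128
                    && (Character.isLetterOrDigit(c) || SAFE.indexOf(c) >= 0);
            if (plain) {
                sb.append(c);
            } else {
                // Non-ASCII bytes and unsafe ASCII characters become %XX.
                sb.append(String.format("%%%02X", v));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "\u00fd" is "ý", as in the directory name from the report.
        System.out.println(encodePath("/C:/systinet/divn\u00fd_link/etc/bin/tools.xml"));
    }
}
```

With this encoding the directory name from the original report becomes divn%C3%BD_link, so the resulting path can be used in a file: URI without any non-ASCII characters.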
Fixed in CVS. I have added a testcase in FileUtilsTest, but I need to set up another behavioral test for the xslt task.
*** Bug 36513 has been marked as a duplicate of this bug. ***