Bug 37348 - XSLT transformation doesn't work if I run ANT from directory with NON-ASCII chars
Status: RESOLVED FIXED
Alias: None
Product: Ant
Classification: Unclassified
Component: Core tasks
Version: 1.6.5
Hardware: Other other
Importance: P2 normal
Target Milestone: 1.7.0
Assignee: Ant Notifications List
URL:
Keywords:
Duplicates: 36513
Depends on:
Blocks:
 
Reported: 2005-11-03 18:37 UTC by Radim Literak
Modified: 2008-02-22 12:18 UTC



Attachments
Errors from ANT (13.35 KB, text/plain)
2005-11-03 18:47 UTC, Radim Literak
file containing elements to reproduce problem (tar.gz) (1.35 KB, application/x-gzip)
2005-11-05 00:07 UTC, Antoine Levy-Lambert

Description Radim Literak 2005-11-03 18:37:44 UTC
XSLT transformation doesn't work if I run Ant from a directory whose path
contains non-ASCII characters.

There is more than one problem. One of them is in Xalan itself; see the related
Xalan bug:

http://issues.apache.org/jira/browse/XALANJ-2000

If I patch Xalan with the patch from that bug, then I can include one XSLT
file from another, but the next problem appears: ENTITY declarations are not
resolved correctly.

To reproduce, create a directory with non-ASCII characters in its name and try
to transform an XML file like this one through Ant (using the fixed Xalan):


====================================

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE workspace [
   <!ENTITY globaldefinitions SYSTEM "globaldef.xml">
   <!ENTITY globaldefinitions_bc SYSTEM "globaldef_bc.xml">
   <!ENTITY installationtypes SYSTEM "insttypes.xml">
]>
<workspace>

  &globaldefinitions;
  &globaldefinitions_bc;
  &installationtypes;

  ....

====================================

You then get an error like this one:

====================================

C:\systinet\divný_link\etc\bin>xslt tools.xsl tools.xml bla
JAVA_HOME=D:\soft\dev\java
Buildfile: D:\bin\xslt.xml

xsl:
     [xslt] Processing C:\systinet\divnø_link\etc\bin\tools.xml to C:\systinet\divnø_link\etc\bin\bla
     [xslt] Loading stylesheet C:\systinet\divnø_link\etc\bin\tools.xsl
     [xslt] : Fatal Error! java.net.MalformedURLException: no protocol: globaldef.xml Cause: java.net.MalformedURLException: no protocol: globaldef.xml
     [xslt] Failed to process C:\systinet\divnø_link\etc\bin\tools.xml

====================================
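
For reference, the "no protocol" message above is what java.net.URL reports when it is handed a relative system ID with no base to resolve it against. The sketch below only illustrates that JDK behaviour; it is not Ant's resolver code, and the class name, helper names and the percent-encoded base URI are assumptions made for the example.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

public class RelativeSystemId {

    /** Returns the error message from parsing spec as an absolute URL, or null if it parses. */
    static String parseError(String spec) {
        try {
            // Deprecated in recent JDKs, but it is the call that produces this message.
            new URL(spec);
            return null;
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    /** Resolves a relative system ID against the URI of the document that declares it. */
    static String resolve(String baseUri, String systemId) {
        return URI.create(baseUri).resolve(systemId).toString();
    }

    public static void main(String[] args) {
        // Hypothetical base: the reporter's tools.xml, with the non-ASCII character percent-encoded.
        String base = "file:/C:/systinet/divn%C3%BD_link/etc/bin/tools.xml";
        System.out.println(parseError("globaldef.xml"));
        System.out.println(resolve(base, "globaldef.xml"));
    }
}
```

The relative name resolves without trouble as long as a valid base URI is available, which suggests the base is being lost or rejected somewhere before the entity is resolved.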

If I use the fixed Xalan directly, everything seems to be OK. For example, I
tried a call like this one:

java -verbose -Xbootclasspath/p:xalan.jar org.apache.xalan.xslt.Process -XSL tools.xsl -IN tools.xml -OUT pokus

This call ends correctly.


It seems that the problem is somewhere in Ant's implementation of EntityResolver.
Please fix it. Thanks. Verbose output will follow as a bug attachment.
Comment 1 Radim Literak 2005-11-03 18:47:43 UTC
Created attachment 16869
Errors from ANT

Error output from Ant when it tries to transform an XML file that references
other XML files (by relative path) from a directory with non-ASCII characters
(e.g. "C:\systinet\divný_link\etc\bin").
Comment 2 Radim Literak 2005-11-03 18:56:36 UTC
One additional note: if I use the same files (unchanged) from a directory
whose path contains only standard ASCII characters, the problem disappears.
Comment 3 Antoine Levy-Lambert 2005-11-03 19:40:55 UTC
Hello,
I am not sure yet what the problem is, but I have been working on a similar
issue recently, where a small fix was made in SVN in the class AntClassLoader.
Someone wrote on the user list that Ant does not run at all if ANT_HOME contains
non-ASCII characters anyway.

Cheers,

Antoine
Comment 4 Antoine Levy-Lambert 2005-11-05 00:07:22 UTC
Created attachment 16874
file containing elements to reproduce problem (tar.gz)

I found out that this problem is not a bug in Ant, but I need to investigate
further. At this stage, I think that it is a bug in the version of Xerces
which ships with JDK 1.5 (and maybe 1.4).
AFAIK javax.xml.parsers.DocumentBuilder#parse(File)
and javax.xml.parsers.DocumentBuilder#parse(InputSource) are unable to
resolve file entities when the entities are located in directories containing
non-ASCII characters.
To be precise, I did my tests under JDK 1.5.0_04-b05 on Windows 2000.
The next things to verify are:
1) Has this problem been reported/solved with the Apache XML project?
2) Has this problem been reported/solved with Sun?
Comment 5 Antoine Levy-Lambert 2005-11-05 01:00:30 UTC
The SVN version of Xerces-Java 2 still has the same bug.
I searched JIRA for issues that could be related:
http://issues.apache.org/jira/browse/XERCESJ-996
getBaseURI returns incorrect value for EntityReferences

and 
http://issues.apache.org/jira/browse/XERCESJ-802
Filenames with spaces in versions later than 2.0.2 (incl).
Comment 6 Antoine Levy-Lambert 2005-11-05 03:39:06 UTC
Hello Radim,
I think that part of the problem might lie in the class
javax.xml.parsers.DocumentBuilder, which is part of the JDK.
The method DocumentBuilder#parse(File) does this:

        String uri = "file:" + f.getAbsolutePath();
        if (File.separatorChar == '\\') {
            uri = uri.replace('\\', '/');
        }

Instead, it should do this:

        String uri = f.toURI().toASCIIString();

At least, this change makes it possible to parse a document containing an
entity include when, as in your case, both the main XML document and the
included entity are located in a path containing non-ASCII chars, such as c:/änt

I have filed a bug with Sun describing what I found.

I am closing this report as INVALID.
Reopen this report only if you can prove that Ant is at fault.
Comment 7 Antoine Levy-Lambert 2005-11-05 04:16:00 UTC
Actually, the fix is here:
http://issues.apache.org/bugzilla/show_bug.cgi?id=34913
This message
http://mail-archives.apache.org/mod_mbox/xml-commons-cvs/200506.mbox/%3c20050603182857.52401.qmail@minotaur.apache.org%3e
shows the changes made by Michael Glavassevich.
A possible workaround for your problem would be to compile all the code in the
file tools/xml-commons-external-src.zip in the Xerces repository and put the
resulting jar file on the bootclasspath.
The real solution is to get Sun to deliver a new JDK with all these fixes.
Comment 8 Antoine Levy-Lambert 2005-11-09 07:29:45 UTC
There is a Sun bug report concerning this issue:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6341770
I have tested this problem against the HEAD revision of Xerces, and it does not
work there either, so I will try to propose a patch.
Comment 9 Antoine Levy-Lambert 2005-11-16 23:23:48 UTC
I changed my mind concerning this issue and decided to fix it in Ant by changing
FileUtils.toURI().
I am copying the way Xerces encodes a user dir in
org.apache.xerces.impl.XMLEntityManager.
FileUtils.toURI() will now encode non-ASCII characters like
java.io.File.toURI().toASCIIString().
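
As a rough idea of what this kind of encoding involves, the sketch below percent-encodes the UTF-8 bytes of a path. It is not the code committed to FileUtils nor the Xerces routine; the class name, the encodePath helper and the set of characters left unescaped are all assumptions chosen for illustration.

```java
import java.nio.charset.StandardCharsets;

public class PathEncoder {

    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    /** Assumed safe set: ASCII letters, digits and a few path punctuation characters. */
    static boolean isSafe(int c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || "-_.~/:".indexOf(c) >= 0;
    }

    /** Percent-encodes every UTF-8 byte of path that is outside the safe set. */
    static String encodePath(String path) {
        StringBuilder sb = new StringBuilder();
        for (byte b : path.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            if (isSafe(c)) {
                sb.append((char) c);
            } else {
                sb.append('%').append(HEX[c >> 4]).append(HEX[c & 0x0F]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "\u00fd" is "ý", as in the reporter's directory name.
        System.out.println("file:" + encodePath("/C:/systinet/divn\u00fd_link/etc/bin/tools.xml"));
    }
}
```

Encoding byte by byte over the UTF-8 form means a single non-ASCII character may become two or more %XX escapes, which matches what File.toURI().toASCIIString() yields for such characters.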
Comment 10 Antoine Levy-Lambert 2005-11-16 23:47:18 UTC
Fixed in CVS. I have added a test case in FileUtilsTest, but I still need to set
up another behavioral test for the xslt task.
Comment 11 Antoine Levy-Lambert 2006-07-13 03:39:53 UTC
*** Bug 36513 has been marked as a duplicate of this bug. ***