144058 – URL.getFile() != URL.getFile() in 6.5m1 and 6.5beta

Status:	RESOLVED INVALID

Alias:	None

Product:	platform
Classification:	Unclassified
Component:	Module System
Version:	6.x
Hardware:	PC Linux

Importance:	P2 blocker with 1 vote
Assignee:	Jesse Glick

URL:
Keywords:

Depends on:
Blocks:

Reported:	2008-08-15 13:27 UTC by mgoe
Modified:	2008-12-22 13:51 UTC
CC List:	0 users

See Also:
Issue Type:	DEFECT
Exception Reporter:

Attachments
Source code for the example (364 bytes, text/plain) 2008-08-19 08:30 UTC, mgoe
Jar file for the example (981 bytes, application/octet-stream) 2008-08-19 08:32 UTC, mgoe

Description mgoe 2008-08-15 13:27:21 UTC

In an application based on the NetBeans Platform I found an incompatible change while switching from NetBeans
6.5_m1-200807040101 to 6.5beta-200808111757.

The following code:

URL url = getClass().getProtectionDomain().getCodeSource().getLocation();
String jarFileName = url.getFile();

gives different results when using the 6.5m1 platform and the 6.5beta platform.

With the 6.5m1 platform, jarFileName contains "/home/xxx/platform/build/cluster/modules/xxx-name.jar" (which I think is
correct), while the 6.5beta platform gives "file:/home/xxx/platform/build/cluster/modules/xxx-name.jar!/".

It is very annoying that each change of the platform breaks compatibility (there were similar problems while switching
from 6.0 to 6.1, see bug 129772). I have no idea what is wrong with the code, unless some undocumented
NetBeans classloader magic is involved.

Comment 1 Jesse Glick 2008-08-18 23:33:27 UTC

The result in 6.5 beta is correct. The full URL should be

jar:file:/home/xxx/platform/build/cluster/modules/xxx-name.jar!/

Expecting URL.getFile to produce a file path is an error. (Consider e.g. "/tmp/some thing.jar" vs.
"file:/tmp/some%20thing.jar".) What you probably meant to do is call FileUtil.archiveOrDirForURL.

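As a rough illustration of the space example in Comment 1, the sketch below builds a file-protocol URL for a path containing a space and prints what URL.getFile() returns. The class and method names are hypothetical, and the path is only the one quoted above; the file does not need to exist.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

final class GetFileWithSpaceDemo {

    /** Returns URL.getFile() for the file-protocol URL that File.toURI() yields. */
    static String urlFilePart(File file) {
        try {
            URL url = file.toURI().toURL();
            return url.getFile();
        } catch (MalformedURLException e) {
            // Not expected for a file: URI; rethrown unchecked to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical path taken from the comment above; the file need not exist.
        File original = new File("/tmp/some thing.jar");
        String filePart = urlFilePart(original);
        // The space comes back percent-encoded as %20.
        System.out.println(filePart);
        // So the naive conversion names a different file than the original.
        System.out.println(new File(filePart).equals(original.getAbsoluteFile()));
    }
}
```

The second line printed should be false, since new File(url.getFile()) keeps the "%20" literally instead of decoding it.
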
Comment 2 mgoe 2008-08-19 08:29:39 UTC

Sorry, but you are wrong!

I just created a small self-contained example (without NetBeans) which shows that:

URL url = getClass().getProtectionDomain().getCodeSource().getLocation();
url.getFile();

returns a file path (as it did with 6.5m1). I tested with every JDK version from 1.4 to 1.7.
Please run the attached example:

java -jar jartest.jar

Does NetBeans perform some classloader magic again?

Best regards,
Martin

Comment 3 mgoe 2008-08-19 08:30:47 UTC

Created attachment 67763
Source code for the example

Comment 4 mgoe 2008-08-19 08:32:02 UTC

Created attachment 67764
Jar file for the example

Comment 5 Jesse Glick 2008-08-20 18:43:19 UTC

Yes, the JRE's AppClassLoader uses an incorrect URL for JAR entries in the classpath (missing the "jar:" prefix and "!/"
suffix, and thus not denoting a folder URL); I doubt this will ever be fixed, for "compatibility" reasons. For directory
entries it is correct in both the JRE and NB (e.g. "file:/project/classes/"). Special launchers such as JNLP use other
URL protocols like http; again the JRE implementation does not consistently produce the correct 'jar'-protocol URL when
asked for a code source. (URLClassLoader accepts code source URLs with or without the 'jar'-protocol wrapper,
automatically inserting it if missing.)

In any case, your code is wrong even in a non-NB application for the reasons I gave: URL.file does not always correspond
to File.absolutePath. In fact on Windows it *never* corresponds, due to the different path separator, and the use of a
drive letter. On Unix it usually does, unless there are URL metacharacters in the file path. UNC paths are an added
complication. File(URI) and File.toURI are the only correct way to interconvert Files and URLs. (File.toURL is wrong,
which is why it is now deprecated.)
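
A minimal sketch of the File.toURI / File(URI) round trip recommended above, assuming a file name with characters that need escaping in a URL. The class name, helper names and path are hypothetical; the file does not need to exist.

```java
import java.io.File;
import java.net.URI;

final class FileUriRoundTrip {

    /** File -> URI: produces an absolute file: URI with unsafe characters percent-encoded. */
    static URI toUri(File file) {
        return file.toURI();
    }

    /** URI -> File: decodes the percent-encoding and applies the platform's path syntax. */
    static File toFile(URI uri) {
        return new File(uri);
    }

    public static void main(String[] args) {
        // Hypothetical name containing a space and a '#', both of which are URL metacharacters.
        File original = new File("some dir/a b#c.jar").getAbsoluteFile();
        URI uri = toUri(original);
        System.out.println(uri);
        // The round trip yields the same File, with no manual handling of spaces or separators.
        System.out.println(toFile(uri).equals(original));
    }
}
```

For a URL obtained elsewhere, url.toURI() gives the URI to pass to new File(URI), provided the URL uses the file protocol.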

Comment 6 mgoe 2008-08-21 09:11:25 UTC

I just tried the following which should work if I understood your explanation correctly:

URL url = getClass().getProtectionDomain().getCodeSource().getLocation();
try {
    File jarFile = new File(url.toURI());
    String jarFileName = jarFile.getAbsolutePath();
} catch (URISyntaxException ex) {
    ex.printStackTrace();
}

It works fine in a non-NetBeans application but throws an IllegalArgumentException when the code is used from a
NetBeans application.

java.lang.IllegalArgumentException: URI is not hierarchical
	at java.io.File.<init>(Unknown Source)

In the non-NetBeans application (which works) the URI contains:
file:/home/xxx/jartest.jar

In the NetBeans application (which does not work) the URI contains:
jar:file:/home/xxx/xxx.jar!/
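
The difference between the two URIs can be reproduced without NetBeans. The following is a small sketch (class and method names are hypothetical) that passes both forms reported above to new File(URI). A jar-protocol URI is opaque, meaning the part after "jar:" is not a hierarchical path, which is what File(URI) rejects.

```java
import java.io.File;
import java.net.URI;

final class OpaqueJarUriDemo {

    /** Returns the message of the exception raised by File(URI), or null if the conversion succeeds. */
    static String tryConvert(String uriText) {
        try {
            new File(URI.create(uriText));
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Plain file-protocol URI, as reported for the non-NetBeans case: accepted.
        System.out.println(tryConvert("file:/home/xxx/jartest.jar"));
        // jar-protocol URI, as reported for the NetBeans case: rejected as not hierarchical.
        System.out.println(tryConvert("jar:file:/home/xxx/xxx.jar!/"));
        System.out.println(URI.create("jar:file:/home/xxx/xxx.jar!/").isOpaque());
    }
}
```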

Comment 7 Jesse Glick 2008-08-22 02:31:07 UTC

Which is why FileUtil.archiveOrDirForURL (mentioned previously) is useful - it accepts either the incorrect JRE-type or
correct NB-style URL and produces the right File. In case this was not clear before:

file:/home/xxx/jartest.jar

refers to the raw byte contents of the JAR file (with no consideration for the fact that it is a ZIP at all);

jar:file:/home/xxx/jartest.jar!/

is a directory URL which cannot be opened as such but which can be suffixed with the name of a desired ZIP entry, e.g.

jar:file:/home/xxx/jartest.jar!/META-INF/MANIFEST.MF

which refers to the contents of that ZIP entry. A code source ought to be associated with a directory URL (i.e. URL
ending in "/"); the URL for an actual item loaded from that code source is formed by appending the URL encoding of the
Java resource path of the item to the code source URL.
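
The appending rule described above can be sketched as follows. This is only an illustration with hypothetical class and method names; it assumes the resource path passed in is already URL-encoded, and it only constructs the URL without opening it, so the JAR does not need to exist.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

final class JarEntryUrlDemo {

    /** Forms the URL of an item by appending its resource path to a directory-style code source URL. */
    static URL entryUrl(String codeSourceUrl, String encodedResourcePath) {
        if (!codeSourceUrl.endsWith("/")) {
            // A code source that is not a directory URL cannot simply be suffixed.
            throw new IllegalArgumentException("Not a directory URL: " + codeSourceUrl);
        }
        try {
            return URI.create(codeSourceUrl + encodedResourcePath).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // JAR code source in the form NetBeans reports it.
        System.out.println(entryUrl("jar:file:/home/xxx/jartest.jar!/", "META-INF/MANIFEST.MF"));
        // Directory code source; the class name here is made up for the example.
        System.out.println(entryUrl("file:/project/classes/", "com/example/Main.class"));
    }
}
```

With "file:/home/xxx/jartest.jar" as the code source, the same append would not produce a usable URL, which is the inconsistency discussed in this thread.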

Comment 8 anba 2008-08-22 06:18:18 UTC

NetBeans' approach to correcting the problem with URL.getFile may break third-party library code. If a third-party
library uses this method and has implemented a workaround for its problems, that code will not work in NetBeans. The NetBeans
implementation means that you cannot use third-party libraries that use URL.getFile. The authors of these libraries may
not want to change their code, since it would then no longer work in non-NetBeans Java applications.
Software such as NetBeans should never change the behavior of JRE methods.

Comment 9 Jesse Glick 2008-08-22 06:48:35 UTC

new File(url.getFile()) is broken code anyway. Any library which tried to do this would produce nonsensical results when
run from JNLP, for example, or when run from a directory with a path component containing a space.

Comment 10 anba 2008-08-22 07:06:51 UTC

I don't think the JDK version is broken. It does exactly what the javadoc says:

    Gets the file name of this URL. The returned file portion will be the same as getPath(), plus the concatenation of
    the value of getQuery(), if any. If there is no query portion, this method and getPath() will return identical results.

and getPath says:

Gets the path part of this URL. 

It does not state that it returns an absolute path. It also does not state that it returns a valid path or a path you
can use. This could not be the case, because for an http URL no valid file path exists. It only says that it
returns the path portion as a string, and that's what the JDK version does. The NetBeans version breaks the contract.
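
For reference, the documented relationship between getFile(), getPath() and getQuery() can be checked with a short sketch. The class name and the example.com address are hypothetical; nothing is fetched from the network.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

final class GetFileVsGetPathDemo {

    /** Parses the text into a URL, rethrowing the checked exception unchecked to keep the sketch short. */
    static URL parse(String text) {
        try {
            return URI.create(text).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        URL withQuery = parse("http://example.com/docs/index.html?lang=en");
        System.out.println(withQuery.getPath());   // path only
        System.out.println(withQuery.getQuery());  // query only
        System.out.println(withQuery.getFile());   // path, then "?", then query

        URL withoutQuery = parse("http://example.com/docs/index.html");
        // Without a query, getFile() and getPath() return identical results.
        System.out.println(withoutQuery.getFile().equals(withoutQuery.getPath()));
    }
}
```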

BTW: who says that anybody wants to use this method the way mgoe uses it? Maybe a programmer uses it to build their own
URLs in a user-defined function, or simply to print it in an error message. Such code would be broken by this change.

To make my point completely clear:
I think it is a major error for a general-purpose platform such as NetBeans to change classes and methods of the
underlying standard library, because this breaks code.
It is a design error of Java that Sun allows this for every class by simply implementing a custom class loader.

I do not have a problem if you do this in your IDE but it should never be done in the platform.

Comment 11 Jesse Glick 2008-08-22 18:33:38 UTC

No, for a JAR URL "jar:file:/tmp/x.jar!/" the path portion of the URL is "file:/tmp/x.jar!/", which is what is returned
when you call URL.getFile(). The URL class is behaving normally and NetBeans is not making any changes to its behavior.
What is wrong is your assumption that URL.file is somehow meaningful as a java.io.File path - on occasion it can be, but
in general it will not. If you want to extract a File from a URL starting with "jar:file", you need to (1) strip "jar:"
and "!/", (2) call new File(URI); the FileUtil method mentioned before is the convenient way to do both at once, with
some additional error checking and also handling of directory URLs.
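
The claim about the path portion of a JAR URL can be verified with plain JDK classes, independent of any NetBeans class loader. A minimal sketch, using the URL from the paragraph above and hypothetical class and method names; the JAR is never opened, so it does not need to exist.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

final class JarUrlPartsDemo {

    /** Parses the text into a URL, rethrowing the checked exception unchecked to keep the sketch short. */
    static URL parse(String text) {
        try {
            return URI.create(text).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        URL jarUrl = parse("jar:file:/tmp/x.jar!/");
        System.out.println(jarUrl.getProtocol()); // the outer protocol is "jar"
        // Everything after "jar:" is the file part, including the nested file: URL and "!/".
        System.out.println(jarUrl.getFile());

        URL fileUrl = parse("file:/tmp/x.jar");
        System.out.println(fileUrl.getProtocol());
        System.out.println(fileUrl.getFile());
    }
}
```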

The NetBeans class loader is behaving within its contract - if you ask for the code source of a class, you get a URL to
which you can append a resource path visible to that class loader, and open the URL to retrieve that resource. Most
class loaders behave correctly in this respect as well, with the sole exception of URLClassLoader when given
"file:/tmp/x.jar" in its constructor (as e.g. the Sun JRE will do when the regular Java launcher is run with a JAR on
the classpath). URLClassLoader produces the correct code source for a directory CP entry (e.g. "file:/tmp/classes/"),
and will also accept the correct "jar:file:/tmp/x.jar!/" in its constructor and return that as the code source.
URLClassLoader does implement getResource correctly, producing a jar-protocol URL; its flaw is to do the translation
from the wrong input only in getResource and not also in codeSource. On the other hand, CodeSource.getLocation really
doesn't specify anything about what permissible values could be.

Comment 12 anba 2008-08-25 06:38:08 UTC

I am not really convinced. Can you please tell me what a programmer should do if the code must work inside and outside
NetBeans? If the Sun JRE is consistent (and it seems to be for URLs created by
getClass().getProtectionDomain().getCodeSource().getLocation()) then the programmer must write a conversion for spaces
and for Windows (see the problems you (jglick) mentioned in previous messages). Then he tries to use the same code
inside NetBeans: it compiles, but suddenly it does not work anymore. There are several libraries not created for
NetBeans, and they do not even know (and need not know) that in this case URL.getFile produces a different result.
For those programmers, using the FileUtil method is not an option (because it only exists in NetBeans), and it would not
even occur to them that they must convert the result in a different way, as you (jglick) described in your previous message.

Comment 13 Jesse Glick 2008-08-25 16:12:44 UTC

To restate - you do not need special code to handle spaces or Windows paths. You simply need to use new File(URI) on a
file-protocol URL. new File(URL.getPath()) has always been incorrect code, independent of NetBeans.

FileUtil.archiveOrDirForURL is just a convenience method of a few lines which detects and correctly handles various URL
protocols. Libraries not written for NetBeans which need to translate a codebase URL to a classpath-style File can and
should include equivalent logic.

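// Requires java.io.File, java.net.URI and java.net.URL.
// Returns the JAR file or directory denoted by the URL, or null for any other protocol.
public static File archiveOrDirForURL(URL entry) {
    String u = entry.toString();
    if (u.startsWith("jar:file:") && u.endsWith("!/")) {
        // Strip the "jar:" prefix (4 characters) and the "!/" suffix (2 characters).
        return new File(URI.create(u.substring(4, u.length() - 2)));
    } else if (u.startsWith("file:")) {
        // Plain file-protocol URL: a directory, or a JAR given without the jar: wrapper.
        return new File(URI.create(u));
    } else {
        return null;
    }
}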