50379 – get task does not use last part of url when redirected

Status:	NEW

Alias:	None

Product:	Ant
Classification:	Unclassified
Component:	Core tasks
Version:	1.8.2
Hardware:	PC Windows XP

Importance:	P2 normal
Target Milestone:	---
Assignee:	Ant Notifications List

URL:
Keywords:

Depends on:
Blocks:

Reported:	2010-11-30 05:42 UTC by Michael Osipov
Modified:	2010-12-27 11:11 UTC
CC List:	0 users

Description Michael Osipov 2010-11-30 05:42:49 UTC

Say you have this redirect url: https://repository.apache.org/service/local/artifact/maven/redirect?r=releases&g=org.apache.geronimo.ext.tomcat&a=tomcat-parent-6.0.26&v=6.0.26.0&e=tar.gz&c=source-release

It is redirected to: https://repository.apache.org/content/repositories/releases/org/apache/geronimo/ext/tomcat/tomcat-parent-6.0.26/6.0.26.0/tomcat-parent-6.0.26-6.0.26.0-source-release.tar.gz

The documentation of this task says: The destination file name uses the last part of the path of the source URL unless you also specify a mapper.

This is ignored in the redirect case. The output name is "redirect" instead of the file name at the end of the redirected URL.
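
For illustration, a minimal sketch of a build fragment showing the behavior described above, followed by one way to sidestep it by naming the destination file explicitly. The `downloads` directory is an assumption and does not come from the original report; note that `&` has to be written as `&amp;` inside an XML attribute.

```xml
<!-- Assumed target directory; not part of the original report. -->
<mkdir dir="downloads"/>

<!-- dest is a directory, so the name is taken from the last path
     segment of the source URL: the file ends up as downloads/redirect -->
<get src="https://repository.apache.org/service/local/artifact/maven/redirect?r=releases&amp;g=org.apache.geronimo.ext.tomcat&amp;a=tomcat-parent-6.0.26&amp;v=6.0.26.0&amp;e=tar.gz&amp;c=source-release"
     dest="downloads"/>

<!-- dest is a file, so the name is fixed by the build script
     regardless of where the server redirects to -->
<get src="https://repository.apache.org/service/local/artifact/maven/redirect?r=releases&amp;g=org.apache.geronimo.ext.tomcat&amp;a=tomcat-parent-6.0.26&amp;v=6.0.26.0&amp;e=tar.gz&amp;c=source-release"
     dest="downloads/tomcat-parent-6.0.26-6.0.26.0-source-release.tar.gz"/>
```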

Comment 1 J.M. (Martijn) Kruithof 2010-12-01 14:15:15 UTC

The source URL is not the URL which is redirected to. The source URL is the URL provided, and the last part of its path happens to be "redirect" in your case. Generally, everything after the ? is treated as request parameters and is not part of the path.

The URL which is redirected to is generally called the target URL.

Note it would be unsafe to actually use anything a remote system provides as "filename" as part of the URL after a redirect.
http://www.example.com/c:\windows\regedit.exe is an example of a valid URL and c:\windows\regedit.exe will be the last part of the URL and hence the filename.

Comment 2 Michael Osipov 2010-12-01 14:26:11 UTC

Martijn,
doesn't this risk exist without redirection too? It is always the responsibility of the script writer to verify the sources he's downloading from. Redirecting services such as the one in Nexus have a legitimate reason to exist.

Comment 3 J.M. (Martijn) Kruithof 2010-12-01 17:09:12 UTC

Hi Michael

If you have explicitly specified the URL, this risk does not exist, as the user has specified the URL and therefore knows (or at least could know) what file name will be used. If you have built up the collection of URLs by crawling a remote site, then yes, you had better verify them before using them this way.
The following example

<get dest="downloads">
  <url url="http://ant.apache.org/index.html"/> 
  <url url="http://ant.apache.org/faq.html"/>
</get>

will always store the files as downloads/index.html and downloads/faq.html, even if we decided to move those files and serve them through a redirect to URLs such as http://ant.apache.org/manual?page=index and http://ant.apache.org/manual?page=faq.

If we derived the name from the redirect target, both files would be stored as downloads/manual.

So yes, without redirection the risk exists if a user builds up a collection of URLs by, for instance, crawling a remote site, but not when the user explicitly states the resources (s)he wants to retrieve. Renaming the download according to the redirect would expose these users to a risk they currently do not face, and would break builds if someone redirects in a way the script does not expect.

Comment 4 Michael Osipov 2010-12-02 05:23:35 UTC

Martijn, in your example there would be only one manual file, because the second download would overwrite the first. This is more harmful than my basic idea. The old behavior could be retained, and an additional parameter could be introduced which uses the last part of the redirected URL as the file name.
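
For comparison, the mapper mentioned in the task documentation already lets a build script choose the file name itself. The following is only a sketch under assumptions: the `downloads` directory is hypothetical, and `mergemapper` is used because it maps any source name to the fixed `to` value, so the name does not depend on the redirect target at all. It is not the parameter proposed above.

```xml
<!-- Assumed target directory; not part of the original report. -->
<mkdir dir="downloads"/>

<get dest="downloads">
  <url url="https://repository.apache.org/service/local/artifact/maven/redirect?r=releases&amp;g=org.apache.geronimo.ext.tomcat&amp;a=tomcat-parent-6.0.26&amp;v=6.0.26.0&amp;e=tar.gz&amp;c=source-release"/>
  <!-- mergemapper ignores the source name and always yields "to" -->
  <mergemapper to="tomcat-parent-6.0.26-6.0.26.0-source-release.tar.gz"/>
</get>
```

Because `mergemapper` gives every source the same name, this form only makes sense with a single nested URL; with several URLs the downloads would overwrite each other, as in the downloads/manual case discussed above.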