Bug 47433 - Issue with "get" task to download a redirected/moved URL (301/302)
Status: REOPENED
Alias: None
Product: Ant
Classification: Unclassified
Component: Core tasks
Version: 1.8.2
Hardware: All All
Importance: P2 enhancement
Target Milestone: ---
Assignee: Ant Notifications List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-06-25 18:11 UTC by Jagadesh Munta
Modified: 2014-01-06 15:11 UTC

Attachments
get task that seizes control of redirection itself allowing a redirect from HTTP to HTTPS (12.13 KB, patch)
2009-06-29 13:57 UTC, J.M. (Martijn) Kruithof

Description Jagadesh Munta 2009-06-25 18:11:39 UTC
Email thread sent to the dev@ant.apache.org alias.
Jagadesh Babu Munta wrote:
> Changes again, as highlighted (I missed a couple of lines in the earlier email):
>  
>             int responseCode = httpConnection.getResponseCode();
>             log("Response Code="+responseCode, logLevel);
>             // test for 401 result (HTTP only)
>             if (responseCode == HttpURLConnection.HTTP_UNAUTHORIZED)  {
>                 String message = "HTTP Authorization failure";
>                 if (ignoreErrors) {
>                     log(message, logLevel);
>                     return false;
>                 } else {
>                     throw new BuildException(message);
>                 }
>             } else if ((responseCode == HttpURLConnection.HTTP_MOVED_PERM) ||
>                     (responseCode == HttpURLConnection.HTTP_MOVED_TEMP)) {
>                 String newLocation = httpConnection.getHeaderField("Location");
>                 String message = "HTTP URL Moved to "+newLocation;
>                 log(message, logLevel);
>                 setSrc(new URL(newLocation));
>                 execute();
>             }
>
>         }
>
>
> Jagadesh Babu Munta wrote:
>> Hi,
>>
>> I am trying to use the Ant "get" task to download from a redirected or moved URL, "http://hudson-ci.org/latest/hudson.war",
>> but it doesn't work. It simply saves the "moved" HTML page rather than the content at the redirect target. I verified this with versions 1.7.1 and 1.6.5.
>>
>> Is there a workaround?
>>
>> I looked at the Get.java task source code and found that it does not handle the 301/302 HTTP response codes.
>>
>> I added a few lines (highlighted) to Get.java, and with that change the download worked fine.
>>
>> Can someone help me get the fix into the Ant code? Or has this already been fixed, or is there a workaround for the latest (1.7.1) or 1.6.5 Ant releases?
>> (I joined this list only today. Sorry if someone has already discussed this issue.)
>>
>>             if (responseCode == HttpURLConnection.HTTP_UNAUTHORIZED)  {
>>                 String message = "HTTP Authorization failure";
>>                 if (ignoreErrors) {
>>                     log(message, logLevel);
>>                     return false;
>>                 } else {
>>                     throw new BuildException(message);
>>                 }
>>             } else if ((responseCode == HttpURLConnection.HTTP_MOVED_PERM) ||
>>                     (responseCode == HttpURLConnection.HTTP_MOVED_TEMP)) {
>>                 String newLocation = httpConnection.getHeaderField("Location");
>>                 String message = "HTTP URL Moved to "+newLocation;
>>                 log(message, logLevel);
>>                 setSrc(new URL(newLocation));
>>                 execute();
>>             }
>>
>>         }
>>
>>
>>
>>
>> Snapshot (problematic):-
>> ------------
>>
>>
>>      [get] Getting: http://hudson-ci.org/latest/hudson.war
>>       [get] To: /Users/munta/runtests/v3/appserver-sqe/build/pe/i386_dhcp-usca14-133-126.SFBay.Sun.COM_Darwin/hudson/archive/hudson.war
>>       [get] ..
>>       [get] last modified = Wed Dec 31 16:00:00 PST 1969 - using current time instead
>>
>> BUILD SUCCESSFUL
>> Total time: 2 seconds
>> [dhcp-usca14-133-126:v3/appserver-sqe/hudson] munta% cat /Users/munta/runtests/v3/appserver-sqe/build/pe/i386_dhcp-usca14-133-126.SFBay.Sun.COM_Darwin/hudson/archive/hudson.war
>> <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
>> <HTML><HEAD>
>> <TITLE>302 Found</TITLE>
>> </HEAD><BODY>
>> <H1>Found</H1>
>> The document has moved <A HREF="https://hudson.dev.java.net/files/documents/2402/137470/hudson.war">here</A>.<P>
>> <HR>
>> <ADDRESS>Apache/1.3.33 Server at hudson-ci.org Port 80</ADDRESS>
>> </BODY></HTML>
>> [dhcp-usca14-133-126:v3/appserver-sqe/hudson] munta%
>>
>> ---
>>
>>
>> Snapshot (worked fine with above code change)
>> ---------
>>
>>       [get] Getting: http://hudson-ci.org/latest/hudson.war
>>       [get] To: /Users/munta/runtests/v3/appserver-sqe/build/pe/i386_dhcp-usca14-133-126.SFBay.Sun.COM_Darwin/hudson/archive/hudson.war
>>       [get] Response Code=302
>>       [get] HTTP URL Moved to https://hudson.dev.java.net/files/documents/2402/137470/hudson.war
>>       [get] Getting: https://hudson.dev.java.net/files/documents/2402/137470/hudson.war
>>       [get] To: /Users/munta/runtests/v3/appserver-sqe/build/pe/i386_dhcp-usca14-133-126.SFBay.Sun.COM_Darwin/hudson/archive/hudson.war
>>       [get] Response Code=200
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] ....................................................
>>       [get] .............................
>>       [get] last modified = Tue Jun 23 13:57:19 PDT 2009
>>       [get] ..
>>       [get] last modified = Wed Dec 31 16:00:00 PST 1969 - using current time instead
>>
>> BUILD SUCCESSFUL
>> Total time: 39 seconds
>>
>>
>> Thanks for your time and help.
>> -- Jagadesh
Comment 1 Jagadesh Munta 2009-06-26 18:39:10 UTC
The earlier code (calling execute() after setting the new URL) had an issue on Linux, so I modified it to call the doGet() method instead.

Below is the latest code (last line changed).

            int responseCode = httpConnection.getResponseCode();
            log("Response Code="+responseCode, logLevel);
            // test for 401 result (HTTP only)
            if (responseCode == HttpURLConnection.HTTP_UNAUTHORIZED)  {
                String message = "HTTP Authorization failure";
                if (ignoreErrors) {
                    log(message, logLevel);
                    return false;
                } else {
                    throw new BuildException(message);
                }
            } else if ((responseCode == HttpURLConnection.HTTP_MOVED_PERM) ||
                    (responseCode == HttpURLConnection.HTTP_MOVED_TEMP)) {
                String newLocation = httpConnection.getHeaderField("Location");
                String message = "HTTP URL Moved to "+newLocation;
                log(message, logLevel);
                setSrc(new URL(newLocation));
                return doGet(logLevel, progress);
            }
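
The recursive call above re-issues the request for each redirect with no upper bound, so two URLs that redirect to each other would never terminate. A minimal sketch of a loop guard follows. It assumes a hypothetical helper class RedirectLoopGuard and an assumed limit of 25 hops; neither is taken from Get.java, and the nextLocation function merely stands in for "send the request and read the Location header".

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.UnaryOperator;

/** Hypothetical helper (not part of Ant): bounds a chain of redirects. */
final class RedirectLoopGuard {
    /** Assumed limit; the bug report does not name one. */
    static final int MAX_REDIRECTS = 25;

    /**
     * Follows redirects until nextLocation returns null, meaning the
     * current URL answered with something other than a redirect.
     */
    static String follow(String start, UnaryOperator<String> nextLocation) {
        Set<String> visited = new LinkedHashSet<>();
        String current = start;
        while (true) {
            // Seeing the same URL twice means the chain loops back on itself.
            if (!visited.add(current)) {
                throw new IllegalStateException("Redirect loop detected at " + current);
            }
            // visited holds the start URL plus one entry per redirect followed.
            if (visited.size() > MAX_REDIRECTS + 1) {
                throw new IllegalStateException("More than " + MAX_REDIRECTS + " redirects");
            }
            String next = nextLocation.apply(current);
            if (next == null) {
                return current;
            }
            current = next;
        }
    }
}
```

With a lookup table standing in for the network, RedirectLoopGuard.follow("http://a.example/f", table::get) returns the last URL in the chain, and throws IllegalStateException if the table sends the request back to a URL it has already visited.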
Comment 2 J.M. (Martijn) Kruithof 2009-06-27 04:53:51 UTC
I have adapted the patch to avoid redirect loops and similar problems, and I refactored the code to reach a cleaner solution.

Tests show, however, that redirects are followed by default. This is also the default setting of HttpURLConnection.

While researching why the redirect does not work for the Hudson download, I found that HttpURLConnection does not switch protocols. The Sun Java networking engineers concluded that switching protocols would be unsafe; this is documented in:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4620571

Before I go ahead and apply the changes, I would like to ask whether we want this behavior in Ant, even though the Sun networking engineers concluded it would be unsafe.

For this reason, I would personally tend to say no to the patch with the protocol change. I could imagine allowing a switch from http to https while disallowing a change in the other direction; switches to yet other protocols pose completely new questions.
If such a change is deemed safe, it should probably be made in HttpURLConnection itself.
Comment 3 Stefan Bodewig 2009-06-29 02:44:34 UTC
I agree with Martijn.

If <get> were to follow a redirect that Java wouldn't follow on its own, it shouldn't do so unless the user explicitly asks for it (i.e. via a new attribute that enables it).
Comment 4 Jagadesh Munta 2009-06-29 11:05:24 UTC
Thanks, Martijn and Stefan, for looking into this so quickly.

I looked at the JDK bug, "http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4620571",
and at the end it says: "Check response code and Location header field value for redirect information. It's the application's responsibility to follow the redirect."

I think Ant should take care of this itself for the issue above, because Ant is an application. Having a solution to the problem is better than having none.

I agree that it is better to make this new behavior an optional attribute, so that by default the task simply follows HttpURLConnection, and the user can enable the attribute on demand (when there is no security concern).
Comment 5 J.M. (Martijn) Kruithof 2009-06-29 13:15:56 UTC
If it is not safe in the platform, it is not safe in an application that cannot, and will not, pop up an "are you sure?" prompt (and even with such a prompt it would probably still not be safe).

Doing this from an application instead of from the Java platform does not make it any safer; it just makes following the redirect the application's responsibility.

If it is unsafe to switch protocols on a redirect, we should not do this unless we understand the implications.
Comment 6 J.M. (Martijn) Kruithof 2009-06-29 13:57:23 UTC
Created attachment 23901
get task that seizes control of redirection itself allowing a redirect from HTTP to HTTPS
Comment 7 J.M. (Martijn) Kruithof 2009-06-29 14:03:08 UTC
D:\data\eclipseworkspace\ant-trunk\testje>..\dist\bin\ant -file testget.xml
Buildfile: D:\data\eclipseworkspace\ant-trunk\testje\testget.xml

bug:
      [get] Getting: http://hudson-ci.org/latest/hudson.war
      [get] To: D:\data\eclipseworkspace\ant-trunk\testje\dl2.html
      [get] http://hudson-ci.org/latest/hudson.war moved to https://hudson.dev.java.net/files/documents/2402/137602/hudson.war
      [get] Getting: http://kruithof.xs4all.nl/uuid/uuidgen
      [get] To: D:\data\eclipseworkspace\ant-trunk\testje\dl.html
      [get] http://kruithof.xs4all.nl/uuid/uuidgen permanently moved to http://www.famkruithof.net/uuid/uuidgen

Is there any way we can set up redirects on our web server so that we can add official test cases?
Comment 8 Jagadesh Munta 2009-06-29 15:32:50 UTC
I am not sure who the administrator of ant.apache.org is, and I don't know where the tests live either ;)
If you are looking for information on how to set up redirects, see
http://httpd.apache.org/docs/1.3/mod/mod_alias.html
There is also a forum question with a sample:
http://www.webmasterworld.com/forum92/151.htm
Comment 9 J.M. (Martijn) Kruithof 2009-06-29 21:00:43 UTC
Jagadesh,

Could you inquire within Sun why HttpURLConnection does not allow redirects between protocols in general, and from http to https (in that direction) in particular?

If it is not safe, we still should not do it.
And if it is safe, why shouldn't it be done in HttpURLConnection instead of within Ant?
Comment 10 Jagadesh Munta 2009-06-29 22:24:50 UTC
Martijn,
OK. I will inquire about HttpURLConnection and post an update once I have the details.

As for Ant, I still think it is an application with respect to the Java platform, and Ant is used differently from Java in general, where security-sensitive applications might be developed. With Ant, I think the usage is under the users' control.

Thanks.
Comment 11 Jagadesh Munta 2009-07-07 11:15:45 UTC
Martijn,

The following are the responses that I got from JDK folks:- 

--- 

Michael McMahon wrote:
> Another problem is the API itself.
>
> HttpsURLConnection is a sub-class of HttpURLConnection and when you open
> a URL, you get one or the other depending on the original URL type. But you can't change
> the object's type, after a redirect.
>
> So, the only way to do redirection between http and https is to disable automatic handling
> and do it manually in the application.
>
> - Michael.
>
> Jean-Christophe Collet wrote:
>> Off the top of my head (it was a long time ago), the HTTP specs used for HttpURLConnection at the time specified that automatic redirection should only occur when the protocol stayed the same, i.e. from http to http, or https to https.
>> This is pretty obvious in the case of forbidding https to http redirection, a bit less so the other way around.
>> Basically, that kind of change should not be transparent to the user and/or application.
>> So the decision to allow this or not had to be deferred to the application.
>>
>> After that, it's a matter of backward compatibility.

-----
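
The manual handling Michael describes could look roughly like the sketch below. It is not the code from the attached patch; ManualRedirect, open and the maxRedirects parameter are hypothetical names chosen for illustration. It turns off automatic handling with setInstanceFollowRedirects(false), reads the Location header, and opens a fresh connection for each hop, so an http URL can hand over to an https one because every hop gets its own connection object.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLConnection;

/** Hypothetical helper (not part of Ant): follows redirects by hand. */
final class ManualRedirect {
    /** True for the redirect codes discussed here (301, 302) and for 303. */
    static boolean isRedirect(int code) {
        return code == HttpURLConnection.HTTP_MOVED_PERM
            || code == HttpURLConnection.HTTP_MOVED_TEMP
            || code == HttpURLConnection.HTTP_SEE_OTHER;
    }

    /** Resolves a Location value, which may be relative, against the current URI. */
    static URI resolve(URI current, String location) {
        return current.resolve(location);
    }

    /** Opens the resource, following at most maxRedirects redirects. */
    static InputStream open(URI start, int maxRedirects) throws IOException {
        URI current = start;
        for (int hop = 0; hop <= maxRedirects; hop++) {
            URLConnection conn = current.toURL().openConnection();
            if (!(conn instanceof HttpURLConnection)) {
                return conn.getInputStream(); // not HTTP(S): nothing to follow
            }
            HttpURLConnection http = (HttpURLConnection) conn;
            http.setInstanceFollowRedirects(false); // take over redirect handling
            if (!isRedirect(http.getResponseCode())) {
                return http.getInputStream();
            }
            String location = http.getHeaderField("Location");
            http.disconnect();
            if (location == null) {
                throw new IOException("Redirect without Location header from " + current);
            }
            current = resolve(current, location);
        }
        throw new IOException("More than " + maxRedirects + " redirects starting at " + start);
    }
}
```

As written, the sketch follows whatever target the server names. Given the safety concerns raised in this thread, a real task would check the target's protocol before opening the next connection.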
Comment 12 J.M. (Martijn) Kruithof 2009-07-10 10:38:10 UTC
I have created and applied a patch to allow a redirect from http to https, but no other protocol switches.
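
The rule described here (http to https is allowed, other protocol switches are not) can be sketched as a small predicate. This is an illustration under assumptions, not the code committed in the revision; RedirectPolicy and isAllowed are hypothetical names, and treating a redirect that keeps the same protocol as allowed is an assumption based on the default behavior discussed earlier in the thread.

```java
import java.util.Locale;

/** Hypothetical helper (not part of Ant): decides whether a redirect may be followed. */
final class RedirectPolicy {
    /**
     * Allows redirects that keep the protocol, plus the single switch
     * from http to https. Every other protocol switch is refused.
     */
    static boolean isAllowed(String fromProtocol, String toProtocol) {
        String from = fromProtocol.toLowerCase(Locale.ROOT);
        String to = toProtocol.toLowerCase(Locale.ROOT);
        if (from.equals(to)) {
            return true; // no protocol switch at all
        }
        return from.equals("http") && to.equals("https");
    }
}
```

For example, RedirectPolicy.isAllowed("http", "https") is true, while isAllowed("https", "http") and isAllowed("http", "ftp") are both false.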
Comment 13 J.M. (Martijn) Kruithof 2009-07-10 10:38:55 UTC
svn revision: http://svn.apache.org/viewvc?rev=793048&view=rev
Comment 14 John Mc Quillan 2013-08-20 18:58:28 UTC
I am seeing this problem on 1.8.2.

I got a redirect and the file was not downloaded; instead, this message was stored in the file:

If you are not automatically redirected use this url: https://repository.sonatype.org/service/local/repositories/central-proxy/content/org/apache/ivy/ivy/2.3.0/ivy-2.3.0.jar
Comment 15 Stefan Bodewig 2014-01-06 15:11:52 UTC
John, could this be bug 54374 rather than this one?  Any chance you could give 1.9.x a try?