Bug 54482 - HC fails to follow redirects with non-encoded chars
Summary: HC fails to follow redirects with non-encoded chars
Status: RESOLVED FIXED
Alias: None
Product: JMeter - Now in Github
Classification: Unclassified
Component: HTTP
Version: 2.8
Hardware: All All
Importance: P2 normal with 1 vote
Target Milestone: ---
Assignee: JMeter issues mailing list
URL:
Keywords:
Depends on: 54351
Blocks: 54142 54293
Reported: 2013-01-24 21:24 UTC by shmulikk
Modified: 2013-08-08 11:59 UTC



Attachments
jmeter script (7.55 KB, application/octet-stream)
2013-01-24 21:25 UTC, shmulikk
example jsp to reproduce problem (957 bytes, text/plain)
2013-01-24 21:26 UTC, shmulikk
Screenshot of issue fixed (errors are due to HTTP 404 which is fine) (9.21 KB, image/gif)
2013-08-08 05:57 UTC, shmulikk

Description shmulikk 2013-01-24 21:24:33 UTC
When using HttpClient, if a response redirects to a URL containing special characters such as [ and ], the follow-up sample fails with: Non HTTP response code: java.net.URISyntaxException

This doesn't occur with Java HTTP sampler.
Comment 1 shmulikk 2013-01-24 21:25:39 UTC
Created attachment 29894 [details]
jmeter script
Comment 2 shmulikk 2013-01-24 21:26:17 UTC
Created attachment 29895 [details]
example jsp to reproduce problem
Comment 3 Philippe Mouawad 2013-02-02 23:12:40 UTC
This seems to be expected behaviour, from what I understand reading:
- http://www.ietf.org/rfc/rfc3986.txt

   A host identified by an Internet Protocol literal address, version 6
   [RFC3513] or later, is distinguished by enclosing the IP literal
   within square brackets ("[" and "]").  This is the only place where
   square bracket characters are allowed in the URI syntax.  In
   anticipation of future, as-yet-undefined IP literal address formats,
   an implementation may use an optional version flag to indicate such a
   format explicitly rather than rely on heuristic determination.



http://www.ietf.org/rfc/rfc1738.txt

    Unsafe:

    Characters can be unsafe for a number of reasons. The space character is unsafe because significant spaces may disappear and insignificant spaces may be introduced when URLs are transcribed or typeset or subjected to the treatment of word-processing programs. The characters "<" and ">" are unsafe because they are used as the delimiters around URLs in free text; the quote mark (""") is used to delimit URLs in some systems. The character "#" is unsafe and should always be encoded because it is used in World Wide Web and in other systems to delimit a URL from a fragment/anchor identifier that might follow it. The character "%" is unsafe because it is used for encodings of other characters. Other characters are unsafe because gateways and other transport agents are known to sometimes modify such characters. These characters are "{", "}", "|", "\", "^", "~", "[", "]", and "`".

    All unsafe characters must always be encoded within a URL. For example, the character "#" must be encoded within URLs even in systems that do not normally deal with fragment or anchor identifiers, so that if the URL is copied into another system that does use them, it will not be necessary to change the URL encoding.



HC4 accepts this kind of URL:
- http://stackoverflow.com/search?q=square+brackets+[url]

But not one where the brackets appear before the '?'.
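
The query-versus-path difference can be reproduced with plain java.net.URI, which is presumably where the URISyntaxException originates. A minimal sketch; the class name, the helper and the localhost path are illustrative assumptions, not JMeter code:

```java
import java.net.URI;
import java.net.URISyntaxException;

final class BracketCheck {
    /** Returns true if java.net.URI can parse the given string. */
    static boolean parses(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }
}
```

BracketCheck.parses("http://stackoverflow.com/search?q=square+brackets+[url]") returns true, while a hypothetical URL with brackets in the path, such as "http://localhost:8080/a[1]/b", is rejected.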
Comment 4 shmulikk 2013-02-03 08:24:56 UTC
Hi Philippe,
I think you misread the text (or I did).

The first link describes IPv6 while I think we still commonly use IPv4.

The second link explicitly says: "All unsafe characters must always be encoded within a URL".

The third point is that the Java implementation encodes these characters (as you can see from the example test plan).

Thanks for looking into that.
Comment 5 Philippe Mouawad 2013-02-03 21:06:38 UTC
Issue similar to 54142
Comment 6 Philippe Mouawad 2013-02-03 22:26:20 UTC
Date: Sun Feb  3 22:24:08 2013
New Revision: 1441978

URL: http://svn.apache.org/viewvc?rev=1441978&view=rev
Log:
Bug 54482 - HC fails to follow redirects with non-encoded chars
Bugzilla Id: 54482

Modified:
    jmeter/trunk/src/protocol/http/org/apache/jmeter/protocol/http/sampler/HTTPHC4Impl.java
    jmeter/trunk/xdocs/changes.xml
Comment 7 Philippe Mouawad 2013-04-11 20:07:10 UTC
Date: Mon Feb  4 20:19:36 2013
New Revision: 1442330

URL: http://svn.apache.org/viewvc?rev=1442330&view=rev
Log:
Bug 54482 - HC fails to follow redirects with non-encoded chars
Apply fix to HTTPHC3Impl
Factor out sanitize code in ConversionUtils
Bugzilla Id: 54482

Modified:
    jmeter/trunk/src/protocol/http/org/apache/jmeter/protocol/http/sampler/HTTPHC3Impl.java
    jmeter/trunk/src/protocol/http/org/apache/jmeter/protocol/http/sampler/HTTPHC4Impl.java
    jmeter/trunk/src/protocol/http/org/apache/jmeter/protocol/http/util/ConversionUtils.java
Comment 8 Philippe Mouawad 2013-04-11 20:17:33 UTC
The fix introduces regressions on this kind of URL:
http://XXX.XXXX.com/toto_titi_tata/CatalogData/ItemImages\IJ\Items_1152\07_06_015_Na_0_0_0_Aucune_0_Na_Na_Batik_s.jpg
Comment 9 Philippe Mouawad 2013-04-11 20:25:49 UTC
Date: Thu Apr 11 20:24:48 2013
New Revision: 1467074

URL: http://svn.apache.org/r1467074
Log:
Rollback to fix of bugs:
54482- HC fails to follow redirects with non-encoded chars
54293- JMeter rejects html tags '<' in query params as invalid when they are accepted by the browser
54142- HTTP Proxy Server throws an exception when path contains "|" character

Bugzilla Id: 54482,54293,54142

Modified:
    jmeter/trunk/src/protocol/http/org/apache/jmeter/protocol/http/sampler/HTTPHC3Impl.java
    jmeter/trunk/src/protocol/http/org/apache/jmeter/protocol/http/sampler/HTTPHC4Impl.java
    jmeter/trunk/xdocs/changes.xml
Comment 10 Philippe Mouawad 2013-08-03 23:11:17 UTC
It should work now with Nightly build and HttpClient 4.

Could you give it a try ?
Thanks
Comment 11 shmulikk 2013-08-04 08:00:46 UTC
Hi,
Using r1509934, I still get this error with both HC3.1 and HC4.
HC3.1 shows:
Response code: Non HTTP response code: java.lang.IllegalArgumentException
Response message: Non HTTP response message: Invalid uri 'http://localhost:8080/?[]!@#$%^&*()': Invalid query


HC4 shows:
Response code: Non HTTP response code: java.net.URISyntaxException
Response message: Non HTTP response message: Malformed escape pair at index 29: http://localhost:8080/?[]!@#$%^&*()

Java works.
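
The HC4 message above can be reproduced outside JMeter with java.net.URI alone. A minimal sketch (class and method names are illustrative): the '%' at index 29 is not followed by two hexadecimal digits, so the parser reports a malformed escape pair.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class EscapePairCheck {
    /** Returns the index at which java.net.URI rejects s, or -1 if s parses. */
    static int errorIndex(String s) {
        try {
            new URI(s);
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }
}
```

EscapePairCheck.errorIndex("http://localhost:8080/?[]!@#$%^&*()") returns 29, matching the message reported for HC4.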
Comment 12 Philippe Mouawad 2013-08-06 21:31:38 UTC
Date: Tue Aug  6 21:30:55 2013
New Revision: 1511125

URL: http://svn.apache.org/r1511125
Log:
Bug 54482 - HC fails to follow redirects with non-encoded chars
Bugzilla Id: 54482

Modified:
    jmeter/trunk/src/protocol/http/org/apache/jmeter/protocol/http/sampler/HTTPHC3Impl.java
    jmeter/trunk/src/protocol/http/org/apache/jmeter/protocol/http/sampler/HTTPHC4Impl.java
    jmeter/trunk/xdocs/changes.xml
Comment 13 Sebb 2013-08-06 22:42:26 UTC
ConversionUtils.sanitizeUrl() only works properly for URLs that are not currently encoded.

Encoded URLs - i.e. ones that contain %xx - are re-encoded to %25xx.
This is obviously wrong.

If a browser encodes some characters and not others, then either one has to unencode and re-encode, or one has to just encode the special characters.
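
The double-encoding effect is easy to show with the multi-argument java.net.URI constructor, which quotes illegal characters and always quotes '%'. This is only a sketch of the mechanism, assuming the sanitizing step rebuilds the URI from its components; the class name and the localhost URL are illustrative:

```java
import java.net.URI;
import java.net.URISyntaxException;

final class DoubleEncodingDemo {
    /** Rebuilds a URI from components; illegal path characters (and '%') get quoted. */
    static String build(String path) {
        try {
            return new URI("http", null, "localhost", 8080, path, null, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

An unencoded path such as "/a b" becomes "/a%20b" as intended, but an already encoded "/a%20b" becomes "/a%2520b".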
Comment 14 Sebb 2013-08-07 13:47:01 UTC
(In reply to Sebb from comment #13)
> ConversionUtils.sanitizeUrl() only works properly for URLs that are not
> currently encoded.
>
> Encoded URLs - i.e. ones that contain %xx - are re-encoded to %25xx.
> This is obviously wrong.

Note: % is allowed to occur in a query, so it is not double-encoded if present in the query string.
 
> If a browser encodes some characters and not others, then either one has to
> unencode and re-encode, or one has to just encode the special characters.
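
One way to avoid re-encoding valid URLs is to try a strict parse first and only rebuild the URI when that fails. The following is a hedged sketch of that idea, not the code committed to ConversionUtils; the class and method names are hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

final class SanitizeSketch {
    /** Hypothetical helper: keeps valid URLs untouched, encodes only invalid ones. */
    @SuppressWarnings("deprecation") // new URL(String) is deprecated on recent JDKs
    static String sanitize(String spec) {
        try {
            URL url = new URL(spec); // lenient: does not validate characters
            try {
                // Strict parse succeeded: assume the URL is already encoded
                return url.toURI().toString();
            } catch (URISyntaxException e) {
                // Strict parse failed: rebuild from components so that
                // illegal characters are quoted
                return new URI(url.getProtocol(), url.getUserInfo(), url.getHost(),
                        url.getPort(), url.getPath(), url.getQuery(),
                        url.getRef()).toString();
            }
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

With this approach "http://localhost:8080/a%20b" is returned unchanged, and "http://localhost:8080/a[1]/b" becomes "http://localhost:8080/a%5B1%5D/b". A URL that mixes encoded and raw characters still has its '%' re-quoted by the fallback, so this does not cover the partially encoded case described above.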
Comment 15 Philippe Mouawad 2013-08-07 21:16:57 UTC
Date: Wed Aug  7 21:05:29 2013
New Revision: 1511488

URL: http://svn.apache.org/r1511488
Log:
Bug 54482 - HC fails to follow redirects with non-encoded chars
Fix as per sebb comment
Bugzilla Id: 54482

Modified:
    jmeter/trunk/src/protocol/http/org/apache/jmeter/protocol/http/util/ConversionUtils.java

Date: Wed Aug  7 21:14:42 2013
New Revision: 1511500

URL: http://svn.apache.org/r1511500
Log:
Bug 54482 - HC fails to follow redirects with non-encoded chars
Oups take into account new exceptions
Make error message more complete
Bugzilla Id: 54482

Modified:
    jmeter/trunk/src/protocol/http/org/apache/jmeter/protocol/http/sampler/HTTPHC3Impl.java
    jmeter/trunk/src/protocol/http/org/apache/jmeter/protocol/http/sampler/HTTPHC4Impl.java
Comment 16 Philippe Mouawad 2013-08-07 21:19:52 UTC
Reading my comment from 2013-02-02 23:12:40 UTC, I wonder why I started working on this issue :-) 

Because:
'All unsafe characters must always be encoded within a URL' states that the kind of URL to which you are redirecting is invalid according to the RFC.


Anyway, @Shmulikk, could you give it a try now?

@Sebb, should we add a property to enable this URL fixing and make the old behaviour the default? I am afraid of side effects.
Comment 17 Philippe Mouawad 2013-08-07 21:31:47 UTC
Date: Wed Aug  7 21:30:54 2013
New Revision: 1511504

URL: http://svn.apache.org/r1511504
Log:
Bug 54482 - HC fails to follow redirects with non-encoded chars
Check for null path
Bugzilla Id: 54482

Modified:
    jmeter/trunk/src/protocol/http/org/apache/jmeter/protocol/http/util/ConversionUtils.java
Comment 18 Philippe Mouawad 2013-08-07 21:33:38 UTC
Date: Wed Aug  7 21:32:41 2013
New Revision: 1511506

URL: http://svn.apache.org/r1511506
Log:
Bug 54482 - HC fails to follow redirects with non-encoded chars
Add some tests
Bugzilla Id: 54482

Modified:
    jmeter/trunk/test/src/org/apache/jmeter/protocol/http/util/TestHTTPUtils.java
Comment 19 shmulikk 2013-08-08 05:55:17 UTC
Philippe, you should read the comment I made on 2013-02-03 08:24:56 UTC.
Basically, the case here is that applications sometimes do not follow the rules/specifications.
To support such applications (I think JMeter should support broken applications too, as testers will sometimes be stuck with them), we talked about the need to simply encode redirect URLs when they need to be encoded according to the rules.

Regarding the fix, I have tested it and it works now.
It behaves a bit differently from the Java HTTP sampler, but it is OK now.
I have attached another screenshot showing that with HC the URL is encoded as expected, while with Java (I assume) the URL is encoded but the GUI shows the decoded string.
Comment 20 shmulikk 2013-08-08 05:57:52 UTC
Created attachment 30714 [details]
Screenshot of issue fixed (errors are due to HTTP 404 which is fine)
Comment 21 shmulikk 2013-08-08 06:01:04 UTC
Sorry, my bad - I think this issue is not fixed yet.
As you can see the encoded string is broken.
It should be: "?%5B%5D!%40%23%24%25%5E%26*()"
But it is: "?%5B%5D%21%40"
Comment 22 Philippe Mouawad 2013-08-08 06:56:55 UTC
Do you have the same results with HC3.1 and HC4?
Comment 23 shmulikk 2013-08-08 06:59:28 UTC
Yes, as shown in the attached screenshot.
Comment 24 Philippe Mouawad 2013-08-08 09:37:30 UTC
Hello,
In fact it is a feature, remember this bug you submitted:
- https://issues.apache.org/bugzilla/show_bug.cgi?id=54351

# starts a fragment, so everything after it is stripped. This is OK for JMeter, as the URL will be requested without the fragment, which is only used on the client side by the browser to scroll to an identifier.
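
The truncated string can be illustrated as follows. This is a sketch only: java.net.URLEncoder is used as a stand-in for whatever encoding JMeter applies, chosen because its output matches the string reported in comment 21, and the class and method names are illustrative.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class FragmentDemo {
    /** Drops the first '#' and everything after it. */
    static String stripFragment(String url) {
        int hash = url.indexOf('#');
        return hash < 0 ? url : url.substring(0, hash);
    }

    /** Encodes the query part that remains once the fragment is removed. */
    static String encodeQuery(String url) {
        String noFragment = stripFragment(url);
        int q = noFragment.indexOf('?');
        if (q < 0) {
            return "";
        }
        try {
            return URLEncoder.encode(noFragment.substring(q + 1), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }
}
```

For "http://localhost:8080/?[]!@#$%^&*()" the fragment-free URL is "http://localhost:8080/?[]!@" and the encoded query is "%5B%5D%21%40", so nothing after '@' is sent.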
Comment 25 Philippe Mouawad 2013-08-08 09:42:39 UTC
Date: Thu Aug  8 09:41:51 2013
New Revision: 1511654

URL: http://svn.apache.org/r1511654
Log:
Bug 54482 - HC fails to follow redirects with non-encoded chars
Clarify javadocs
Bugzilla Id: 54482

Modified:
    jmeter/trunk/src/protocol/http/org/apache/jme
Comment 26 shmulikk 2013-08-08 11:59:08 UTC
Correct. Cool!
Thanks.
Comment 27 The ASF infrastructure team 2022-09-24 20:37:52 UTC
This issue has been migrated to GitHub: https://github.com/apache/jmeter/issues/3048