Bug 54142 - HTTP Proxy Server throws an exception when path contains "|" character
Summary: HTTP Proxy Server throws an exception when path contains "|" character
Status: RESOLVED FIXED
Alias: None
Product: JMeter - Now in Github
Classification: Unclassified
Component: HTTP
Version: 2.8
Hardware: All All
Importance: P2 major with 1 vote
Target Milestone: ---
Assignee: JMeter issues mailing list
URL:
Keywords:
Depends on: 54482
Blocks:
Reported: 2012-11-13 12:01 UTC by Marek
Modified: 2013-08-07 21:27 UTC

Attachments
Result of recording when exception was raised (6.85 KB, application/octet-stream)
2012-11-13 12:01 UTC, Marek
path to fixing the unwise characters problem (1.67 KB, patch)
2012-12-10 14:43 UTC, Marek
Corrected patch - double encoding problem (2.15 KB, patch)
2012-12-11 10:34 UTC, Marek
Patch for issue fixing unsafe URLs during parsing (1.76 KB, patch)
2013-08-03 22:50 UTC, Philippe Mouawad

Description Marek 2012-11-13 12:01:17 UTC
Created attachment 29595 [details]
Result of recording when exception was raised

I have an application under test that works fine, but while recording with JMeter, when I select certain links, the proxy raises an exception (visible only in the web browser):

java.net.URISyntaxException: Illegal character in path at index 51: http://10.133.47.78:8080/comarch-cm/cm/showAccounts|C|21737.treeGroupNode|G|21731.treeGroupNode|G|21691.?clickContext=tree
	at java.net.URI$Parser.fail(Unknown Source)
	at java.net.URI$Parser.checkChars(Unknown Source)
	at java.net.URI$Parser.parseHierarchical(Unknown Source)
	at java.net.URI$Parser.parse(Unknown Source)
	at java.net.URI.<init>(Unknown Source)
	at java.net.URL.toURI(Unknown Source)
	at org.apache.jmeter.protocol.http.sampler.HTTPHC4Impl.sample(HTTPHC4Impl.java:232)
	at org.apache.jmeter.protocol.http.sampler.HTTPSamplerProxy.sample(HTTPSamplerProxy.java:62)
	at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1075)
	at org.apache.jmeter.protocol.http.proxy.Proxy.run(Proxy.java:212)


I'm including a short version of recording.

Below is the content of problematic GET request from "View Results Tree" attached to "HTTP Proxy Server" (HTTP view):
	Method  	GET
	Protocol	http
	Host    	10.133.47.78
	Port    	8080
	Path    	/comarch-cm/cm/showAccounts|C|21737.treeGroupNode|G|21731.treeGroupNode|G|21691.

	clickContext	tree

It looks like the "|" character is the problem here.
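
The failure can be reproduced outside JMeter. The following is a minimal sketch, not JMeter code: the class name `PipeInPathRepro`, the helper `failsToUri`, and the `example.com` URLs are made up for illustration. It only shows that `java.net.URL.toURI()` (the call at the top of the JMeter frames in the stack trace) rejects an unencoded "|" but accepts "%7C".

```java
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical helper class, for illustration only.
final class PipeInPathRepro {

    /** Returns true if URL.toURI() rejects the given URL string. */
    static boolean failsToUri(String spec) {
        try {
            // java.net.URL itself is lenient and accepts "|" ...
            URL url = new URL(spec);
            // ... but the conversion to java.net.URI validates the characters.
            url.toURI();
            return false;
        } catch (URISyntaxException e) {
            return true;
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Unencoded "|" in the path: prints true
        System.out.println(failsToUri("http://example.com/app/showAccounts|C|1.?clickContext=tree"));
        // Same path with "|" sent as %7C: prints false
        System.out.println(failsToUri("http://example.com/app/showAccounts%7CC%7C1.?clickContext=tree"));
    }
}
```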
Comment 1 Marek 2012-11-13 13:23:09 UTC
OK, I found this: http://tools.ietf.org/html/rfc2396#section-2.4.3
So the "|" character is classified as "unwise"! Does that mean JMeter should treat this character as invalid? I'm not convinced.

Anyway, I will report the problem to the application owners.
Comment 2 Philippe Mouawad 2012-11-17 15:32:46 UTC
"|" seems to be an unsafe character and should have been encoded in your case. See http://www.faqs.org/rfcs/rfc1738.html:
   "Other characters are unsafe because
   gateways and other transport agents are known to sometimes modify
   such characters. These characters are "{", "}", "|", "\", "^", "~",
   "[", "]", and "`".

   All unsafe characters must always be encoded within a URL."
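
As a rough illustration of what "encoded" means for the characters quoted above, here is a small sketch that percent-encodes only that set and leaves everything else, including existing "%" escapes, untouched. The class and method names are hypothetical, and the sketch assumes it is applied to the path part only (brackets are legal in an IPv6 host literal, so it should not be run over a whole URL).

```java
// Hypothetical helper class, for illustration only.
final class UnsafeCharEscaper {

    // The characters listed in the RFC 1738 excerpt quoted above.
    private static final String UNSAFE = "{}|\\^~[]`";

    /** Percent-encodes the quoted "unsafe" characters in a URL path. */
    static String escapeUnsafe(String path) {
        StringBuilder sb = new StringBuilder(path.length());
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            if (UNSAFE.indexOf(c) >= 0) {
                // All characters in UNSAFE are ASCII, so one %XX escape suffices.
                sb.append('%').append(String.format("%02X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

With this, `escapeUnsafe("/showAccounts|C|21737.")` gives `/showAccounts%7CC%7C21737.`, which matches the "%7C" form that other browsers send.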
Comment 3 Philippe Mouawad 2012-11-17 15:53:35 UTC
Java uses RFC 2396 to validate URLs.
Under that RFC, "|" is not valid.
Comment 4 Marek 2012-11-19 09:00:22 UTC
Hi,

I understand that this character is classified as "unwise" ("unsafe") and that new web applications must not use it.
On the other hand, there are some old web applications that use this character in the path (like mine).
IMO JMeter should support old web applications that do not respect this rule. Instead of throwing an exception and making recording of a test case impossible (or extremely complicated), it should do something else, such as:
   - escape the problematic characters (modify the request)
   - or ignore the problem (and add a warning)

I understand that you have a different opinion and that is why you decided to mark the report as "WONTFIX", but can you at least give me a general hint on how and where I can fix it? That way I could create my own private branch of JMeter.
Comment 5 Marek 2012-11-19 10:34:53 UTC
Small update in case someone else encounters the same problem.
This issue is not related to JMeter or the web application.
It is a web browser problem.

I encountered this problem when using Firefox.
When I tried Internet Explorer, the problematic "|" character was replaced with "%7C" and recording of the test was successful.
Comment 6 Philippe Mouawad 2012-11-19 10:38:18 UTC
Thanks for feedback.

Regarding your previous comment, we usually do what you describe and try to help users with workarounds as much as possible.
But in this case I had no idea :-)

Regards
Philippe
Comment 7 Marek 2012-11-19 11:39:15 UTC
Thanks.
Under IE the problem reappears at a later stage, when JavaScript that uses this character kicks in, so I have to find a workaround.
If I find one, I will post it here.
Comment 8 Sebb 2012-11-19 12:07:37 UTC
I can confirm that Firefox fails to encode URLs containing | when entered in the location field. IMO that is a bug in Firefox.

However, Opera, Chrome and IE do encode the "|" character when entered in the location field.

==

The Proxy intercepts the outgoing request, which should already have been encoded.

In general it's not possible to safely re-encode a URL, because the encoding process introduces % characters which themselves need encoding.

For example /| should be presented as /%7C.

If this is re-encoded, it will encode the % again.
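
The double-encoding effect can be seen with the multi-argument `java.net.URI` constructors, which always quote the "%" character. This is only a sketch to illustrate the point; `DoubleEncodingDemo` and `encodePath` are made-up names.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper class, for illustration only.
final class DoubleEncodingDemo {

    /** Encodes a path using the quoting rules of the multi-argument URI constructor. */
    static String encodePath(String path) {
        try {
            // scheme, host and fragment are left out; only the path is quoted.
            return new URI(null, null, path, null).getRawPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String once = encodePath("/|");   // "/%7C"
        String twice = encodePath(once);  // "/%257C": the "%" itself was encoded
        System.out.println(once + " -> " + twice);
    }
}
```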
Comment 9 Marek 2012-12-10 14:43:10 UTC
Created attachment 29741 [details]
path to fixing the unwise characters problem
Comment 10 Marek 2012-12-10 14:48:35 UTC
Hi,

I've attached a patch which fixes this issue.
Please review it and consider committing it to trunk.
The fix is very simple.
The root cause of the problem is the behavior of the java.net.URL.toURI() method, which throws an exception when simply encoding the forbidden/unwise characters would do the job.
The patch doesn't contain new test cases covering the corrected functionality.

BR,

Marek Ruszczak
Comment 11 Marek 2012-12-11 10:34:20 UTC
Created attachment 29742 [details]
Corrected patch - double encoding problem

Hi,

I've noticed that there is a problem with encoding the URL.
For example, Firefox will encode spaces but will not encode unwise characters ("|").
So there is a danger of encoding something that was already encoded. To overcome this problem I first decode the URI (path and query parts) and then encode everything (including unwise characters).
The choice of encoding (UTF-8/ANSI-ASCII/...) is problematic, so someone should definitely review my patch.

I've noticed that this can also be fixed in: org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.getUrl
It might be a better place to fix it, but that is hard for me to evaluate.
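
The decode-then-encode idea described above could look roughly like the sketch below. This is not the attached patch; the names are hypothetical, UTF-8 is an assumption, and only the path is handled. Note that it is not lossless: an escaped reserved character such as %2F is decoded to "/" and not escaped again.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLDecoder;

// Hypothetical helper class, for illustration only.
final class DecodeThenEncode {

    /** Decodes any existing %XX escapes in the path, then encodes the whole path once. */
    static String normalizePath(String rawPath) {
        try {
            // URLDecoder implements form decoding and would turn '+' into a space,
            // so literal plus signs are protected before decoding.
            String decoded = URLDecoder.decode(rawPath.replace("+", "%2B"), "UTF-8");
            // The multi-argument URI constructor quotes illegal characters such as '|' and ' '.
            return new URI(null, null, decoded, null).getRawPath();
        } catch (UnsupportedEncodingException | URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

For a partly encoded path such as `/a%20b|c` this yields `/a%20b%7Cc`, and an already encoded `/%7C` stays `/%7C`.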

BR,

Marek
Comment 12 Sebb 2012-12-11 11:45:59 UTC
The problem only arises when using the Proxy Server.

However the patch affects all usage of the HC4 implementation. I think this is wrong:
- we should not mess with existing test plans using HC4
- it may not work for other HTTP implementations
- it's unnecessary work for most samples

Any patch that is applied needs to be for the Proxy Server only.

But I do agree that decoding and then re-encoding the URL should avoid the double-encoding issue as mentioned in comment 8.

It would be sensible to check if the URL is valid before attempting to fix it, rather than unconditionally "fixing" each URL. This would avoid unnecessary conversions possibly changing the URL.

This would be slightly more work for the invalid case - but the work is done by the Proxy Server, not as part of a test - and is the minimal work required to work round what is a browser bug.
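
A sketch of the "check first, fix only if invalid" approach, under the assumption that the fix-up is done by rebuilding the URI from its components. `LenientUriConverter` and `toUriLenient` are hypothetical names, not existing JMeter APIs.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical helper class, for illustration only.
final class LenientUriConverter {

    /** Converts a URL to a URI, re-quoting its components only if the strict conversion fails. */
    static URI toUriLenient(URL url) {
        try {
            // Valid URLs take this path and are returned unchanged.
            return url.toURI();
        } catch (URISyntaxException e) {
            try {
                // Invalid case only: the multi-argument constructor quotes illegal characters.
                return new URI(url.getProtocol(), url.getUserInfo(), url.getHost(),
                        url.getPort(), url.getPath(), url.getQuery(), url.getRef());
            } catch (URISyntaxException e2) {
                throw new IllegalArgumentException(e2);
            }
        }
    }

    /** Convenience overload for string input. */
    static URI toUriLenient(String spec) {
        try {
            return toUriLenient(new URL(spec));
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

One caveat with this particular fallback: the multi-argument constructor also quotes "%", so a URL that is partly encoded and partly not (for example `/a%20b|c`) would come out as `/a%2520b%7Cc`. Combining it with a decode step would be needed to cover that case.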
Comment 13 Philippe Mouawad 2012-12-17 07:39:21 UTC
You should try to apply your fix in DefaultSamplerCreator#computePath() method.
Comment 14 Philippe Mouawad 2013-02-04 22:04:40 UTC
Fixed as part of 54482 fix
Comment 15 Philippe Mouawad 2013-02-04 22:05:49 UTC
Date: Mon Feb  4 22:05:10 2013
New Revision: 1442395

URL: http://svn.apache.org/viewvc?rev=1442395&view=rev
Log:
Bug 54142 - HTTP Proxy Server throws an exception when path contains "|" character
Bugzilla Id: 54142

Modified:
    jmeter/trunk/xdocs/changes.xml
Comment 16 Marek 2013-02-05 11:32:39 UTC
I've tested it and it works perfectly.
Thanks a lot.

BR,

Marek

PS. I'm not sure if I should change status to VERIFIED/FIXED so I leave it as it is.
Comment 17 Philippe Mouawad 2013-04-11 20:20:13 UTC
The fix introduces regressions with URLs like this one:
http://XXX.XXXX.com/toto_titi_tata/CatalogData/ItemImages\IJ\Items_1152\07_06_015_Na_0_0_0_Aucune_0_Na_Na_Batik_s.jpg
Comment 18 Philippe Mouawad 2013-04-11 20:26:06 UTC
Date: Thu Apr 11 20:24:48 2013
New Revision: 1467074

URL: http://svn.apache.org/r1467074
Log:
Rollback to fix of bugs:
54482- HC fails to follow redirects with non-encoded chars
54293- JMeter rejects html tags '<' in query params as invalid when they are accepted by the browser
54142- HTTP Proxy Server throws an exception when path contains "|" character

Bugzilla Id: 54482,54293,54142

Modified:
    jmeter/trunk/src/protocol/http/org/apache/jmeter/protocol/http/sampler/HTTPHC3Impl.java
    jmeter/trunk/src/protocol/http/org/apache/jmeter/protocol/http/sampler/HTTPHC4Impl.java
    jmeter/trunk/xdocs/changes.xml
Comment 19 Philippe Mouawad 2013-08-03 22:50:40 UTC
Created attachment 30670 [details]
Patch for issue fixing unsafe URLs during parsing

Proposed patch which escapes IllegalURLCharacters if the browser does not.
The idea is to convert the URL to a URI; if that fails, the URL contains unsafe characters, which are then fixed.

My issue with this patch is that it fixes the issues for HttpClient3 and 4 but it also fixes for Java which could break
Comment 20 Sebb 2013-08-07 00:44:14 UTC
(In reply to Philippe Mouawad from comment #19)
> Created attachment 30670 [details]
> Patch for issue fixing unsafe URLs during parsing
> 
> Patch proposition which escapes IllegalURLCharacters if browser does not.
> The idea is to convert to URI the URL, if it fails then it means it contains
> unsafe characters, then they are fixed.

Seems OK. 
The conversion method could be added to ConversionUtils.
It needs some test cases.

> My issue with this patch is that it fixes the issues for HttpClient3 and 4
> but it also fixes for Java which could break

Not sure I follow.

The purpose of this patch is to fix up URLs generated by browsers that don't encode all unsafe characters. It should be equivalent to using a well-behaved browser currently.
Comment 21 Philippe Mouawad 2013-08-07 21:27:19 UTC
Date: Wed Aug  7 21:25:45 2013
New Revision: 1511503

URL: http://svn.apache.org/r1511503
Log:
Bug 54142 - HTTP Proxy Server throws an exception when path contains "|" character
Integrated my patch with a slight change to make current behaviour with Java Impl remain the same as bug only affects HC impls
Bugzilla Id: 54142

Modified:
    jmeter/trunk/src/protocol/http/org/apache/jmeter/protocol/http/proxy/HttpRequestHdr.java
    jmeter/trunk/src/protocol/http/org/apache/jmeter/protocol/http/util/ConversionUtils.java
    jmeter/trunk/xdocs/changes.xml
Comment 22 The ASF infrastructure team 2022-09-24 20:37:52 UTC
This issue has been migrated to GitHub: https://github.com/apache/jmeter/issues/2978