Summary: | Escaped ampersand characters are not unescaped when URL's are visited | ||
---|---|---|---|
Product: | JMeter - Now in Github | Reporter: | Jonathan Morace <jmorace> |
Component: | Main | Assignee: | JMeter issues mailing list <issues> |
Status: | RESOLVED FIXED | ||
Severity: | normal | ||
Priority: | P3 | ||
Version: | 2.2 | ||
Target Milestone: | --- | ||
Hardware: | PC | ||
OS: | Windows XP | ||
Attachments: | Patch for using & instead of &amp; in URL |
Description
Jonathan Morace
2006-09-22 21:24:52 UTC
Created attachment 19643
Patch for using & instead of &amp; in URL
Attached is a patch against
svn.apache.org/repos/asf/jakarta/jmeter/branches/rel-2-2 as of today.
It replaces every &amp; in the URL of an embedded resource with &.
This is needed to be able to test valid XHTML pages, because XHTML uses
&amp; in the href attribute of the a tag. But when the browser, or in this case
JMeter, uses that URL, it must use & and not &amp;.
I think it is important to fix this, since more and more sites are using proper XHTML. The suggested patch seems unproblematic to me. Do you think the suggested patch is wrong, or are you afraid of side effects? Any input on the patch, or suggestions on how to solve this, are welcome.

It does not seem like it will have side effects, as it only applies to downloadable resources. However, I'm not sure that this is the correct place to fix the problem - it seems odd to be encoding spaces yet decoding ampersands. I think it should probably be fixed where the URLs are extracted.

If you try the following HTML in your browser:

<html>
<head> <title>test</title> </head>
<body>
<p>A test <a href="http://www.google.com/test?somekey=some value&amp;someotherkey=some%20value%20indeed">link</a></p>
</body>
</html>

and click the link, you will see that the browser then tries to fetch this URL:

http://www.google.com/test?somekey=some%20value&someotherkey=some%20value%20indeed

This is the URL you will see in the "Address" field in the browser.

If you want to fix this where the URLs are extracted, do you mean in the HTMLParser class? That is, perhaps add a protected method in HTMLParser, "protected String getExecutableUrl(String anUrl)", which all of the subclasses of HTMLParser should call before putting a URL into the URLCollection they are building? The "getExecutableUrl" method (I wish I had a better name) would then take care of encoding spaces and decoding "&amp;" HTML entities. I guess the encoding of spaces is done just to support the incorrect HTML that a lot of people write. Decoding "&amp;" HTML entities is the correct thing to do, in my opinion. Or do you want to change URLCollection or URLString? Advice is welcome.

Since the embedded URL list is actually returned as URLs, rather than strings, it seems to me that the decoding from the href (etc.) attributes needs to be done before the URL is created. So yes, I think it needs to be done at the parsing stage. This would also potentially allow for different decoding depending on the document type (HTML, XHTML). I'll patch this shortly.

This issue has been migrated to GitHub: https://github.com/apache/jmeter/issues/1796
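For readers following the thread, here is a minimal sketch of the kind of helper proposed in the comments: decode "&amp;" and encode literal spaces before the string is turned into a URL. The class name UrlAttributeDecoder is hypothetical, and the method is written as a static utility rather than a protected method on HTMLParser so that it can run on its own. It is not the fix that was committed to JMeter, and it handles only the "&amp;" entity, not numeric or other named character references.

```java
// Hypothetical standalone illustration; not part of JMeter.
final class UrlAttributeDecoder {

    private UrlAttributeDecoder() {
    }

    /**
     * Turns the raw value of an href/src attribute into the string a
     * browser would actually request. Name borrowed from the thread.
     */
    static String getExecutableUrl(String attributeValue) {
        if (attributeValue == null) {
            return null;
        }
        // Decode the HTML entity for an ampersand (the only entity handled here).
        String url = attributeValue.replace("&amp;", "&");
        // Encode literal spaces, which are tolerated in sloppy HTML but are
        // not valid inside a URL.
        return url.replace(" ", "%20");
    }

    public static void main(String[] args) {
        String href = "http://www.google.com/test?somekey=some value"
                + "&amp;someotherkey=some%20value%20indeed";
        // Prints the same URL the browser fetches in the example above.
        System.out.println(getExecutableUrl(href));
    }
}
```

Doing this on the attribute string, before a java.net.URL is constructed, matches the point made in the last comment: once the value is a URL object, the entity has already become part of the query string.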