Bug 37888 - access log sampler always encodes url
Summary: access log sampler always encodes url
Status: RESOLVED INVALID
Alias: None
Product: JMeter
Classification: Unclassified
Component: HTTP
Version: 2.1.1
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: JMeter issues mailing list
URL:
Keywords: TestID
Depends on:
Blocks:
 
Reported: 2005-12-13 17:41 UTC by Dhanush
Modified: 2005-12-14 13:18 UTC
0 users



Attachments
sample lines of log (1.47 KB, application/octet-stream)
2005-12-13 18:06 UTC, Dhanush

Description Dhanush 2005-12-13 17:41:32 UTC
I am using the access log sampler to populate the URLs for the tests. JMeter is
encoding all URLs by default. It would be great if it didn't encode the URLs.
Comment 1 peter lin 2005-12-13 17:47:34 UTC
could you provide a few lines of the access logs? It would help debug the issue.
the current implementation of the log parser does not decode the value, so most
likely what is happening is that the parser isn't decoding the URL and the
httpsampler is re-encoding the value, which causes double encoding.

peter
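
The double encoding described above can be sketched in a few lines of Java. This is only an illustration, not JMeter's code: java.net.URLEncoder and URLDecoder stand in for whatever the log parser and the HTTP sampler actually use, and the class name, helper names and the logged value "x%7Cy" are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class; not part of JMeter.
class DoubleEncodingDemo {

    // Form-style percent-encoding of a single parameter value.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Reverses encode(): turns %XX sequences (and '+') back into characters.
    static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed log content: the value is already percent-encoded in the log.
        String logged = "x%7Cy";

        // Parser passes the value through untouched, sampler encodes again:
        // the '%' itself becomes %25, so the server sees a double-encoded value.
        System.out.println(encode(logged));          // x%257Cy

        // Parser decodes first, sampler encodes once: the value round-trips.
        System.out.println(encode(decode(logged)));  // x%7Cy
    }
}
```

Decoding in the parser before the sampler encodes is the behaviour the later patch in this thread is described as adding for parameter values.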
Comment 2 Dhanush 2005-12-13 18:06:22 UTC
Created attachment 17210
sample lines of log

sample lines of log.
Comment 3 peter lin 2005-12-13 18:11:05 UTC
thanks. I'll take a look tonight and try to fix the bug. it should be
straightforward.

peter
Comment 4 peter lin 2005-12-14 08:45:55 UTC
I've checked in a patch for this. I still need to update the GUI, but by default
the accesslog sampler now decodes the value portion of the request parameters.
please give it a try and let me know if that fixes the problem for you.

peter lin
Comment 5 Sebb 2005-12-14 13:31:30 UTC
The nightly build 2-1.20051214 contains the patch.
Comment 6 Dhanush 2005-12-14 17:26:08 UTC
Hi,

Appreciate your help, but the URLs are still encoded. I checked them in the
server's access logs, since the GUI has not been updated, and "|", ":", curly
braces ({}) and probably many more characters are still encoded.

Thanks
Dhanush

(In reply to comment #5)
> The nightly build 2-1.20051214 contains the patch.
Comment 7 peter lin 2005-12-14 17:34:54 UTC
Is it the URL of the requested page or the request parameter values that are
incorrectly encoded?  the patch I checked in decodes the values, but not the URL
path portion. I've always tried to stay away from URL encoding the URL of the page.

peter
Comment 8 Dhanush 2005-12-14 17:45:21 UTC
I am talking about the parameters that are passed on.

For example, if the URL that hits the server should be
http://blah.com/blah.arch?code={464|6784|}

the actual url that hits the server is 
http://blah.com/blah.arch?code=%7B464%7C6784%7C%7D

And all this happens only when I use the access log sampler.

Dhanush
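
The encoding reported above can be reproduced with a short Java sketch. It assumes java.net.URLEncoder as a stand-in encoder; the thread does not say which encoder JMeter's HTTP sampler uses. The class and method names are hypothetical, while the URL and parameter value are the ones from this comment.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class; not part of JMeter.
class ParamEncodingDemo {

    // Percent-encodes one query parameter value (form-style encoding).
    static String encodeValue(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Builds the request URL with only the parameter value encoded.
    static String buildUrl(String base, String name, String value) {
        return base + "?" + name + "=" + encodeValue(value);
    }

    public static void main(String[] args) {
        // '{' -> %7B, '|' -> %7C, '}' -> %7D; digits are left alone.
        System.out.println(buildUrl("http://blah.com/blah.arch", "code", "{464|6784|}"));
        // prints http://blah.com/blah.arch?code=%7B464%7C6784%7C%7D
    }
}
```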

(In reply to comment #7)
> Is it the URL of the requested page or the request parameter values that are
> incorrectly encoded?  the patch I checked in decodes the values, but not the URL
> path portion. I've always tried to stay away from URL encoding the URL of the page.
> 
> peter

Comment 9 peter lin 2005-12-14 17:54:45 UTC
Is {} even allowed in the URL?  I thought most special characters had to be
URL-encoded. See RFC 1738, the URL specification, hosted by the W3C at
http://www.w3.org/Addressing/rfc1738.txt

It states the following:

Unsafe:

   Characters can be unsafe for a number of reasons.  The space
   character is unsafe because significant spaces may disappear and
   insignificant spaces may be introduced when URLs are transcribed or
   typeset or subjected to the treatment of word-processing programs.
   The characters "<" and ">" are unsafe because they are used as the
   delimiters around URLs in free text; the quote mark (""") is used to
   delimit URLs in some systems.  The character "#" is unsafe and should
   always be encoded because it is used in World Wide Web and in other
   systems to delimit a URL from a fragment/anchor identifier that might
   follow it.  The character "%" is unsafe because it is used for
   encodings of other characters.  Other characters are unsafe because
   gateways and other transport agents are known to sometimes modify
   such characters. These characters are "{", "}", "|", "\", "^", "~",
   "[", "]", and "`".


I believe JMeter is handling it correctly, since the specification says curly
braces are not safe.
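
As a rough way to check a parameter value against the "Unsafe" paragraph quoted above, the following Java sketch lists which of those characters a string contains. The character set is transcribed from the quoted RFC 1738 text; the class and method names are hypothetical and nothing here is JMeter code.

```java
// Hypothetical helper; not part of JMeter.
class Rfc1738Unsafe {

    // Characters named in the quoted "Unsafe" paragraph:
    // space < > " # % { } | \ ^ ~ [ ] `
    private static final String UNSAFE = " <>\"#%{}|\\^~[]`";

    static boolean isUnsafe(char c) {
        return UNSAFE.indexOf(c) >= 0;
    }

    // Returns each unsafe character found in value, once, in order of first appearance.
    static String unsafeCharsIn(String value) {
        StringBuilder found = new StringBuilder();
        for (char c : value.toCharArray()) {
            if (isUnsafe(c) && found.indexOf(String.valueOf(c)) < 0) {
                found.append(c);
            }
        }
        return found.toString();
    }

    public static void main(String[] args) {
        // The raw parameter value from comment 8 contains three unsafe characters.
        System.out.println(unsafeCharsIn("{464|6784|}")); // {|}
    }
}
```

Because "%" is itself in the unsafe set, this check is only meaningful on raw values, not on strings that are already percent-encoded.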
Comment 10 Dhanush 2005-12-14 22:06:11 UTC
Thanks Peter. I guess we should change the server and client not to send those
characters. Thanks again

Dhanush

(In reply to comment #9)

Comment 11 peter lin 2005-12-14 22:18:54 UTC
if you don't want to comply with the spec, you could always enhance the access
log sampler to not encode the parameter values :)

of course, that's not a good idea, but it's one way around it.

peter lin