Bug 58777 - URLs containing a ? are assumed to be query strings
Summary: URLs containing a ? are assumed to be query strings
Status: RESOLVED FIXED
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_rewrite (show other bugs)
Version: 2.4-HEAD
Hardware: PC Linux
: P2 major (vote)
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords: FixedInTrunk
Depends on:
Blocks:
 
Reported: 2015-12-28 22:13 UTC by Ben RUBSON
Modified: 2016-03-16 08:16 UTC (History)
1 user (show)



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Ben RUBSON 2015-12-28 22:13:34 UTC
Hello,

I'm facing the same issue as described in bug #24155.
https://bz.apache.org/bugzilla/show_bug.cgi?id=24155

This bug has been closed as WONTFIX asking to open a new bug report if still exists in 2.2.
It still exists in 2.2 and 2.4.

I tested the quick fix Matthew Wilcox attached to bug #24155, it works perfectly.
There are only 4 or 5 lines of code to add to mod_rewrite.c

Could we think about adding it to Apache 2.4 please ?

Thank you vey much,

Best regards,

Ben
Comment 1 Yann Ylavic 2015-12-29 01:10:20 UTC
Looks like [L,qdiscard] would work as the proposed [L,nosplit], provided no further rule needs the query-string (hence the L flag).

BTW, I guess the [nosplit] flag comes from https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=204508, the patch is missing in bug 24155.
Comment 2 Ben RUBSON 2015-12-30 10:50:08 UTC
I just made a quick test with qsdiscard.
I used the following very simple rule :
RewriteEngine on
RewriteRule ^(.*)$ /usr/local/www/apache24/data/$1 [L,qsdiscard]

On server, the test directory I try to reach :
/usr/local/www/apache24/data/foo?bar/

In access-log :
[30/Dec/2015:11:40:39 +0100] "GET /foo%3fbar/ HTTP/1.1" 404 206

In error-log :
AH00128: File does not exist: /usr/local/www/apache24/data/foo

Conclusion :
the ?.* part in simply ignored / removed from the URI.

Nosplit :
With nosplit, splitout_queryargs() function (in mod_rewrite.c) is not called so that we are sure ?.* remains in the URI.

Note :
Yes you're right, the nosplit flag comes from https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=204508, which is linked in bug 24155.
Comment 3 Alexey Melezhik 2016-02-02 17:24:55 UTC
Hi this swat test suite for this issue:

vagrant@Debian-jessie-amd64-netboot:~/my/apache-swat$ swat -t 58777
/home/vagrant/.swat/.cache/19580/prove/58777/foo.txt/00.GET.t ......
ok 1 - GET 127.0.0.1/58777/foo.txt succeeded
# http headers saved to /home/vagrant/.swat/.cache/19580/prove/HuwLhjBM87.hdr
# body saved to /home/vagrant/.swat/.cache/19580/prove/HuwLhjBM87
ok 2 - output match '200 OK'
ok 3 - output match 'foo data'
1..3
ok
/home/vagrant/.swat/.cache/19580/prove/58777/foo?bar.txt/00.GET.t ..
not ok 1 - GET 127.0.0.1/58777/foo?bar.txt succeeded

#   Failed test 'GET 127.0.0.1/58777/foo?bar.txt succeeded'
#   at /usr/local/share/perl/5.20.2/swat.pm line 103.
# curl -X GET -k --connect-timeout 20 -m 20 -L -f -D /home/vagrant/.swat/.cache/19580/prove/gzeDbgx04v.hdr -o /home/vagrant/.swat/.cache/19580/prove/gzeDbgx04v --stderr /home/vagrant/.swat/.cache/19580/prove/gzeDbgx04v.stderr '127.0.0.1/58777/foo?bar.txt'
# stderr:
#   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
#                                  Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
# curl: (22) The requested URL returned error: 404 Not Found
# http headers:
# http body:
# can't continue here due to unsuccessfull http status code
1..1
# Looks like you failed 1 test of 1.
# Looks like your test exited with 1 just after 1.
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/1 subtests

Test Summary Report
-------------------
/home/vagrant/.swat/.cache/19580/prove/58777/foo?bar.txt/00.GET.t (Wstat: 256 Tests: 1 Failed: 1)
  Failed test:  1
  Non-zero exit status: 1
Files=2, Tests=4,  1 wallclock secs ( 0.03 usr  0.00 sys +  0.10 cusr  0.00 csys =  0.13 CPU)
Result: FAIL


Apache-swat - https://github.com/melezhik/apache-swat - easy way to reproduce some existed apache issues on _your_ environment
Comment 4 Ben RUBSON 2016-03-08 09:31:09 UTC
Hello,

Any news regarding this issue ?
Could we think about merging the very small patch from https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=204508 ?

Thank you very much,

Ben
Comment 5 Eric Covener 2016-03-08 13:59:08 UTC
(In reply to Ben RUBSON from comment #4)
> Hello,
> 
> Any news regarding this issue ?
> Could we think about merging the very small patch from
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=204508 ?
> 
> Thank you very much,
> 
> Ben

What about splitting on the rightmost question mark instead of the leftmost?  That wouldn't need yet another flag.
Comment 6 Ben RUBSON 2016-03-08 14:24:53 UTC
We could have several question marks in a filename, without any QUERY_STRING attached to the URI.
So splitting on the rightmost question mark would not work here.

I think a safe manner would be to split before urldecode :
- filename question marks would then be encoded to %3F, so would not be caught ;
- the QUERY_STRING question mark, a clear ?, would be the character expected to split.
With this method, no additional flag required. However I don't know whether or not mod_rewrite has access to the request before urldecode.

If it's not possible, sounds like nosplit flag is the solution.
Comment 7 Eric Covener 2016-03-08 14:36:28 UTC
(In reply to Ben RUBSON from comment #6)
> We could have several question marks in a filename, without any QUERY_STRING
> attached to the URI.

> So splitting on the rightmost question mark would not work here.

In mod_rewrite, you can always append a final question mark to signify no query string.  I think this has a nice relationship to the requested behavior.

> 
> I think a safe manner would be to split before urldecode :
> - filename question marks would then be encoded to %3F, so would not be
> caught ;
> - the QUERY_STRING question mark, a clear ?, would be the character expected
> to split.
> With this method, no additional flag required. However I don't know whether
> or not mod_rewrite has access to the request before urldecode.

Unfortunately, the splitting happens on the substitution itself, which can have any permutation of encoded/unencoded/captured/ and escaped/not-escaped characters.   Currently the escaping here is really problematic (B, NE, etc and not taking the URL context or the source of the capture into account -- very hairy)
Comment 8 Ben RUBSON 2016-03-08 14:47:14 UTC
(In reply to Eric Covener from comment #7)
> In mod_rewrite, you can always append a final question mark to signify no
> query string.  I think this has a nice relationship to the requested
> behavior.

Which rule would you use to append a final question mark without making mod_rewrite split on the one just before ? Do you have an example in mind ?
Then if this can work, splitting on the rightmost question mark could save this issue yes.
Comment 9 Eric Covener 2016-03-08 15:26:07 UTC
(In reply to Ben RUBSON from comment #8)
> (In reply to Eric Covener from comment #7)
> > In mod_rewrite, you can always append a final question mark to signify no
> > query string.  I think this has a nice relationship to the requested
> > behavior.
> 
> Which rule would you use to append a final question mark without making
> mod_rewrite split on the one just before ? Do you have an example in mind ?
> Then if this can work, splitting on the rightmost question mark could save
> this issue yes.

I don't know what you mean by rule. The idea is after a change to mod_rewrite, Someone affected by this issue that had a rule that substituted a question mark and didn't have a query string would add a ? to the end of their substitution.

But now I am favoring a flag, because of the relatively few people affected here I wouldn't want to create new pain by doing any other heuristic by default.
Comment 10 Ben RUBSON 2016-03-08 15:41:01 UTC
> I don't know what you mean by rule. The idea is after a change to
> mod_rewrite, Someone affected by this issue that had a rule that substituted
> a question mark and didn't have a query string would add a ? to the end of
> their substitution.

Understood, but how would this ? added to the end of the substitution ?
mod_rewrite is not able to distinguish between filename question marks and query string question marks. So without a configuration item (AddFinalQuestionMark <yes|no>), or a new flag, or a new RewriteRule (but which one, this was my question), I don't see how to deal with this issue.

> But now I am favoring a flag, because of the relatively few people affected
> here I wouldn't want to create new pain by doing any other heuristic by
> default.

Flag sounds perfect too yes :)
Comment 11 Eric Covener 2016-03-08 15:42:32 UTC
(In reply to Ben RUBSON from comment #10)
> > I don't know what you mean by rule. The idea is after a change to
> > mod_rewrite, Someone affected by this issue that had a rule that substituted
> > a question mark and didn't have a query string would add a ? to the end of
> > their substitution.
> 
> Understood, but how would this ? added to the end of the substitution ?
> mod_rewrite is not able to distinguish between filename question marks and
> query string question marks. So without a configuration item
> (AddFinalQuestionMark <yes|no>), or a new flag, or a new RewriteRule (but
> which one, this was my question), I don't see how to deal with this issue.

The affected user would specify it, to opt-in to the prior ? being interpreted as a iflename character. It wouldn't be added for them.
Comment 12 Ben RUBSON 2016-03-08 15:46:43 UTC
OK Eric, thank you very much for clarification !

So up to you for implementation method.
The new flag is really quick to implement, patch from https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=204508 works as expected (with really minor change for 2.4, goal being to skip splitout_queryargs method).

Thank you !

Ben
Comment 13 Eric Covener 2016-03-09 15:08:25 UTC
http://svn.apache.org/viewvc?rev=1734125&view=rev
Comment 14 Ben RUBSON 2016-03-09 15:23:19 UTC
Eric, thank you very much !

From the documentation : "If no query string is used in the substitution, a question mark can be appended to it in combination with this flag."

How do you append a question mark to the substitution ?

I don't see how to do it without a RewriteRule (for example "RewriteRule ^(/.*)$ $1?"), so without taking the risk to split the URI, if it contains question marks.
Comment 15 Eric Covener 2016-03-09 15:29:41 UTC
(In reply to Ben RUBSON from comment #14)
> Eric, thank you very much !
> 
> From the documentation : "If no query string is used in the substitution, a
> question mark can be appended to it in combination with this flag."
> 
> How do you append a question mark to the substitution ?
> 
> I don't see how to do it without a RewriteRule (for example "RewriteRule
> ^(/.*)$ $1?"), so without taking the risk to split the URI, if it contains
> question marks.

I'm not following. For simplicity, Let's talk only about rewrites where the user is concerned they might be applied to a target with a question mark in the filename.

The user necessarily has a RewriteRule already.The new flag already only applies to a RewriteRule (if you add the flag) that necessarily has a substitution (2nd argument)

If that substitution sets a new query string, no action is required beyond adding the flag, because the added query string will be the right-most match.

If that rule doesn't currently set a query string, you can either append "?" and add the QSA flag or append "?%{QUERY_STRING}" which ensures a right-most match.

This extra stuff is to avoid a solution that only works when there is no actual query string involved, which is my issue with the older proposal.
Comment 16 Ben RUBSON 2016-03-09 16:30:25 UTC
OK Eric, understood.

I made some tests, it works adding a final ? to the substitution, and using qslast flag.

However, I use RewriteRules which do not modify the request : I use RewriteMap (my RewriteMap prg: does not modify the request) and finally proxy redirection (P flag).
Adding a final ? to the "substitution" modifies the user request sent to the proxy (by one char only yes), and as my requests are signed, the authentication process behind the proxy fails...

Any clue to do it without modifying the request ?

Thank you very much Eric !

Ben
Comment 17 Eric Covener 2016-03-09 18:28:52 UTC
> Adding a final ? to the "substitution" modifies the user request sent to the
> proxy (by one char only yes), and as my requests are signed, the
> authentication process behind the proxy fails...
> 
> Any clue to do it without modifying the request ?
> 


I tested with the P flag just now, the trailing "?" is not passed to the origin server. Any chance you are also running with a previous patch that short circuits more processing?

Note there is a bug in the first patch if you are testing it: http://svn.apache.org/viewvc?rev=1734294&view=rev

Can you share rewrite:trace8 logging?
Comment 18 Ben RUBSON 2016-03-09 21:31:19 UTC
OK Eric, I investigated further, I thought I found the right setting, but Apache crashed when requests did not contain a QUERY_STRING.

This is what your last patch 1734294 corrects, so let's try it out !

OK, now everything works as I was expecting, with these 3 tricks :
- append a ? to the RewriteRule substitution (second parameter)
- use the new qslast flag (of course :)
- use the QSA flag (so that if the query contains a QUERY_STRING, it is kept)

Thank you very much for your work on this Eric !

I even tested these 2 patches against Apache 2.2, it works perfectly.

Could we think about backporting them to Apache 2.4 tree, so that they will be available into the next 2.4.x release please ?

Thank you again !

Ben
Comment 19 Eric Covener 2016-03-09 21:36:21 UTC
(In reply to Ben RUBSON from comment #18)
> OK Eric, I investigated further, I thought I found the right setting, but
> Apache crashed when requests did not contain a QUERY_STRING.

sorry about that
> 
> This is what your last patch 1734294 corrects, so let's try it out !
> 
> OK, now everything works as I was expecting, with these 3 tricks :
> - append a ? to the RewriteRule substitution (second parameter)
> - use the new qslast flag (of course :)
> - use the QSA flag (so that if the query contains a QUERY_STRING, it is kept)
> 
> Thank you very much for your work on this Eric !
> 
> I even tested these 2 patches against Apache 2.2, it works perfectly.
> 
> Could we think about backporting them to Apache 2.4 tree, so that they will
> be available into the next 2.4.x release please ?


The first half will be in the next 2.4.x (2.4.19), just need review of the followup. Thanks for testing so quickly.
Comment 20 Ben RUBSON 2016-03-09 21:40:45 UTC
The first half, you mean 1734125 only ?
Comment 21 Eric Covener 2016-03-09 21:42:49 UTC
(In reply to Ben RUBSON from comment #20)
> The first half, you mean 1734125 only ?

yes, the other part is proposed but not yet integrated. I Don't expect any issues.
Comment 22 Ben RUBSON 2016-03-09 21:45:48 UTC
OK, as it resolves a bug, it should be integrated too.

Good news, thank you again !
Comment 23 Ben RUBSON 2016-03-16 07:55:32 UTC
-- just as a reference --

1734125 is visible here :
http://svn.apache.org/viewvc/httpd/httpd/branches/2.4.x/CHANGES?view=log
1734294 is missing for the moment.
Comment 24 Yann Ylavic 2016-03-16 08:14:08 UTC
r1734125/r1734294 were backported to upcoming 2.4.19 in r1734259/r1734397.
Comment 25 Ben RUBSON 2016-03-16 08:16:58 UTC
Thank you very much Yann !