Bug 23295

Summary: Escape problem in mod_rewrite [P] action...
Product: Apache httpd-2
Reporter: Benoit DEVIJVER <benoit-apache>
Component: mod_rewrite
Assignee: Apache HTTPD Bugs Mailing List <bugs>
Status: RESOLVED DUPLICATE
Severity: normal
CC: kreucher, trapni
Priority: P3
Keywords: PatchAvailable
Version: 2.0.47   
Target Milestone: ---   
Hardware: PC   
OS: Linux   

Description Benoit DEVIJVER 2003-09-20 18:51:20 UTC
On my frontend Apache server, I use a .htaccess file like this one:
------------------------------------------------
RewriteEngine on
RewriteRule (.*) http://backend/somedir/$1 [P]
------------------------------------------------

But for certain requests, such as this one:
http://frontend/the%20url/pic.jpg
I get a 404 error: "http://frontend/the" not found
(because the frontend doesn't re-escape the URL before sending the request to the backend).

I don't know whether the fix should go into mod_rewrite or mod_proxy,
but I fixed it like this:

/usr/local/src/httpd-2.0.47/modules/mappers # diff mod_rewrite.c.dist mod_rewrite.c
2165a2166
>       r->filename = ap_escape_uri(r->pool, r->filename);

Regards, Benoit DEVIJVER
(Perhaps there is a way to do this in the .htaccess file, but I didn't find one.)
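
For illustration, the missing escape step can be sketched in plain C. The helper below is hypothetical: escape_path() is not httpd's ap_escape_uri(), and the set of characters it leaves untouched is an assumption. It only shows how an already-decoded path such as "/somedir/the url/pic.jpg" could be percent-encoded again before being handed to the backend.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper for illustration only (NOT httpd's ap_escape_uri()).
 * Percent-encodes every byte of a URL path except ASCII letters, digits,
 * '-', '.', '_', '~' and the path separator '/'.
 * Assumes the default "C" locale, so isalnum() only matches ASCII.
 * Writes a NUL-terminated result into dst (capacity dstlen bytes).
 * Returns 0 on success, -1 if dst is too small. */
int escape_path(const char *src, char *dst, size_t dstlen)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0;

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;
        int keep = isalnum(c) || c == '-' || c == '.' || c == '_' ||
                   c == '~' || c == '/';
        size_t need = keep ? 1 : 3;

        /* Always leave room for the terminating NUL. */
        if (o + need >= dstlen) {
            return -1;
        }
        if (keep) {
            dst[o++] = (char)c;
        } else {
            dst[o++] = '%';
            dst[o++] = hex[c >> 4];
            dst[o++] = hex[c & 0x0F];
        }
    }
    if (o >= dstlen) {
        return -1;
    }
    dst[o] = '\0';
    return 0;
}
```

With this sketch, the decoded path "/somedir/the url/pic.jpg" becomes "/somedir/the%20url/pic.jpg", so the backend no longer sees a request URL that ends at the space.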
Comment 1 Brian Pinkerton 2004-03-05 18:33:06 UTC
URIs are unescaped when processing starts but aren't re-escaped before being passed to the proxy.
The query string needs to be escaped in addition to the filename. Here are my diffs for 2.0.48:

*** new_mod_rewrite.c   Mon Mar  1 10:49:17 2004
--- mod_rewrite.c       Fri Mar  5 10:23:26 2004
***************
*** 1245,1250 ****
--- 1245,1257 ----
                                            "?", r->args, NULL);
              }
  
+             if (rulestatus != ACTION_NOESCAPE && ((skip = is_absolute_uri(r->filename+6)) > 0)) {
+                 r->filename = escape_absolute_uri(r->pool, r->filename, skip+6);
+                 if (r->args != NULL) {
+                     r->args = ap_escape_uri(r->pool, r->args);
+                 }
+             }
+             
              /* now make sure the request gets handled by the proxy handler */
              r->proxyreq = PROXYREQ_REVERSE;
              r->handler  = "proxy-server";

Comment 2 Brian Pinkerton 2004-04-02 18:54:20 UTC
Ignore that code.  Escaping of URL parts and query terms is happening OK.

For me, the problem was occurring when I moved a piece of text from the path to
the query string.  For example:

RewriteRule ^foo/(.*)   http://another.host.name/foo?page=$1 [qsappend,P]

In this case, $1 will not be escaped correctly for the query string: it could
still contain unescaped ampersands, question marks or equals signs. To fix this,
one can encode $1 using an external program (or hack the source to use a
form-encoder natively, as I did).
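
As a rough sketch of what such a form-encoder might look like (this is not the code actually used in the comment above; form_encode_value() and its choice of safe characters are assumptions for illustration), the following C helper encodes a value so that it can be placed after "page=" without its ampersands, question marks or equals signs being read as query-string syntax:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical form-encoder for a single query-string value.
 * Keeps ASCII letters, digits, '-', '.', and '_'; turns a space into '+';
 * percent-encodes everything else, including '&', '?', '=' and '/'.
 * Assumes the default "C" locale, so isalnum() only matches ASCII.
 * Writes a NUL-terminated result into dst (capacity dstlen bytes).
 * Returns 0 on success, -1 if dst is too small. */
int form_encode_value(const char *src, char *dst, size_t dstlen)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0;

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;
        int keep = isalnum(c) || c == '-' || c == '.' || c == '_';
        size_t need = (keep || c == ' ') ? 1 : 3;

        /* Always leave room for the terminating NUL. */
        if (o + need >= dstlen) {
            return -1;
        }
        if (keep) {
            dst[o++] = (char)c;
        } else if (c == ' ') {
            dst[o++] = '+';
        } else {
            dst[o++] = '%';
            dst[o++] = hex[c >> 4];
            dst[o++] = hex[c & 0x0F];
        }
    }
    if (o >= dstlen) {
        return -1;
    }
    dst[o] = '\0';
    return 0;
}
```

For example, a captured $1 of "a&b=c?d e" would be encoded as "a%26b%3Dc%3Fd+e", which the backend can then decode as a single value of the page parameter.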
Comment 3 Nick Kew 2004-09-26 16:20:56 UTC
This looks like a probable duplicate of Bug 13577.

Since that is down as probably fixed in recent patches to mod_proxy, can you
check whether this one is still reproducible with the updates?  The key to it is
the June 29th mod_proxy patch which fixed bug 15207 (mod_proxy mangling URLs in
a very similar manner to this bug).  We need mod_rewrite+proxy users to test
whether it fixes this case too.
Comment 4 H. M. 2006-01-05 19:02:05 UTC
At least 2.0.54 has the same problem.
After rewriting in per-directory context, the request isn't
handled by the canonicalise handler of proxy-http,
so mod_rewrite needs to canonicalise the URL itself in that case,
as it does in the redirection case.
I think that hook_fixup() needs a patch.
Comment 5 Mika Lindqvist 2007-03-08 13:11:30 UTC
*** Bug 32328 has been marked as a duplicate of this bug. ***
Comment 6 Ulf M 2007-03-19 05:47:16 UTC
The problem still occurs in Apache 2.0.59. mod_rewrite with a [P] RewriteRule 
removes the percent-encoding for umlaut characters from the URL.
Comment 7 Ulf M 2007-03-21 10:03:52 UTC
It turns out that the problem only occurs when the RewriteRule is defined 
inside a location. Outside of locations the encoding is correct.

Comment 8 Nick Kew 2007-09-10 05:06:37 UTC
Looks as if the patch for PR #34602 should fix this.  Please reopen if I'm wrong
(and if it's still a bug).

*** This bug has been marked as a duplicate of 34602 ***
Comment 9 roms2000 2007-09-16 14:48:01 UTC
For comment #0

------------------------------------------------
RewriteEngine on
RewriteRule (.*) http://backend/somedir/$1 [P]
------------------------------------------------

I think this works correctly.
If you need to pass the URL to the proxied web server in encoded form, you need
to do this:

------------------------------------------------------------
First, add this line to the server config:
RewriteMap encode int:escape

Second, use this rewrite rule in your .htaccess file:
RewriteRule (.*) http://backend/somedir/${encode:$1} [P]
------------------------------------------------------------

I'm using a similar configuration and similar rules to proxy all PHP files to
another server.
My configuration is:

In the server config file:
<IfModule mod_rewrite.c>
        RewriteEngine On
        RewriteMap encode int:escape
</IfModule>

And to process PHP files on the backend server, I also added this to the server
config file:
<Files *.php>
        RewriteEngine On
        RewriteOptions inherit
        RewriteCond %{SCRIPT_FILENAME} -f
        RewriteRule (.*)$ http://%{SERVER_NAME}:81${encode:%{REQUEST_URI}} [P]
</Files>