Bug 36509 - mod_rewrite incorrectly expands userdir URLs?
Status: RESOLVED INVALID
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_rewrite
Version: 2.0.54
Hardware: Other Linux
Importance: P2 normal
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
Reported: 2005-09-05 20:04 UTC by Alex Jones
Modified: 2005-09-26 17:19 UTC
Description Alex Jones 2005-09-05 20:04:29 UTC
Example: 

http://localhost/~alex/testsite/blahblahblah

I have a .htaccess file in /home/alex/public_html/testsite that reads:

RewriteEngine On
#RewriteBase /~alex/testsite
RewriteRule ^(.*?)$ url-handler.php

Now according to the manual:

"The RewriteBase directive explicitly sets the base URL for per-directory
rewrites. As you will see below, RewriteRule can be used in per-directory config
files (.htaccess). There it will act locally, i.e., the local directory prefix
is stripped at this stage of processing and your rewriting rules act only on the
remainder. At the end it is automatically added back to the path."

If I leave the RewriteBase commented out, I get this: "The requested URL
/home/alex/public_html/testsite/url-handler.php was not found on this server."

Seems like some funny behaviour. Please correct me if I am wrong but this setup
works flawlessly with addresses without the ~ as far as I can tell.

Cheers
Comment 1 André Malo 2005-09-06 06:58:11 UTC
Maybe I'm misunderstanding you, but with this setup:

RewriteEngine On
#RewriteBase /~alex/testsite
RewriteRule ^(.*?)$ url-handler.php

mod_rewrite /of course/ takes the absolute path as URL - as documented. The
absolute path itself is expanded by mod_userdir, some time before mod_rewrite in
your .htaccess starts to work.
Comment 2 Alex Jones 2005-09-06 12:54:38 UTC
Hi

What is causing the problem then? The virtual path is expanding from
/~alex/testsite/blahblah to /home/alex/public_html/testsite/blahblah

But that doesn't make any sense, because there is no
/home/alex/public_html/home/alex/public_html/testsite/blahblah

Do you see what is happening?

Are you suggesting this may be a bug with mod_userdir instead?

As I said before, this setup works fine if I use the virtual path
/alex/testsite/blahblah instead.

Thanks
Comment 3 André Malo 2005-09-06 15:01:32 UTC
Your description is a bit ambiguous. What exactly is in the .htaccess now? Please
paste it here, then we have a reasonable basis for discussion :)
Comment 4 Alex Jones 2005-09-06 16:23:42 UTC
ok:

file:///home/alex/public_html/testsite/.htaccess:

    RewriteEngine On
    RewriteRule ^(.*?)$ url-handler.php

when I request http://localhost/~alex/testsite/blahblah, I am told, by *Apache
through the error page*, that "The requested URL
/home/alex/public_html/testsite/url-handler.php was not found on this server."

This "URL" would correspond to
http://localhost/home/alex/public_html/testsite/url-handler.php, which would map
to
file:///var/www/localhost/htdocs/home/alex/public_html/testsite/url-handler.php
which doesn't exist. (My apologies, I got this wrong before.)

Seeing as the CWD is file:///home/alex/public_html/testsite, I am expecting
apache to find file:///home/alex/public_html/testsite/url-handler.php and not
file:///var/www/localhost/htdocs/home/alex/public_html/testsite/url-handler.php.

Yes, I can set the RewriteBase explicitly to /~alex/testsite but this should be
implied anyway, and as far as I can tell it works exactly as expected on any
virtual path that doesn't contain a ~.

Is this irregular behaviour or is this just a user error?

Thanks
Comment 5 André Malo 2005-09-07 06:35:24 UTC
Ok, so it's a user error ;-)

The request is processed as follows (unimportant parts left out):
1. run mod_userdir -> resolve ~alex/testsite/blah to
/home/alex/public_html/testsite/blah
2. find the .htaccess files
3. execute mod_rewrite found in .htaccess, which strips the .htaccess path from
the requested *filename* before matching the regexes
(/home/alex/public_html/testsite/foo -> foo), applies the rules (foo ->
url-handler.php) and prefixes the result with RewriteBase if the result is not
already an absolute URL path (begins with a slash) (url-handler.php ->
<prefix>/url-handler.php)

If explicitly set, the prefix is clear. If not set (from RewriteBase docs):
| By default this prefix is the corresponding filepath itself.

so, the last transformation is url-handler.php ->
/home/alex/public_html/testsite/url-handler.php, which is treated as a new URL. An
internal redirect to this URL is issued and you get the 404.
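
The three steps above can be sketched in a few lines of Python. This is only a
simplified model of the described behaviour, not mod_rewrite's actual code; the
function name per_dir_rewrite and its parameters are made up for illustration,
and the rule is assumed to have no backreferences in its substitution.

```python
import re

def per_dir_rewrite(filename, htaccess_dir, pattern, substitution, rewrite_base=None):
    """Hypothetical model of one per-directory RewriteRule."""
    # Strip the .htaccess directory from the already-resolved filename.
    prefix = htaccess_dir.rstrip("/") + "/"
    if not filename.startswith(prefix):
        return filename  # outside this per-directory context
    local = filename[len(prefix):]

    # The rule only ever sees the remainder, e.g. "blahblah".
    if not re.search(pattern, local):
        return filename
    result = substitution

    # A result that begins with a slash is already absolute: leave it alone.
    if result.startswith("/"):
        return result

    # Otherwise add the prefix back: RewriteBase if set, else the directory path.
    base = rewrite_base if rewrite_base is not None else htaccess_dir
    return base.rstrip("/") + "/" + result

DIRECTORY = "/home/alex/public_html/testsite"
REQUESTED = DIRECTORY + "/blahblah"

no_base = per_dir_rewrite(REQUESTED, DIRECTORY, r"^(.*?)$", "url-handler.php")
with_base = per_dir_rewrite(REQUESTED, DIRECTORY, r"^(.*?)$", "url-handler.php",
                            rewrite_base="/~alex/testsite")

print(no_base)    # /home/alex/public_html/testsite/url-handler.php
print(with_base)  # /~alex/testsite/url-handler.php
```

Without RewriteBase the result is a filesystem path that then gets treated as a
URL, which is the 404 from the report. With RewriteBase /~alex/testsite the
result is a URL that mod_userdir can resolve again.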
Comment 6 Alex Jones 2005-09-07 11:17:29 UTC
Thanks for your explanation, but is this really expected behaviour?

"By default this prefix is the corresponding filepath itself."

Why are filepaths being used as request URIs? Unless you have DocumentRoot set
at / then this is never going to find any files, surely?

OK I copied the exact files in ~alex/public_html/testsite into
/var/www/localhost/htdocs/alex/testsite and requested
http://localhost/alex/testsite/blah

It works!

However, following the same logic you applied (excuse copy and paste :P):

"1. run mod_userdir (doesn't happen)
1a. resolve /alex/testsite/blah to
/var/www/localhost/htdocs/testsite/blah
2. find the .htaccess files
3. execute mod_rewrite found in .htaccess, which strips the .htaccess path from
the requested *filename* before matching the regexes
(/var/www/localhost/htdocs/testsite/foo -> foo), applies the rules (foo ->
url-handler.php) and prefixes the result with RewriteBase if the result is not
already an absolute URL path (begins with a slash) (url-handler.php ->
<prefix>/url-handler.php)

"If explicitly set, the prefix is clear. If not set (from RewriteBase docs):
| By default this prefix is the corresponding filepath itself.

"so, the last transformation is url-handler.php ->
/var/www/localhost/htdocs/testsite/url-handler.php, which is treated as a new URL. An
internal redirect to this URL is issued and you get the 404." (No, it's 200 OK I
swear!)

This logic has got to be wrong, because as far as I can see and proven by the
fact that it works, the "prefix" is the original virtual path up as far as the
RewriteRule directive ("/alex/testsite/"). So this would explain why it is
200ing /alex/testsite/url-handler.php ->
/var/www/localhost/htdocs/alex/testsite/url-handler.php

Please don't close this bug just yet! I'm sure something dodgy is going on!

Thanks for your time André.

Alex
Comment 7 Alex Jones 2005-09-07 11:21:33 UTC
Oops that was very sloppy, excuse me I just got out of bed :(

Wherever I said /var/www/localhost/htdocs/testsite I meant
/var/www/localhost/htdocs/alex/testsite.

My bad.
Comment 8 André Malo 2005-09-07 14:58:09 UTC
The reason for this strange default is a technical one. .htaccess files are
bound to the directory they are located in and not to a particular URI.
Furthermore, mod_rewrite has no chance to determine which URI was actually used
*as the base URI* for this directory. That's the reason RewriteBase was
introduced at all (and it only makes sense in .htaccess files / <Directory> sections).

If I were to write a new mod_rewrite, I'd abstain from this default and would just
make RewriteBase mandatory ;).
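
To illustrate why the base URI cannot be recovered from the directory alone,
here is a small Python sketch. The mapping table is invented for the example:
the /~alex/testsite entry mirrors this report, while the /alex-site entry is a
purely hypothetical Alias that does not appear anywhere in the reporter's setup.

```python
# Hypothetical URL-prefix -> directory mappings known to the server as a whole.
URL_PREFIX_TO_DIR = {
    "/~alex/testsite": "/home/alex/public_html/testsite",  # via mod_userdir
    "/alex-site": "/home/alex/public_html/testsite",       # invented Alias
}

def candidate_bases(directory):
    """Return every URL prefix that could have led to this directory."""
    return sorted(url for url, d in URL_PREFIX_TO_DIR.items() if d == directory)

bases = candidate_bases("/home/alex/public_html/testsite")
print(bases)  # ['/alex-site', '/~alex/testsite']
```

A .htaccess file only knows the directory it lives in, and that directory may
have more than one valid base URI, so an explicit RewriteBase is the only way to
say which one is meant.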
Comment 9 Alex Jones 2005-09-07 17:02:51 UTC
I still don't understand why it works on its own without mod_userdir though.

Sorry if it seems I am being stupid, but it seems like a conflict somehow. Can
anyone else comment?
Comment 10 André Malo 2005-09-07 17:18:03 UTC
Ah. Hmm. Are you sure that there is no mod_userdir active? I mean, something maps
~alex/ to /home/alex/public_html in the first place. This is typically
mod_userdir (in the main httpd.conf). Can you attach the main config?
Comment 11 Alex Jones 2005-09-07 20:00:39 UTC
Yeah mod_userdir is active but when I try this .htaccess file anywhere not
involving mod_userdir it works!

For example:

http://localhost/alex/testsite/blah
DocumentRoot is file:///var/www/localhost/htdocs
.htaccess files are checked in all parent directories up to
file:///var/www/localhost/htdocs/alex/testsite/.htaccess.
RewriteRule is found in the last .htaccess file.
mod_rewrite matches the request for "blah" and replaces with "url-handler.php"
as per the RewriteRule
mod_rewrite tags the replacement file onto the end of the current path:
"file:///var/www/localhost/htdocs/alex/testsite/" + "url-handler.php"
request is made for file:///var/www/localhost/htdocs/alex/testsite/url-handler.php
200 OK

Now, with: http://localhost/~alex/testsite/blah
DocumentRoot is file:///var/www/localhost/htdocs
But mod_userdir is enabled... what happens here?
...
...
!?
(something obviously messes up here)
request is somehow made for
file:///var/www/localhost/htdocs/home/alex/public_html/home/alex/public_html/testsite/url-handler.php

From the error log:

[Wed Sep 07 18:58:07 2005] [error] [client 192.168.0.2] File does not exist:
/var/www/localhost/htdocs/home
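
A possible explanation for the difference between the two cases, which is not
stated anywhere in this report and should be read as an assumption: when no
RewriteBase is set, mod_rewrite removes the DocumentRoot from the front of the
rewritten path if it is a prefix, and only then treats the result as a URL. The
Python below is a rough model of that idea; both function names are
hypothetical.

```python
DOCUMENT_ROOT = "/var/www/localhost/htdocs"

def url_after_rewrite(rewritten_path, document_root=DOCUMENT_ROOT):
    """Assumed behaviour: strip DocumentRoot if it prefixes the rewritten path."""
    root = document_root.rstrip("/")
    if rewritten_path.startswith(root + "/"):
        return rewritten_path[len(root):]
    return rewritten_path  # not under DocumentRoot: used as a URL unchanged

def map_to_file(url, document_root=DOCUMENT_ROOT):
    """Plain URL-to-file translation for URLs that mod_userdir does not handle."""
    return document_root.rstrip("/") + url

# Case 1: the directory lives under DocumentRoot.
plain = url_after_rewrite("/var/www/localhost/htdocs/alex/testsite/url-handler.php")
print(plain)               # /alex/testsite/url-handler.php
print(map_to_file(plain))  # /var/www/localhost/htdocs/alex/testsite/url-handler.php

# Case 2: the directory was reached through mod_userdir.
userdir = url_after_rewrite("/home/alex/public_html/testsite/url-handler.php")
print(userdir)               # /home/alex/public_html/testsite/url-handler.php
print(map_to_file(userdir))  # /var/www/localhost/htdocs/home/alex/public_html/testsite/url-handler.php
```

Under this model the first case round-trips to the real file, while the second
ends up below /var/www/localhost/htdocs/home, which is consistent with the
"File does not exist" line in the error log.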

Hope you can figure this out better than I can!

Cheers André

Alex
Comment 12 Alex Jones 2005-09-27 01:19:35 UTC
Yeah, this is voodoo and I'll just have to live with it. I think I get it now
anyway. Cheers.