Bug 58739

Summary: Undocumented behavior of REDIRECT_URL and REDIRECT_* variables
Product: Apache httpd-2
Reporter: teo8976
Component: Documentation
Assignee: HTTP Server Documentation List <docs>
Status: NEW ---    
Severity: normal    
Priority: P3    
Version: 2.5-HEAD   
Target Milestone: ---   
Hardware: All   
OS: All   

Description teo8976 2015-12-15 23:46:27 UTC
The behavior of the variable REDIRECT_URL is not properly documented, or if it is, it's practically impossible to find in the documentation.

A search for "REDIRECT_URL" in the documentation yields only two significant results (unless more are lost in the pile of garbage results):

1) this: https://httpd.apache.org/docs/2.4/custom-error.html

2) this: https://httpd.apache.org/docs/2.4/mod/core.html


1) is specific to Custom Error Responses directives. Here, the part regarding variables such as REDIRECT_URL is very poorly documented. It says:

"when the error redirect is sent, additional environment variables will be set, which will be generated from the headers provided to the original request by prepending 'REDIRECT_' onto the original header name"

Note that this does NOT explain what "REDIRECT_URL" is supposed to be, since there's no original variable named "URL". 
One can only infer what REDIRECT_URL is by looking at the example, and even that is not very clear: it seems to suggest that REDIRECT_URL equals the value of REQUEST_URI prior to the redirection, and that REQUEST_URI is changed to the actual URI of the custom error page, but neither of these is stated clearly (it might be vice versa, and it's CRITICAL to be able to tell which one is the case).


2) here the only mention of REDIRECT_URL is in the explanation of the QualifyRedirectURL Directive, which only makes sense to those who already know how REDIRECT_URL is supposed to behave in the first place.
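
For reference, a minimal sketch of how that directive would be set, assuming a hypothetical virtual host (the ServerName and DocumentRoot are placeholders borrowed from the example further down, not a real configuration). As far as the core documentation goes, QualifyRedirectURL is recognized by 2.4.18 and later and defaults to Off, and 2.4.17 is described as behaving as if it were On:

```apache
<VirtualHost *:80>
    # Placeholder names; substitute the real host and path.
    ServerName mydomain2.com
    DocumentRoot "/path/to/stuff"

    # Off (the documented default) keeps REDIRECT_URL as a URL path only.
    # On makes it a fully qualified URL (scheme, host and path).
    # Servers older than 2.4.18 reject this directive as unknown.
    QualifyRedirectURL Off
</VirtualHost>
```

Even with this, the directive only controls the form of REDIRECT_URL; it does not say when the variable is set or which URL it refers to.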


The above would make sense (apart from the severe deficiencies in the documentation pointed out in (1)) if REDIRECT_URL were *only* generated by Custom Error Response directives, but that is not the case.



I have seen REDIRECT_URL and REDIRECT_STATUS populated in cases where no Custom Error Response directive was in effect, and where only RewriteRule and RewriteCond directives were generating redirects (internal redirects, that is, URL rewrites).

So, one must conclude that either:
A) mod_rewrite also sets the REDIRECT_* variables when it rewrites urls. If this is the case, it's TOTALLY UNDOCUMENTED

or:
B) Apache's core or some other module also sets the REDIRECT_* variables. If it is the core, there's no documentation about it in https://httpd.apache.org/docs/2.4/mod/core.html which is supposed to describe "Apache Core Features".
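
A smaller setup in which this observation can be checked, as a sketch only: the names probe and show-env.php are hypothetical (they are not from the real site), and show-env.php is assumed to be a script in the same directory that dumps its environment. With only this .htaccess in place and no ErrorDocument directive anywhere, requesting /probe should be enough to see whether REDIRECT_STATUS and REDIRECT_URL show up:

```apache
# .htaccess in the DocumentRoot.
# Needs mod_rewrite, AllowOverride FileInfo, and Options FollowSymLinks
# (or SymLinksIfOwnerMatch) for the directory.
RewriteEngine On

# Internally rewrite /probe to show-env.php; no external redirect is sent.
RewriteRule ^probe$ show-env.php [L]
```

If the variables appear here, Custom Error Responses can be ruled out as their only source.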



---- REAL-LIFE HEADACHE EXAMPLE (which the docs fail to help figure out)  ----

I'll provide an example of a real case where REDIRECT_URL's behavior is apparently erratic; I guess the behavior is expected, but the documentation fails to provide the necessary information to figure out what the f**k is going on.
I don't expect anybody to give me the answers here, this is not a support forum. What I do expect you to do is fix the documentation so that the explanation of apparently nonsense behaviors like this can be found in it.


On two similar (but not identical) servers, I set up an apparently identical virtual host with identical PHP scripts.
As you know, PHP scripts receive variables such as REQUEST_URI and REDIRECT_URL (if it exists) in the $_SERVER array.

One of the two servers runs Apache 2.4.4, the other I'm not sure (should be 2.4.x).
The configuration specific to the virtual host is identical on both; however, the rest of the Apache configuration (httpd.conf and other included files) may differ, though a glance at the most relevant .conf files hasn't revealed any differences.

Let's say the virtual host is named mydomain1.com on server 1, and mydomain2.com on server 2.

All I describe from now on applies to both servers. 


The .htaccess in the DocumentRoot folder of the virtual host is like this:

  RewriteEngine On

  #RewriteBase /
  
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_FILENAME} !/(admincp|modcp|clientscript|cpstyles|images)/
  RewriteRule ^(.+)$ vbseo.php [L,QSA]

(I have tried uncommenting the RewriteBase directive and nothing changed)

That is, any url that does not correspond to an existing folder or file is routed to vbseo.php.

There exists an index.php file in the same folder. 

For debugging purposes I replaced the contents of vbseo.php with a simple
  <?php print_r($_SERVER); ?>


On Server 1 I observe apparently sensible behavior, but I can't quite understand the details, and the docs don't help.
On Server 2 I observe completely nonsense behavior, which is either due to a bug in Apache or to a serious misconfiguration; either way, the documentation doesn't give me a f***ing clue.

I issue a "GET /" request to both servers, i.e. opening http://mydomain1.com and http://mydomain2.com from a browser.

----
Here's the output from server 1 (I'll omit the irrelevant parts):
Array
(
    [REDIRECT_STATUS] => 200
    [HTTP_HOST] => mydomain1.com
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36
    [SERVER_NAME] => mydomain1.com
    [DOCUMENT_ROOT] => /path/to/stuff/
    [REQUEST_SCHEME] => http
    [CONTEXT_PREFIX] => 
    [CONTEXT_DOCUMENT_ROOT] => /path/to/stuff/
    [SCRIPT_FILENAME] => /path/to/stuff/vbseo.php
    [REDIRECT_URL] => /index.php
    [GATEWAY_INTERFACE] => CGI/1.1
    [SERVER_PROTOCOL] => HTTP/1.1
    [REQUEST_METHOD] => GET
    [QUERY_STRING] => 
    [REQUEST_URI] => /
    [SCRIPT_NAME] => /index.php
    [PHP_SELF] => /index.php
)

What I understand should be happening here is:
- I'm requesting /
- because of the rewrite rules, the request is rewritten to vbseo.php
- and indeed vbseo.php is being executed (otherwise I wouldn't be getting that output at all)
- which explains why SCRIPT_FILENAME is /path/to/stuff/vbseo.php
- REQUEST_URI remains / because (I infer, and it's what I'd expect, but it isn't stated anywhere in the docs) the rewrite rules don't change its value, by design  

What I don't understand (and the docs fail to provide me the information to figure it out) is:
- why the hell do SCRIPT_NAME and PHP_SELF equal /index.php??
- why the hell is REDIRECT_URL /index.php?

At first, I might guess that some core or default configuration dictates that, because we are requesting a folder (the document root folder) without a file name, the request should be forwarded to index.php in that same folder.
However, if that were the case, index.php should be executed: the RewriteRule in .htaccess only applies when the request does not match an existing file, and an index.php file DOES exist. But that is not what happens: vbseo.php is executed, which produces the above output.

From the fact that REQUEST_URI remains equal to the original requested url, I'm inclined to think that REDIRECT_URL is supposed to be the url *to which* the request is redirected (rather than the original one prior to the redirect, which is its meaning as described in the Custom Error Responses docs; but we don't have docs about REDIRECT_URL unrelated to Custom Errors). Otherwise, in general, REDIRECT_URL wouldn't make sense when rewriting urls with mod_rewrite: if REQUEST_URI is preserved and REDIRECT_URL were supposed to be the *old* url, it wouldn't be needed, as it would always equal REQUEST_URI.
On the other hand, here the file actually being executed is vbseo.php, and REDIRECT_URL is not /vbseo.php, so this would suggest that REDIRECT_URL actually is the "old" requested uri. But the original requested uri is /, so this could only be explained if there are 2 rewritings here: from "/" to "/index.php" and from "/index.php" to "vbseo.php". But this cannot be the case, because /index.php exists, so the second redirect shouldn't occur.

As you can see, nothing makes sense unless something big is missing in the docs.


----
The output from server 2 is complete nonsense:
Array
(
    [REDIRECT_STATUS] => 200
    [HTTP_HOST] => mydomain2.com
    [HTTP_USER_AGENT] => Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36
    [SERVER_NAME] => mydomain2.com
    [DOCUMENT_ROOT] => /path/to/stuff/
    [REQUEST_SCHEME] => http
    [CONTEXT_PREFIX] => 
    [CONTEXT_DOCUMENT_ROOT] => /path/to/stuff/
    [SCRIPT_FILENAME] => /path/to/stuff/vbseo.php
    [REDIRECT_URL] => http://mydomain2.com        // <<- W.T.F??
    [GATEWAY_INTERFACE] => CGI/1.1
    [SERVER_PROTOCOL] => HTTP/1.1
    [REQUEST_METHOD] => GET
    [QUERY_STRING] => 
    [REQUEST_URI] => /
    [SCRIPT_NAME] => http://mydomain2.com         // <<- W.T.F??
    [PHP_SELF] => http://mydomain2.com            // <<- W.T.F??
)




In this second case, the fact that SCRIPT_NAME and PHP_SELF are absolute fully qualified URLs is a complete screw-up, which I'm sure has an explanation in some misconfiguration, but such an explanation cannot be found in the documentation.
Comment 1 Rich Bowen 2016-07-25 20:56:35 UTC
Where to start ...

Some of the variables you mention, such as PHP_SELF, are set by third-party modules, and so are outside the scope of the documentation.

Beyond that, you appear to be asking for explanations of a wide variety of environment variables. It's a little hard to get past the irate tone of the rant, but we'll try to convert this into an actual request for change, which appears to be:

* Document the meaning of the REDIRECT_URL environment variable, which is set as a side-effect of the use of Redirect and Rewrite directives.

* Document the meaning of the SCRIPT_NAME environment variable, which is set by modules such as mod_cgi and mod_php when dealing with "script" content generators.

Am I correctly getting at the root of this?
Comment 2 Rich Bowen 2016-08-03 17:34:44 UTC
needinfo
Comment 3 teo8976 2016-08-03 20:10:29 UTC
> Beyond that, you appear to be asking for explanations of a wide 
> variety of environment variables.

Not a wide variety at all, just three


> It's a little hard to get past the irate tone of the rant

I have re-read my report, and I don't see where the "irate tone of the rant" becomes in any way an obstacle in understanding the description of the issue which seems pretty clear to me.

Anyway,

> we'll try to convert this into an actual request for change, 

To be more precise, a request for FIX, of a documentation bug


> which appears to be:
> * Document the meaning of the REDIRECT_URL environment variable, which 
> is set as a side-effect of the use of Redirect and Rewrite directives.

Yep, 

> * Document the meaning of the SCRIPT_NAME environment variable

Yes, especially how on earth this can be set to an absolute url such as http://somedomain.com. Actually it's hard to see how that cannot be a bug.
Remember that two almost identical servers with the same seemingly-relevant directives exhibit different behavior here. The documentation should allow me to figure out what difference in configuration is causing the difference in behavior.


And then you are leaving out PHP_SELF. Despite the fact that it's a PHP-specific variable, there's no way, according to how it's defined in PHP documentation, that it can be an absolute URL; hence it must be Apache doing something to it. And most probably this is related to mod_rewrite or some interaction between mod_rewrite and some other module, as I only observe that when rewrite rules are being applied.
So, there's almost certainly something that is under the responsibility of Apache (as opposed to PHP), affecting this variable, that needs to be documented.
Comment 4 Rich Bowen 2016-08-04 16:06:30 UTC
> > It's a little hard to get past the irate tone of the rant
> 
> I have re-read my report, and I don't see where the "irate tone of the rant"
> becomes in any way an obstacle in understanding the description of the issue
> which seems pretty clear to me.

You'll find that volunteer documentation writers don't usually respond well to repeated profanity. YMMV.
Comment 5 Rich Bowen 2016-08-04 16:36:00 UTC
> The .htaccess in the DocumentRoot folder of the virtual host is like this:
> 
>   RewriteEngine On
> 
>   #RewriteBase /
>   
>   RewriteCond %{REQUEST_FILENAME} !-f
>   RewriteCond %{REQUEST_FILENAME} !-d
>   RewriteCond %{REQUEST_FILENAME} !/(admincp|modcp|clientscript|cpstyles|images)/
>   RewriteRule ^(.+)$ vbseo.php [L,QSA]

...

> That is, any url that does not correspond to an existing folder or file is
> routed to vbseo.php.

Ok, if I'm correctly understanding what you're trying to accomplish - that is, map "unhandled" requests to vbseo.php, what you actually want to do, instead of the entire above configuration, is:

FallbackResource /vbseo.php
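
As a sketch of where that could go, assuming the placeholder DocumentRoot from the report and assuming the excluded directories should keep their normal 404 handling (the DirectoryMatch pattern is one reading of the third RewriteCond and is untested against this site):

```apache
# Server or virtual host configuration; /path/to/stuff is the placeholder
# DocumentRoot used in the report.
<Directory "/path/to/stuff">
    # Requests that map to no existing file or directory go to /vbseo.php.
    FallbackResource /vbseo.php
</Directory>

# Turn the fallback off again under the directories the RewriteCond excluded.
# The "disabled" argument needs 2.4.4 or later.
<DirectoryMatch "^/path/to/stuff/(admincp|modcp|clientscript|cpstyles|images)(/|$)">
    FallbackResource disabled
</DirectoryMatch>
```

The single FallbackResource line can also live in the DocumentRoot .htaccess if AllowOverride permits it.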

Of course, this doesn't address the other issue - that the meaning of REDIRECT_URL is undocumented. I think the best thing to do at this point is for us to open a separate ticket requesting that additional detail - possibly in the Redirect and mod_rewrite docs? I'll go do that now.
Comment 6 Rich Bowen 2016-08-04 16:40:51 UTC
See issue BZ59944
Comment 7 teo8976 2016-08-04 18:55:38 UTC
And SCRIPT_NAME, and any effects of mod_rewrite (or any other core module) on PHP_SELF