Bug 50851

Summary: mod_proxy_fcgi does not comply with RFC 3875 (CGI 1.1)
Product: Apache httpd-2
Reporter: Mark Montague <mark>
Component: Other Modules
Assignee: Apache HTTPD Bugs Mailing List <bugs>
Status: RESOLVED FIXED
Severity: minor
CC: ef-lists, jim, mark
Priority: P2    
Version: 2.5-HEAD   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Attachments: Prevent mod_proxy_fcgi from setting PATH_INFO

Description Mark Montague 2011-03-01 15:30:01 UTC
mod_proxy_fcgi in trunk currently sets PATH_INFO, SCRIPT_NAME, and PATH_TRANSLATED incorrectly per RFC 3875 (CGI 1.1).

I am running httpd 2.3.10 with the following mod_proxy_fcgi configuration:

ProxyPass /test/ fcgi://127.0.0.1:9000/www/php-ssl/

The file /www/php-ssl/hello.php contains an HTML page with the single PHP statement: echo "Hello, World!\n"


When an end user requests

https://f14dev1.catseye.org/test/hello.php/some/info?foo=bar&rod=moby

mod_proxy_fcgi sends the following environment variables via the FastCGI protocol to php-fpm (built from PHP 5.3.5):

HTTPS=on
SSL_TLS_SNI=f14dev1.catseye.org
HTTP_HOST=f14dev1.catseye.org
HTTP_USER_AGENT=Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13
HTTP_ACCEPT=text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
HTTP_ACCEPT_LANGUAGE=en-us,en;q=0.7,ja;q=0.3
HTTP_ACCEPT_ENCODING=gzip,deflate
HTTP_ACCEPT_CHARSET=ISO-8859-1,utf-8;q=0.7,*;q=0.7
HTTP_KEEP_ALIVE=115
HTTP_CONNECTION=keep-alive
PATH=/sbin:/usr/sbin:/bin:/usr/bin
SERVER_SIGNATURE=<address>Apache/2.3.10 (Fedora) Server at <a href="mailto:webmaster@catseye.org">f14dev1.catseye.org</a> Port 443</address>#012
SERVER_SOFTWARE=Apache/2.3.10 (Fedora)
SERVER_NAME=f14dev1.catseye.org
SERVER_ADDR=172.16.168.128
SERVER_PORT=443
REMOTE_ADDR=172.16.168.1
DOCUMENT_ROOT=/www/html-ssl
SERVER_ADMIN=webmaster@catseye.org
SCRIPT_FILENAME=proxy:fcgi://127.0.0.1:9000/www/php-ssl/hello.php/some/info
REMOTE_PORT=50634
GATEWAY_INTERFACE=CGI/1.1
SERVER_PROTOCOL=HTTP/1.1
REQUEST_METHOD=GET
QUERY_STRING=foo=bar&rod=moby
REQUEST_URI=/test/hello.php/some/info?foo=bar&rod=moby
SCRIPT_NAME=/test
PATH_INFO=/www/php-ssl/hello.php/some/info
PATH_TRANSLATED=/www/html-ssl/www/php-ssl/hello.php/some/info


But RFC 3875 section 3.3 states:

> From the meta-variables thus generated, a URI, the 'Script-URI', can
> be constructed.  This MUST have the property that if the client had
> accessed this URI instead, then the script would have been executed
> with the same values for the SCRIPT_NAME, PATH_INFO and QUERY_STRING
> meta-variables. [...]
>
> script-URI = <scheme> "://" <server-name> ":" <server-port>
>              <script-path> <extra-path> "?" <query-string>
>
> where <scheme> is found from SERVER_PROTOCOL, <server-name>,
> <server-port> and <query-string> are the values of the respective
> meta-variables.  The SCRIPT_NAME and PATH_INFO values, URL-encoded
> with ";", "=" and "?"  reserved, give <script-path> and <extra-path>.

Applying this formula to the environment variables, above, yields a script-URI of:

https://f14dev1.catseye.org:443/test/www/php-ssl/hello.php/some/info?foo=bar&rod=moby

...which is not correct and will fail to execute the same script with the same values for SCRIPT_NAME and PATH_INFO as required by the RFC.  Instead, it should be:

https://f14dev1.catseye.org:443/test/hello.php/some/info?foo=bar&rod=moby
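The construction above can be sketched in a few lines of Python. This is only an illustration of the RFC 3875 formula, not code from httpd or php-fpm. The function name script_uri is hypothetical, and the scheme is assumed to come from the HTTPS variable (the RFC text derives it from SERVER_PROTOCOL, which would not distinguish http from https here). urllib.parse.quote with safe="/" percent-encodes ";", "=" and "?" along with other non-path characters.

```python
from urllib.parse import quote


def script_uri(env):
    """Build a Script-URI from CGI meta-variables (sketch of RFC 3875, section 3.3)."""
    # Assumption: the scheme is taken from the HTTPS variable set by mod_ssl.
    scheme = "https" if env.get("HTTPS") == "on" else "http"
    # SCRIPT_NAME and PATH_INFO are URL-encoded; "/" is kept as the separator.
    script_path = quote(env.get("SCRIPT_NAME", ""), safe="/")
    extra_path = quote(env.get("PATH_INFO", ""), safe="/")
    uri = f"{scheme}://{env['SERVER_NAME']}:{env['SERVER_PORT']}{script_path}{extra_path}"
    query = env.get("QUERY_STRING", "")
    if query:
        uri += "?" + query
    return uri


common = {
    "HTTPS": "on",
    "SERVER_NAME": "f14dev1.catseye.org",
    "SERVER_PORT": "443",
    "QUERY_STRING": "foo=bar&rod=moby",
}

# Values as currently sent by mod_proxy_fcgi (from the listing above).
reported = dict(common, SCRIPT_NAME="/test",
                PATH_INFO="/www/php-ssl/hello.php/some/info")

# Values that would reproduce the URI the client actually requested.
expected = dict(common, SCRIPT_NAME="/test/hello.php/some/info")

print(script_uri(reported))
print(script_uri(expected))
```

The first line printed is the incorrect script-URI and the second is the one the client requested.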

The current (not RFC compliant) behavior is part of a larger problem that prevents php-fpm's status and ping pages from working when php-fpm is used with mod_proxy_fcgi.  I'd like to address the problem on the httpd side first to make it easier to justify patches to php-fpm ("httpd does things the way RFC 3875 says, now php-fpm should be fixed to work with it").


Additional background information:

# /usr/sbin/httpd -V
Server version: Apache/2.3.10 (Unix)
Server built:   Mar  1 2011 12:26:00
Server's Module Magic Number: 20101204:0
Server loaded:  APR 1.4.2, APR-UTIL 1.3.10
Compiled using: APR 1.4.2, APR-UTIL 1.3.10
Architecture:   64-bit
Server MPM:     prefork
  threaded:     no
    forked:     yes (variable process count)
Server compiled with....
 -D APR_HAS_SENDFILE
 -D APR_HAS_MMAP
 -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
 -D APR_USE_SYSVSEM_SERIALIZE
 -D APR_USE_PTHREAD_SERIALIZE
 -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
 -D APR_HAS_OTHER_CHILD
 -D AP_HAVE_RELIABLE_PIPED_LOGS
 -D DYNAMIC_MODULE_LIMIT=128
 -D HTTPD_ROOT="/etc/httpd"
 -D SUEXEC_BIN="/usr/sbin/suexec"
 -D DEFAULT_ERRORLOG="logs/error_log"
 -D AP_TYPES_CONFIG_FILE="conf/mime.types"
 -D SERVER_CONFIG_FILE="conf/httpd.conf"

php-fpm from PHP 5.3.5
Fedora 14 (fully patched)
gcc 4.5.1

httpd custom built with:

./configure --prefix=/etc/httpd --exec-prefix=/usr --bindir=/usr/bin
--sbindir=/usr/sbin --mandir=/usr/share/man --libdir=/usr/lib64
--sysconfdir=/etc/httpd/conf --includedir=/usr/include/httpd
--libexecdir=/usr/lib64/httpd/modules --datadir=/var/www
--with-installbuilddir=/usr/lib64/httpd/build --with-mpm=prefork
--with-apr=/usr --with-apr-util=/usr --enable-suexec --with-suexec
--with-suexec-caller=apache --with-suexec-docroot=/var/www
--with-suexec-logfile=/var/log/httpd/suexec.log
--with-suexec-bin=/usr/sbin/suexec --with-suexec-uidmin=500
--with-suexec-gidmin=100 --enable-pie --with-pcre --enable-mods-shared=all
--enable-ssl --with-ssl --enable-distcache --enable-proxy --enable-cache
--enable-disk-cache --enable-socache-dc --enable-ldap --enable-authnz-ldap
--enable-cgid --enable-authn-anon --enable-authn-alias --enable-session
--enable-session-cookie --enable-session-dbd --enable-lua --enable-dav-lock
--disable-imagemap
Comment 1 Mark Montague 2011-03-01 15:33:29 UTC
Created attachment 26700 [details]
Prevent mod_proxy_fcgi from setting PATH_INFO

The attached patch fixes the problem by removing the code from modules/proxy/mod_proxy_fcgi.c that sets PATH_INFO.  With PATH_INFO not set, server/util_script.c:ap_add_cgi_vars() no longer sets SCRIPT_NAME incorrectly, and it no longer sets PATH_TRANSLATED at all.

This patch results in the following changes to the environment variables in the original problem description (changes are in a unified diff like format):

-PATH_INFO=/www/php-ssl/hello.php/some/info
-PATH_TRANSLATED=/www/html-ssl/www/php-ssl/hello.php/some/info
-SCRIPT_NAME=/test
+SCRIPT_NAME=/test/hello.php/some/info

The script-URI constructed according to the instructions in RFC 3875 then becomes:

https://f14dev1.catseye.org:443/test/hello.php/some/info?foo=bar&rod=moby

...which is correct.
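To illustrate why removing PATH_INFO also changes SCRIPT_NAME, here is a simplified Python model. It assumes that the server derives SCRIPT_NAME by stripping from the request path the longest trailing portion that matches PATH_INFO, cutting at a segment boundary. The function name derive_script_name is hypothetical, and the real logic in server/util_script.c may differ in detail.

```python
def derive_script_name(uri_path, path_info):
    """Model of SCRIPT_NAME derivation: strip the trailing part shared with PATH_INFO."""
    if not path_info:
        # With no PATH_INFO, the whole request path is the script name.
        return uri_path
    i, j = len(uri_path), len(path_info)
    # Walk backwards while the two strings share the same trailing characters.
    while i > 0 and j > 0 and uri_path[i - 1] == path_info[j - 1]:
        i -= 1
        j -= 1
    # If the match stopped in the middle of a segment, move to the next "/".
    while i < len(uri_path) and uri_path[i] != "/":
        i += 1
    return uri_path[:i]


uri_path = "/test/hello.php/some/info"

# Unpatched: PATH_INFO is the backend path, which shares "/hello.php/some/info".
print(derive_script_name(uri_path, "/www/php-ssl/hello.php/some/info"))

# Patched: PATH_INFO is not set at all.
print(derive_script_name(uri_path, ""))
```

Under these assumptions the model reproduces the SCRIPT_NAME values shown in the diff above: /test without the patch and /test/hello.php/some/info with it.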

I believe this to be an acceptable solution to the problem because section 4.1.5 of RFC 3875 says that PATH_INFO "identifies the resource or sub-resource to be returned by the CGI script, and is derived from the portion of the URI path hierarchy following the part that identifies the script itself".  Since the proxy cannot know what portion of the URI path represents the script, not setting PATH_INFO seems better than setting it to a value that does not meet this definition, especially since "the server MAY impose restrictions and limitations on what values it permits for PATH_INFO".


The complete list of environment variables generated after the patch is applied when the user requests

https://f14dev1.catseye.org/test/hello.php/some/info?foo=bar&rod=moby

is:

HTTPS=on
SSL_TLS_SNI=f14dev1.catseye.org
HTTP_HOST=f14dev1.catseye.org
HTTP_USER_AGENT=Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13
HTTP_ACCEPT=text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
HTTP_ACCEPT_LANGUAGE=en-us,en;q=0.7,ja;q=0.3
HTTP_ACCEPT_ENCODING=gzip,deflate
HTTP_ACCEPT_CHARSET=ISO-8859-1,utf-8;q=0.7,*;q=0.7
HTTP_KEEP_ALIVE=115
HTTP_CONNECTION=keep-alive
PATH=/sbin:/usr/sbin:/bin:/usr/bin
SERVER_SIGNATURE=<address>Apache/2.3.10 (Fedora) Server at <a href="mailto:webmaster@catseye.org">f14dev1.catseye.org</a> Port 443</address>#012
SERVER_SOFTWARE=Apache/2.3.10 (Fedora)
SERVER_NAME=f14dev1.catseye.org
SERVER_ADDR=172.16.168.128
SERVER_PORT=443
REMOTE_ADDR=172.16.168.1
DOCUMENT_ROOT=/www/html-ssl
SERVER_ADMIN=webmaster@catseye.org
SCRIPT_FILENAME=proxy:fcgi://127.0.0.1:9000/www/php-ssl/hello.php/some/info
REMOTE_PORT=50630
GATEWAY_INTERFACE=CGI/1.1
SERVER_PROTOCOL=HTTP/1.1
REQUEST_METHOD=GET
QUERY_STRING=foo=bar&rod=moby
REQUEST_URI=/test/hello.php/some/info?foo=bar&rod=moby
SCRIPT_NAME=/test/hello.php/some/info
Comment 2 Jim Jagielski 2011-03-03 14:18:31 UTC
Thx for the bug report and the patch. Will review and, if correct, will fold in for 2.3.12...
Comment 3 Mark Montague 2011-03-03 14:23:21 UTC
> The current (not RFC compliant) behavior is part
> of a larger problem that prevents php-fpm's status
> and ping pages from working when php-fpm is used
> with mod_proxy_fcgi.

A fix for the larger problem has now been submitted to the PHP developers:

http://bugs.php.net/bug.php?id=54152

The patch to PHP, if accepted, will also resolve

https://issues.apache.org/bugzilla/show_bug.cgi?id=48273
Comment 4 Jim Jagielski 2011-03-03 14:37:58 UTC
Mark,

What happens if proxy-nocanon is set?
Comment 5 Mark Montague 2011-03-03 15:50:15 UTC
Thanks for the reply, Jim!  I had indeed overlooked and failed to investigate the ProxyPass nocanon option, so your question was helpful.  The short answer is: "the requirements of RFC 3875 are still not met".  Here is the long version of the answer:

Without the patch applied (stock httpd 2.3.10), and with nocanon set:

ProxyPass /test/ fcgi://127.0.0.1:9000/www/php-ssl/ nocanon

the following changes to the environment variables happen compared to the same situation without nocanon set (changes are in a unified diff like format):

-SCRIPT_FILENAME=proxy:fcgi://127.0.0.1:9000/www/php-ssl/hello.php/some/info
+SCRIPT_FILENAME=proxy:fcgi://127.0.0.1:9000/www/php-ssl/hello.php/some/info?foo=bar&rod=moby
-SCRIPT_NAME=/test
+SCRIPT_NAME=/test/hello.php/some/info
-PATH_INFO=/www/php-ssl/hello.php/some/info
+PATH_INFO=/www/php-ssl/hello.php/some/info?foo=bar&rod=moby
-PATH_TRANSLATED=/www/html-ssl/www/php-ssl/hello.php/some/info
+PATH_TRANSLATED=/www/html-ssl/www/php-ssl/hello.php/some/info?foo=bar&rod=moby

Applying these changes to the script-URI formula in RFC 3875 gives:

https://f14dev1.catseye.org:443/test/hello.php/some/info/www/php-ssl/hello.php/some/info?foo=bar&rod=moby?foo=bar&rod=moby

This result is better in some ways and worse in other ways than when the "nocanon" option is omitted, but it still fails to meet the requirement of the RFC in that if that URL were requested, the script would execute with a different value of PATH_INFO.

Also, despite these changes, the php-fpm status and ping pages still fail to work.  Let me know if you'd like information about what php-fpm is doing internally with the environment variables in the non-working (stock httpd) versus working (patched httpd) cases; for the sake of this bug report, I've been only including information about RFC 3875 compliance issues.

Here is the complete set of environment variables that result with stock httpd 2.3.10 and the ProxyPass nocanon option set:

HTTPS=on
SSL_TLS_SNI=f14dev1.catseye.org
HTTP_HOST=f14dev1.catseye.org
HTTP_USER_AGENT=Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.14) Gecko/20110218 Firefox/3.6.14
HTTP_ACCEPT=text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
HTTP_ACCEPT_LANGUAGE=en-us,en;q=0.7,ja;q=0.3
HTTP_ACCEPT_ENCODING=gzip,deflate
HTTP_ACCEPT_CHARSET=ISO-8859-1,utf-8;q=0.7,*;q=0.7
HTTP_KEEP_ALIVE=115
HTTP_CONNECTION=keep-alive
PATH=/sbin:/usr/sbin:/bin:/usr/bin
SERVER_SIGNATURE=<address>Apache/2.3.10 (Fedora) Server at <a href="mailto:webmaster@catseye.org">f14dev1.catseye.org</a> Port 443</address>#012
SERVER_SOFTWARE=Apache/2.3.10 (Fedora)
SERVER_NAME=f14dev1.catseye.org
SERVER_ADDR=172.16.168.128
SERVER_PORT=443
REMOTE_ADDR=172.16.168.1
DOCUMENT_ROOT=/www/html-ssl
SERVER_ADMIN=webmaster@catseye.org
SCRIPT_FILENAME=proxy:fcgi://127.0.0.1:9000/www/php-ssl/hello.php/some/info?foo=bar&rod=moby
REMOTE_PORT=51108
GATEWAY_INTERFACE=CGI/1.1
SERVER_PROTOCOL=HTTP/1.1
REQUEST_METHOD=GET
QUERY_STRING=foo=bar&rod=moby
REQUEST_URI=/test/hello.php/some/info?foo=bar&rod=moby
SCRIPT_NAME=/test/hello.php/some/info
PATH_INFO=/www/php-ssl/hello.php/some/info?foo=bar&rod=moby
PATH_TRANSLATED=/www/html-ssl/www/php-ssl/hello.php/some/info?foo=bar&rod=moby


I also tested with my patch applied together with the ProxyPass nocanon option turned on.  In this case, the requirements of the RFC are met and the php-fpm status (and presumably ping) page works.  Here is the complete set of environment variables with the patch and nocanon option specified (turning nocanon on and off when the patch is applied affects only whether SCRIPT_FILENAME includes the query string):

HTTPS=on
SSL_TLS_SNI=f14dev1.catseye.org
HTTP_HOST=f14dev1.catseye.org
HTTP_USER_AGENT=Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.14) Gecko/20110218 Firefox/3.6.14
HTTP_ACCEPT=text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
HTTP_ACCEPT_LANGUAGE=en-us,en;q=0.7,ja;q=0.3
HTTP_ACCEPT_ENCODING=gzip,deflate
HTTP_ACCEPT_CHARSET=ISO-8859-1,utf-8;q=0.7,*;q=0.7
HTTP_KEEP_ALIVE=115
HTTP_CONNECTION=keep-alive
PATH=/sbin:/usr/sbin:/bin:/usr/bin
SERVER_SIGNATURE=<address>Apache/2.3.10 (Fedora) Server at <a href="mailto:webmaster@catseye.org">f14dev1.catseye.org</a> Port 443</address>#012
SERVER_SOFTWARE=Apache/2.3.10 (Fedora)
SERVER_NAME=f14dev1.catseye.org
SERVER_ADDR=172.16.168.128
SERVER_PORT=443
REMOTE_ADDR=172.16.168.1
DOCUMENT_ROOT=/www/html-ssl
SERVER_ADMIN=webmaster@catseye.org
SCRIPT_FILENAME=proxy:fcgi://127.0.0.1:9000/www/php-ssl/hello.php/some/info?foo=bar&rod=moby
REMOTE_PORT=51184
GATEWAY_INTERFACE=CGI/1.1
SERVER_PROTOCOL=HTTP/1.1
REQUEST_METHOD=GET
QUERY_STRING=foo=bar&rod=moby
REQUEST_URI=/test/hello.php/some/info?foo=bar&rod=moby
SCRIPT_NAME=/test/hello.php/some/info

...giving the correct script-URI:

https://f14dev1.catseye.org:443/test/hello.php/some/info?foo=bar&rod=moby


I apologize for the length of this reply, but I wanted to show that I was thorough, expose any mistakes I may have made for everyone to see, and allow others to compare their results to mine.
Comment 6 Jim Jagielski 2011-03-04 10:04:40 UTC
Instead of always disabling path_info for the fcgi submodule, it is likely better to make it runtime configurable, ala nocanon.

The alternative would be to see if mod_fcgid sets it (and its partners) "correctly" and emulate that. If you have the time to test that, that would be helpful. In the meantime, I'll work on the 'nopathinfo' patch.
Comment 7 Mark Montague 2011-03-04 12:55:27 UTC
Hi, Jim,

I just tested mod_fcgid 2.3.6 under httpd 2.3.10:

- with the foo.pl example from the mod_fcgid manual, with FcgidFixPathinfo both on and off

- with the phpinfo.php example from the mod_fcgid manual, running under php-cgi (from PHP 5.3.6RC2), with FcgidFixPathinfo both on and off

In all four cases, constructing script-URI according to the formula in RFC 3875 gives the correct result.

I also studied the mod_fcgid source code, in particular fcgid_add_cgi_vars() and fcgid_handler().  mod_fcgid creates the SCRIPT_NAME, PATH_INFO, and PATH_TRANSLATED environment variables in the exact same way mod_proxy_fcgi does in an unmodified httpd 2.3.10, by calling:

    ap_add_common_vars(r);
    ap_add_cgi_vars(r);

The reason this yields correct results in the case of mod_fcgid and mod_fcgi is that those modules execute scripts that reside in the filesystem and URI namespace that httpd has access to, so the script name can be determined during the location, directory, and file walks; the PATH_INFO is then the part of the URI immediately after the script name.
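The split described above can be sketched as follows. This is a simplified model, not httpd's actual walk logic: the function split_script_and_path_info and the is_script callback are hypothetical, with the callback standing in for whatever knowledge the server has of which URI prefix maps to an executable script.

```python
def split_script_and_path_info(uri_path, is_script):
    """Return (script_name, path_info) for the shortest URI prefix that is a script.

    Returns None when no prefix of the path is recognized as a script, which is
    the position a proxy is in when it cannot see the backend's namespace.
    """
    segments = uri_path.strip("/").split("/")
    candidate = ""
    for index, segment in enumerate(segments):
        candidate += "/" + segment
        if is_script(candidate):
            rest = segments[index + 1:]
            # Everything after the script name becomes PATH_INFO.
            path_info = "/" + "/".join(rest) if rest else ""
            return candidate, path_info
    return None


# A server that can see the script in its own namespace finds the split.
known_scripts = {"/test/hello.php"}
print(split_script_and_path_info("/test/hello.php/some/info",
                                 lambda path: path in known_scripts))

# A proxy with no knowledge of the backend cannot determine it.
print(split_script_and_path_info("/test/hello.php/some/info",
                                 lambda path: False))
```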

However, a proxy server that is trying to "emulate" a CGI environment (mod_proxy_fcgi, mod_proxy_scgi) is not able to determine what part of the URI is the script name unless it has some insight into the origin server's URI namespace.

What about having the PATH_INFO CGI environment variable off by default for mod_proxy_fcgi, unless the administrator asks httpd to put *something* there, even if it might be wrong?  That is, an "addpathinfo" patch as opposed to a "nopathinfo" patch?

Thanks for all the time you've been spending on this, and I apologize for unexpectedly taking so much of your time.  Please let me know if there is anything else I can research, test, or write.
Comment 8 Jim Jagielski 2011-03-04 13:32:33 UTC
Fixed; committed in r1078089.

New mod_proxy_fcgi env-var, proxy-fcgi-pathinfo, allows for PATH_INFO to be exposed. Otherwise, it's not.
Comment 9 Jim Jagielski 2011-03-04 13:33:40 UTC
Fixed; committed in r1078089.

New mod_proxy_fcgi env-var, proxy-fcgi-pathinfo, allows for PATH_INFO to be exposed. Otherwise, it's not.
Comment 10 Mark Montague 2011-03-04 17:40:23 UTC
Examined the committed diffs, checked out the latest svn trunk, built it, tested.  Everything looks good.  Thanks again for all the help!
Comment 11 Jim Jagielski 2011-03-09 13:24:49 UTC
*** Bug 48273 has been marked as a duplicate of this bug. ***