Bug 20036

Summary: Trailing Dots stripped from PATH_INFO environment variable
Product: Apache httpd-2
Reporter: Corey Quinn <corquinn>
Component: mod_rewrite
Assignee: Apache HTTPD Bugs Mailing List <bugs>
Status: ASSIGNED ---
Severity: normal
CC: andersk, kerry, rodneyarauzs, ross.lawley
Priority: P3
Version: 2.2.22   
Target Milestone: ---   
Hardware: PC   
OS: All   
URL: Behind Firewall - see description

Description Corey Quinn 2003-05-19 14:37:20 UTC
On Windows, Apache is stripping trailing dots from the PATH_INFO environment
variable.  Our CGI application expects a complete path but receives only a
partial path, because the last dot is missing.  The same URL works on our host
systems without problem.  We have not tested this on AIX or Linux to see whether
the problem exists there as well.  I have copied the URLs and the resulting
environment variables below for both our Windows and MVS systems.
Unfortunately, both URLs are behind a firewall, so they may not be accessible
to you.  Feel free to ask me for any additional information you need.

Thanks, 
Corey Quinn
Software Engineer 
Bookmanager Products  -  IBM


Windows URL (Windows NT 4.0 running Apache 2.0.45; the problem also exists on
Windows 2000 running Apache 2.0.40):

http://klinecor.raleigh.ibm.com/cgi-bin/bookmgr/bookmgr.exe/BOOKS/TEMPLWP2/1.2.?DT=20030515114849

Environment Variable Dump:

******************************************************
Debug log for CGI invocation: Mon May 19 09:58:20 2003
argv[0]=C:\Program Files\Apache Group\Apache2\cgi-bin\bookmgr\bookmgr.exe,
cw0=C:\Program Files\Apache Group\Apache2\cgi-bin\bookmgr\,
cwd=C:\PROGRA~1\APACHE~1\Apache2\cgi-bin\bookmgr\
******************************************************

*********************
Environment variables
*********************
SERVER_SOFTWARE=Apache/2.0.45 (Win32)
SERVER_NAME=klinecor.raleigh.ibm.com
GATEWAY_INTERFACE=CGI/1.1
SERVER_PROTOCOL=HTTP/1.1
SERVER_PORT=80
REQUEST_METHOD=GET
HTTP_ACCEPT=image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-powerpoint, application/vnd.ms-excel, application/msword, application/x-quickviewplus, */*
HTTP_HOST=klinecor.raleigh.ibm.com
HTTP_USER_AGENT=Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0)
HTTPS=
PATH_INFO=/BOOKS/TEMPLWP2/1.2
PATH_TRANSLATED=C:\Program Files\Apache Group\Apache2\htdocs\BOOKS\TEMPLWP2\1.2
SCRIPT_NAME=/cgi-bin/bookmgr/bookmgr.exe/BOOKS/TEMPLWP2/1.2.
QUERY_STRING=DT=20030515114849
REMOTE_HOST=
REMOTE_ADDR=9.27.13.19
REMOTE_USER=
AUTH_TYPE=
CONTENT_TYPE=
CONTENT_LENGTH=


MVS URL (Host z/OS V1R2 system running IBM HTTP Server):

http://ctfmvs14.raleigh.ibm.com:3080/bookmgr-cgi/EPHBOOKT/BOOKS/TEMPLWP2/1.2.?DT=20030515114849

Environment Variable dump for Host System:

******************************************************
Debug log for CGI invocation: Mon May 19 10:00:59 2003
argv[0]=/u/booksrv/dev/cgi-bin/EPHBOOKT, cw0=/u/booksrv/dev/cgi-bin/, 
cwd=/u/booksrv/dev/cgi-bin/
******************************************************

*********************
Environment variables
*********************
SERVER_SOFTWARE=IBM HTTP Server/V5R3M0
SERVER_NAME=CTFMVS14.raleigh.ibm.com
GATEWAY_INTERFACE=CGI/1.1
SERVER_PROTOCOL=HTTP/1.1
SERVER_PORT=3080
REQUEST_METHOD=GET
HTTP_ACCEPT=image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-powerpoint, application/vnd.ms-excel, application/msword, application/x-quickviewplus, */*
HTTP_HOST=ctfmvs14.raleigh.ibm.com
HTTP_USER_AGENT=Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0)
HTTPS=OFF
PATH_INFO=/BOOKS/TEMPLWP2/1.2.
PATH_TRANSLATED=/usr/lpp/internet/server_root/pub/BOOKS/TEMPLWP2/1.2.
SCRIPT_NAME=/bookmgr-cgi/EPHBOOKT
QUERY_STRING=DT=20030515114849
REMOTE_HOST=
REMOTE_ADDR=9.27.13.19
REMOTE_USER=
AUTH_TYPE=
CONTENT_TYPE=
CONTENT_LENGTH=

Comment 1 Kerry W. Lothrop 2004-10-07 12:11:57 UTC
Trailing dots are also stripped from every path element, not just the last one.

For example, if I call

http://localhost/cgi-bin/test.exe/test./test./

PATH_INFO is only set to

/test/test/

The same applies to PATH_TRANSLATED. In the source, I found the following in
function apr_filepath_merge (filepath.c:626):

/* Truncate all trailing spaces and all but the first two dots */
segend = seglen;
while (seglen && (addpath[seglen - 1] == ' ' 
               || addpath[seglen - 1] == '.')) {
    if (seglen > 2 || addpath[seglen - 1] != '.' || addpath[0] != '.')
        --seglen;
    else
        break;
}

So the stripping of trailing dots seems to be intentional, yet it was surely
written for real file names, not for the additional path information supplied
in the PATH_INFO variable.
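
To see the effect in isolation, the trimming loop quoted above can be lifted
into a small standalone C sketch. This is only the trimming step, not a
reimplementation of apr_filepath_merge: the names trimmed_seglen and
trim_path_segments are hypothetical, the sketch assumes '/' is the only
separator, and it assumes the loop is applied to one path segment at a time
with the pointer at the start of that segment.

```c
#include <stddef.h>
#include <string.h>

/* Same trimming loop as the excerpt above, applied to one path segment.
 * seg points at the first character of the segment (no '/' inside it).
 * Returns how many leading characters of the segment survive. */
static size_t trimmed_seglen(const char *seg, size_t seglen)
{
    while (seglen && (seg[seglen - 1] == ' '
                   || seg[seglen - 1] == '.')) {
        if (seglen > 2 || seg[seglen - 1] != '.' || seg[0] != '.')
            --seglen;
        else
            break;   /* leave "." and ".." alone */
    }
    return seglen;
}

/* Hypothetical helper (not part of APR): apply the trimming to every
 * '/'-separated segment of path and write the result to out.
 * Returns 0 on success, -1 if out is too small. */
static int trim_path_segments(const char *path, char *out, size_t outsize)
{
    size_t o = 0;

    if (outsize == 0)
        return -1;
    while (*path) {
        size_t seglen = strcspn(path, "/");
        size_t keep = trimmed_seglen(path, seglen);

        /* room for the kept characters, one '/', and the final NUL */
        if (o + keep + 2 > outsize)
            return -1;
        memcpy(out + o, path, keep);
        o += keep;
        path += seglen;
        if (*path == '/') {
            out[o++] = '/';
            ++path;
        }
    }
    out[o] = '\0';
    return 0;
}
```

Called with "/test./test./", trim_path_segments writes "/test/test/", which
matches the PATH_INFO value shown above; segments consisting only of "." or
".." are left untouched.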

As a workaround, I am now using REQUEST_URI and stripping/decoding it
appropriately, yet I still think this is a bug in Apache.
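
A workaround along these lines might look roughly like the following C sketch.
It is an illustration under stated assumptions, not the code actually used:
the helper name path_info_from_request_uri and its signature are hypothetical,
and the script prefix is passed in explicitly rather than read from
SCRIPT_NAME, because in the Windows dump in the description SCRIPT_NAME also
contains the extra path. The sketch drops the query string, removes the script
prefix, and percent-decodes what is left; it assumes the script prefix appears
unencoded and in the same case in REQUEST_URI.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of one hexadecimal digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = (char)tolower((unsigned char)c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/* Hypothetical helper: rebuild the extra path from REQUEST_URI.
 * request_uri is the raw request URI, script_name the URL prefix that
 * maps to the CGI program. The decoded extra path is written to out.
 * Returns 0 on success, -1 if the prefix does not match or out is too
 * small. */
static int path_info_from_request_uri(const char *request_uri,
                                      const char *script_name,
                                      char *out, size_t outsize)
{
    size_t uri_len = strcspn(request_uri, "?");   /* drop the query string */
    size_t script_len = strlen(script_name);
    size_t i, o = 0;

    if (outsize == 0)
        return -1;
    if (uri_len < script_len
        || strncmp(request_uri, script_name, script_len) != 0)
        return -1;

    for (i = script_len; i < uri_len; ++i) {
        char c = request_uri[i];

        /* decode %XX; a malformed escape is copied through unchanged */
        if (c == '%' && i + 2 < uri_len) {
            int hi = hex_value(request_uri[i + 1]);
            int lo = hex_value(request_uri[i + 2]);
            if (hi >= 0 && lo >= 0) {
                c = (char)(hi * 16 + lo);
                i += 2;
            }
        }
        if (o + 1 >= outsize)   /* keep room for the final NUL */
            return -1;
        out[o++] = c;
    }
    out[o] = '\0';
    return 0;
}
```

In a CGI program the first argument would come from getenv("REQUEST_URI").
Trailing dots survive because the string never passes through the file-path
merge.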

The problem is still present in Apache 2.0.52, but only on Win32, NetWare and
OS/2.


Kerry W. Lothrop

Comment 2 Joe Orton 2005-04-04 17:23:46 UTC
I believe that this behaviour is by design because Windows cannot differentiate
between filenames like "foo." and "foo"; marking WONTFIX unless some Win32er can
provide enlightenment.

Comment 3 Kerry W. Lothrop 2005-06-07 15:10:54 UTC
I still believe this to be a bug. I just stumbled across it again while trying 
to get my CGI application running under IIS. My workaround (using REQUEST_URI) 
doesn't work there since REQUEST_URI is not provided (it doesn't appear in the 
CGI specs either). However, PATH_INFO is set correctly there, even if a path 
element ends with a dot.

I believe Apache should not strip the dots off PATH_INFO or PATH_TRANSLATED 
since the CGI application expecting the variables has no other way to figure 
out the actual request.

Comment 4 Davi Arnaut 2007-06-09 07:17:10 UTC
The reporter might have a point here, because according to the CGI 1.1
specification PATH_INFO is the "extra path information, _as given by the
client_" (emphasis mine). Since "foo" and "foo." are roughly the same file on
Windows, Apache should have preserved the path information as given by the
client.

As for PATH_TRANSLATED, Apache doesn't need to preserve the path information,
because it is a server-provided translation (virtual-to-physical mapping) of
PATH_INFO.

Comment 5 William A. Rowe Jr. 2007-06-10 05:28:34 UTC
100% agreed with Davi's suggestion: PATH_TRANSLATED should clean up the
information into canonical form, whereas PATH_INFO should not.  I concur
with the reporter that the PATH_INFO behavior is a bug.

Comment 6 Davi Arnaut 2007-06-18 03:50:22 UTC
*** Bug 42686 has been marked as a duplicate of this bug. ***