Bug 14227 - Error handling script is not started (error 500) on non-UTF-8 URLs
Status: RESOLVED DUPLICATE of bug 13029
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: Core
Version: 2.0.43
Hardware: PC Windows XP
Importance: P3 normal
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
Reported: 2002-11-04 15:05 UTC by Wolf-Dietrich Moeller
Modified: 2007-12-22 13:07 UTC

Description Wolf-Dietrich Moeller 2002-11-04 15:05:36 UTC
If a URI contains bytes above 0x7F that do not form valid UTF-8 sequences,
a script configured through the ErrorDocument directive fails to start.
This happens if the browser sends e.g. Latin-1 characters in the URL either
1) without url-encoding (Netscape 4.7), or
2) url-encoded, but not translated to UTF-8 (Mozilla 5.0).

If the non-UTF-8 bytes occur in the query string part of the URI, the failure
seems to occur only without url-encoding (case 1).

This error prevents customized error handling in case of mis-spelled or
otherwise incorrect URIs.
 
Example URLs requested (this file does not exist on the server):
in Netscape 4.7: https://localhost/Information/Law/Äs/index.de.html
in Mozilla 5.0:  https://localhost/Information/Law/%C4s/index.de.html
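
The difference between the two encodings can be illustrated with a short Python sketch. It is not Apache's own check, only an approximation of the distinction described above; the helper name path_is_valid_utf8 is hypothetical and chosen for this example.

```python
from urllib.parse import quote, unquote_to_bytes


def path_is_valid_utf8(raw_path: str) -> bool:
    """Return True if the percent-decoded path is a valid UTF-8 byte sequence.

    Hypothetical helper for illustration; not taken from Apache httpd.
    """
    decoded = unquote_to_bytes(raw_path)
    try:
        decoded.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


path = "/Information/Law/Äs/index.de.html"

# Percent-encode the same path from Latin-1 bytes and from UTF-8 bytes.
latin1_path = quote(path, encoding="latin-1")  # contains %C4s
utf8_path = quote(path, encoding="utf-8")      # contains %C3%84s

print(latin1_path, path_is_valid_utf8(latin1_path))  # ... False
print(utf8_path, path_is_valid_utf8(utf8_path))      # ... True
```

The byte 0xC4 starts a two-byte UTF-8 sequence, but the following "s" (0x73) is not a continuation byte, so the Latin-1 form is rejected while the UTF-8 form decodes cleanly.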

----------------------------------------------
Configuration without ErrorDocument directive:

Error message (for both requests):
Forbidden
You don't have permission to access /Information/Law/Äs/index.de.html on this
server.
Apache/2.0.43 Server at localhost Port 443 

Entry in access.log:
Netscape 4.7: 127.0.0.1 - - [03/Nov/2002:10:43:52 +0100] SSLv3 RC4-MD5 "GET
/Information/Law/Äs/index.de.html HTTP/1.0" 403 358 "-" "Mozilla/4.78 [de]
(WinNT; U)"
Mozilla 5.0 : 127.0.0.1 - - [03/Nov/2002:10:28:02 +0100] TLSv1 RC4-MD5 "GET
/Information/Law/%C4s/index.de.html HTTP/1.1" 403 358 "-" "Mozilla/5.0 (Windows;
U; WinNT4.0; en-US; rv:1.2a) Gecko/20020910"

Entries in error.log:
_no_ entries

----------------------------------------------
Configuration with "ErrorDocument 403 /cgi/Invalid.cgi":

Error message (for both requests):
Forbidden
You don't have permission to access /Information/Law/Äs/index.de.html on this
server.
Additionally, a 500 Internal Server Error error was encountered while trying to
use an ErrorDocument to handle the request.
Apache/2.0.43 Server at localhost Port 443

Entry in access.log:
Netscape 4.7: 127.0.0.1 - - [03/Nov/2002:10:50:12 +0100] SSLv3 RC4-MD5 "GET
/Information/Law/Äs/index.de.html HTTP/1.0" 500 - "-" "Mozilla/4.78 [de] (WinNT; U)"
Mozilla 5.0:  127.0.0.1 - - [03/Nov/2002:10:30:30 +0100] TLSv1 RC4-MD5 "GET
/Information/Law/%C4s/index.de.html HTTP/1.1" 500 - "-" "Mozilla/5.0 (Windows;
U; WinNT4.0; en-US; rv:1.2a) Gecko/20020910"

Entries in error.log (same for both requests):
[Sun Nov 03 10:30:30 2002] [error] [client 127.0.0.1] (22)Invalid argument:
couldn't create child process: 22: Invalid.cgi
[Sun Nov 03 10:30:30 2002] [error] [client 127.0.0.1] (22)Invalid argument:
couldn't spawn child process: D:/Web/cgi/Invalid.cgi

-------------------------------------------------
Additional notes:
1) Bug was tested under WinNT4.0 and Windows XP, Apache/2.0.43 (WIN32)
mod_ssl/2.0.43 OpenSSL/0.9.6g.
2) Error 403 is generated instead of 404. Are all non-UTF-8 URLs forbidden in
Apache 2? (This is not stated in the documentation.)
3) This error 403 is not logged in the error log.
4) Internationalized error messages are switched off (if switched on, they work
fine in this case).
5) This is not bug 10573. In bug 10573 the cgi script was called directly, not
via ErrorDocument.
6) For UTF-8 characters the ErrorDocument and script work as expected, e.g. for
https://localhost/Information/Law/%C3%84s/index.de.html
7) Apache 1.3.27 works fine with a similar configuration and calls the error
handling script without problems (see log below).

Apache 1.3.27 configuration with "ErrorDocument 403 /cgi/Invalid.cgi" for
comparison:
Entry in access.log:
Netscape 4.7: 127.0.0.1 - - [03/Nov/2002:11:08:24 +0100] "GET
/Information/Law/\xc4s/index.de.html HTTP/1.0" 404 2325
Mozilla 5.0:  127.0.0.1 - - [03/Nov/2002:11:16:12 +0100] "GET
/Information/Law/%C4s/index.de.html HTTP/1.1" 404 2256
Entry in error.log:
Netscape 4.7: [Sun Nov 03 11:08:24 2002] [error] [client 127.0.0.1] File does
not exist: d:/web/information/law/Äs/index.de.html
Mozilla 5.0:  [Sun Nov 03 11:16:11 2002] [error] [client 127.0.0.1] File does
not exist: d:/web/information/law/Äs/index.de.html
Comment 1 William A. Rowe Jr. 2005-07-12 22:22:41 UTC
  Noting here to extend the 'accept raw bytes in environment table' fixes
  we already committed.  This is just another edge case of the same, which
  we can fix in a future release.
Comment 2 William A. Rowe Jr. 2007-12-22 13:07:47 UTC

*** This bug has been marked as a duplicate of 13029 ***