When running a long-running CGI under Apache2, the connection is closed abruptly rather than returning an error code.

Connection trace:

O> GET /cgi-bin/test.cgi HTTP/1.0
O>
Connection closed by foreign host.

The simple CGI was:

#!/bin/sh
/bin/date +"`hostname -s`[$$]: I was called at %D %T" >> /tmp/test.log
echo "Content-Type: text/plain"
echo ""
sleep 1000

Setting the 'Timeout' value in Apache down to 5 seconds or so (rather than the quite high default) makes this bug easier to debug.

Under Apache1.3 we see the behavior:

O> GET /cgi-bin/test.cgi HTTP/1.0
O>
I> HTTP/1.1 200 OK
I> Date: Mon, 23 Apr 2007 06:10:26 GMT
I> Server: Apache
I> Connection: close
I> Content-Type: text/plain
Connection closed by foreign host.

I suspect that this is similar to bug 35424, which relates to the same behaviour occurring under Apache 1.3.

I will note that I am surprised that Apache13 returns a '200' reply rather than a timeout, but that's a minor nitpick compared to 224 closing the connection.

If you require any further debug output let me know.
How is this a bug? Or in other words, how would you expect the server to behave when a script times out or crashes, having already sent the headers?
(In reply to comment #1)
> How is this a bug?
>
> Or in other words, how would you expect the server to behave when a script times
> out or crashes, having already sent the headers?

Well, Apache 1.3 behaviour was changed to return a proper HTTP reply (as per the other mentioned BugID #35424).

I'm finding that the applications I'm using, because they don't receive a proper HTTP reply, subsequently retry, and this is causing issues.

I guess the other part that irks me is that this was handled correctly under Apache 1.3, and Apache2.x is now not consistent with that sort of expected response.

From the HTTP/1.1 RFC (2616), it looks as though retrying when the connection is closed is proper behavior. I just don't believe that closing the connection under a timeout condition is the best response, particularly given that this is likely to cause a retry.

In either case, it's my belief that the server should respond with a '504 Gateway Timeout' under both Apache 1.3 and 2.x (though oddly enough Apache1.3 returns a '200 OK' with the previous bugID mentioned, which seems flawed to me).
Nick's point is that Apache sends the headers right away, including the status code. Otherwise it would need to queue up the entire response just to make sure that no error occurs along the way. This would obviously be unacceptable for most situations. So once your script has already sent headers, Apache has sent the (200) status code, and there is no way to later go back and say "that should have been 500".

On the other hand, I think Nick missed your point that 2.x is not sending the headers at all. I haven't confirmed your tests, but you are right that the 2.x behavior you describe is not optimal.
Created attachment 22067 [details]
a single line patch to return rv instead of OK

It looks like only this is needed to resolve this.

modules/generators/mod_cgi.c
------------------------------
     apr_file_close(script_err);
-    return OK;                      /* NOT r->status, even if it has changed. */
+    return rv;                      /* NOT r->status, even if it has changed. */
 }
 /*============================================================================

I verified using your example

|telnet localhost 8080
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET /cgi-bin/test.cgi HTTP/1.0

HTTP/1.1 500 Internal Server Error
Date: Tue, 03 Jun 2008 14:29:40 GMT
Server: Apache/2.3.0-dev (Unix)
Content-Length: 535
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>500 Internal Server Error</title>
</head><body>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error or
misconfiguration and was unable to complete
your request.</p>
<p>Please contact the server administrator,
you@example.com and inform them of the time the error occurred,
and anything you might have done that may have
caused the error.</p>
<p>More information about this error may be available
in the server error log.</p>
</body></html>

----------------------------------------------------
A normal cgi script with an error

|telnet localhost 8080
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET /cgi-bin/printenv HTTP/1.0

HTTP/1.1 500 Internal Server Error
Date: Tue, 03 Jun 2008 15:01:32 GMT
Server: Apache/2.3.0-dev (Unix)
Content-Length: 535
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>500 Internal Server Error</title>
</head><body>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error or
misconfiguration and was unable to complete
your request.</p>
<p>Please contact the server administrator,
you@example.com and inform them of the time the error occurred,
and anything you might have done that may have
caused the error.</p>
<p>More information about this error may be available
in the server error log.</p>
</body></html>

-------------------------------------
and a normal working cgi script

|telnet localhost 8080
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET /cgi-bin/printenv HTTP/1.0

HTTP/1.1 200 OK
Date: Tue, 03 Jun 2008 15:01:15 GMT
Server: Apache/2.3.0-dev (Unix)
Content-Length: 818
Connection: close
Content-Type: text/plain; charset=iso-8859-1

DOCUMENT_ROOT="/space/store/apache.26.May/install/htdocs"

Please do let me know if this is sufficient.
> |telnet localhost 8080
> Trying 127.0.0.1...
> Connected to localhost.
> Escape character is '^]'.
> GET /cgi-bin/test.cgi HTTP/1.0
>
> HTTP/1.1 500 Internal Server Error
> Date: Tue, 03 Jun 2008 14:29:40 GMT
> Server: Apache/2.3.0-dev (Unix)
> Content-Length: 535
> Connection: close
> Content-Type: text/html; charset=iso-8859-1
>
> <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
> <html><head>
> <title>500 Internal Server Error</title>
> </head><body>
> <h1>Internal Server Error</h1>
> <p>The server encountered an internal error or
> misconfiguration and was unable to complete
> your request.</p>
> <p>Please contact the server administrator,
> you@example.com and inform them of the time the error occurred,
> and anything you might have done that may have
> caused the error.</p>
> <p>More information about this error may be available
> in the server error log.</p>
> </body></html>
>
> A normal cgi script with an error
...

Not too sure what you mean here. The original CGI example I posted didn't have any sort of error in it. I don't see that a 500 error should really be returned at all. As I mentioned earlier, it's a timeout, so a 504 might be more appropriate. Having said that, *some* sort of response is better than merely closing the connection.

To confirm you've fixed this, does the CGI produce an HTTP '200 OK' if you set the sleep to lower than the Timeout value?

Jenna
Yes

!cat cgi-bin/test.cgi
#!/bin/sh
/bin/date +"`hostname -s`[$$]: I was called at %D %T" >> /tmp/test.log
echo "Content-Type: text/plain"
echo ""
sleep 100

|telnet localhost 8080
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET /cgi-bin/test.cgi HTTP/1.0

HTTP/1.1 200 OK
Date: Wed, 04 Jun 2008 07:19:06 GMT
Server: Apache/2.3.0-dev (Unix)
Content-Length: 0
Connection: close
Content-Type: text/plain

The value that is present in rv is returned by the call ap_get_os_error (which is the errornum from the OS). I do not know of any mapping that tells me which OS error corresponds to what for Apache. Unfortunately APR_STATUS_IS_TIMEUP, which I expected to recognise this value, does not.

An update: instead of returning rv, if it is not APR_SUCCESS, return an HTTP error code, as rv is not an HTTP error code. (Checking rv this way is OK because ap_pass_brigade is the one that sets rv to the above mentioned value and is the last call. All the others that set rv return immediately.)

--- modules/generators/mod_cgi.c    (revision 663009)
+++ modules/generators/mod_cgi.c    (working copy)
@@ -1021,6 +1021,7 @@
     }
     apr_file_close(script_err);
+    if (rv != APR_SUCCESS) return HTTP_INTERNAL_SERVER_ERROR;
     return OK;                      /* NOT r->status, even if it has changed. */
 }
The logic in the default handler for ap_pass_brigade failures should probably be duplicated here (or maybe factored out), though maybe returning 500 for any failure is OK for mod_cgi.
(In reply to comment #7)

Are you referring to core.c:default_handler? It seems to do this:

---------------------------------
    status = ap_pass_brigade(r->output_filters, bb);
    if (status == APR_SUCCESS
        || r->status != HTTP_OK
        || c->aborted) {
        return OK;
    }
    else {
        /* no way to know what type of error occurred */
        ap_log_rerror(APLOG_MARK, APLOG_DEBUG, status, r,
                      "default_handler: ap_pass_brigade returned %i",
                      status);
        return HTTP_INTERNAL_SERVER_ERROR;
    }
---------------------------------

Is this the same function? (i.e. do I need to check for r->status and c->aborted?)
In current versions, the CGI headers are not buffered, so this error doesn't arise. Tested with your script with both mod_cgi and mod_cgid.

In a variant where the sleep is moved to before emitting the header, it returns Error 500. As suggested in comments above, 504 would be better.

I'm marking this fixed, and committing a /trunk/ fix in r729586 to return 504 instead of 500 when a script times out having returned nothing at all.