Bug 22030

Summary: 4097+ bytes of stderr from cgi script causes script to hang
Product: Apache httpd-2 Reporter: Brandon Black <brandon>
Component: mod_cgi Assignee: Apache HTTPD Bugs Mailing List <bugs>
Status: RESOLVED FIXED    
Severity: normal CC: agifford, alec, alirette, am, apache-bugzilla-20040415, baustin_ns, bbb, bugzilla-apache, carl, ek, fma.linux, gstein, jeismeie, john.fauerbach, jon5, Lars.Hamann, mholzer, morissette, pete, redwoodtree, rmiller, wscott
Priority: P1    
Version: 2.0.49   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
URL: http://download.softwrap.com/EIDOS/

Description Brandon Black 2003-07-31 19:32:59 UTC
If a cgi script under mod_cgi outputs more than 4096 bytes of stderr before it 
finishes writing to and closing its stdout, the write() in the cgi script 
containing the 4097th byte of stderr will hang indefinitely, hanging the 
script's execution.

This appears to be caused by the fact that mod_cgi reads all stdout output 
first, and then begins reading stderr output.  APR's file_io which is handling 
the streams will only buffer 4096 characters before further writes by the 
script to stderr will hang, waiting for mod_cgi to read some of the data from 
the stream via APR file_io.
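
The finite buffer described above can be observed directly. The following is a minimal sketch in Python (not the Perl or C involved in this report) that measures how many bytes an unread pipe accepts before a further write would block. The function name pipe_capacity is made up for illustration, and the number it reports is the pipe capacity of whatever system it runs on, which on many current systems is larger than the 4096 bytes seen in this report.

```python
import os


def pipe_capacity(chunk_size: int = 1024) -> int:
    """Return how many bytes an unread pipe accepts before a write would block."""
    read_fd, write_fd = os.pipe()
    # Non-blocking mode turns "would hang" into BlockingIOError, so we can count safely.
    os.set_blocking(write_fd, False)
    total = 0
    try:
        while True:
            try:
                total += os.write(write_fd, b"X" * chunk_size)
            except BlockingIOError:
                # Nobody is reading: a blocking writer would hang at this point.
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

Calling pipe_capacity() and printing the result shows the threshold past which a script with a blocking stderr would stall if the reader never drains the pipe.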

This occurred for me where a perl cgi script was producing a large volume of 
harmless warning messages to ssl_error_log before it got to the part of its 
execution where it actually wrote the stdout output, and causing the script to 
hang and produce no output to the end user.  Below is a test script to 
demonstrate:

#!/usr/bin/perl
# 24x170 = 4080 bytes to stderr
foreach my $x (1..24) {
  print STDERR 'X' x 169 . "\n";
}
# + 17 more bytes, putting us at 4097
# Delete one char from the print below to make
# it work again
print STDERR "0123456789ABCDEF\n";
# Our actual script output, which never comes
print "Content-type: text/plain\n\nASDF\n";
Comment 1 Joshua Slive 2003-08-13 16:13:23 UTC
*** Bug 22318 has been marked as a duplicate of this bug. ***
Comment 2 Jeff Trawick 2003-09-03 13:46:38 UTC
*** Bug 22900 has been marked as a duplicate of this bug. ***
Comment 3 Rob Brown 2003-09-10 19:35:24 UTC
This DoS vulnerability has been tickin me off for two months now. 
The CGI is blocked on a write() to stderr trying so hard to shove the packet 
down Apache's throat and httpd is blocked waiting for something from the CGI's 
stdout, which will never happen until that stderr is consumed, which also 
never happens. 
My system gets hundreds of processes with httpd and the CGI script deadlocked 
with each other because of this issue.  I have to restart apache regularly to 
avoid grinding the server to a pulp from wasted processes or "Out of memory" 
errors. But mostly it just reaches MaxClients all the time which prevents new 
hits from being allowed (thus creating a DoS on my machine). 
I'm surprised mod_cgi was already known to be borked in this way and not 
repaired yet in the cvs source tree. 
Anyone with cvs write access to the httpd repository, I'm begging you to try 
to fix this. 
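
The deadlock described above can be reproduced outside Apache. This is a rough sketch in Python, assuming a Unix-like system: a parent that waits on the child's stdout only, while the child first writes a large amount to stderr. The names CHILD and stdout_first_stalls are invented for this illustration, and a timeout stands in for the indefinite hang so that the demonstration terminates.

```python
import select
import subprocess
import sys

# Child: write 1 MiB to stderr (more than a default pipe holds), then the real output.
CHILD = r'''
import sys
sys.stderr.write("X" * (1 << 20))
sys.stderr.flush()
sys.stdout.write("Content-type: text/plain\n\nASDF\n")
'''


def stdout_first_stalls(timeout: float = 2.0) -> bool:
    """Return True if stdout stays silent because the child is stuck writing stderr."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    try:
        # Wait on stdout only, as a reader that handles stdout before stderr would.
        ready, _, _ = select.select([proc.stdout], [], [], timeout)
        return not ready
    finally:
        proc.kill()
        proc.communicate()  # reap the child and release both pipes
```

If stdout_first_stalls() returns True, the child never reached its stdout write within the timeout, which is the same stall the CGI script and httpd fall into.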
 
I bricked over modules/generators/mod_cgi.c with Jeff Trawick's version: 
 
http://www.apache.org/~trawick/mod_cgi.c 
 
And suddenly all the problems vanished on my linux box.  Thank you Jeff! 
 
Is there any reason why this is not incorporated into the httpd trunk source 
tree?  Does it break non *NIX platforms?  If so, would it be appropriate to at 
least do something like the following: 
 
#ifdef LINUX 
(new version) 
#endif 
#ifndef LINUX 
(old version) 
#endif 
 
Rolling back to Apache 1.3.28 also eliminates all these problems, but I cannot 
keep running 1.3.x because I need to use the new version of mod_php which is 
not supported as well on the old apache. 
 
Comment 4 Nojan Moshiri 2003-09-10 19:40:57 UTC
Diffing the old mod_cgi against the new mod_cgi shows a number of changes.  It
would be great if we could get an idea from the developers of any pitfalls they
may see with going up with the new mod_cgi.  Is it safe to run on production??
Comment 5 Rob Brown 2003-09-10 20:28:42 UTC
Yes, the files seem quite different.  So many changes in fact that I got too 
bored (or lazy) to review everything.  I just used blind faith and replaced 
the whole file.  I'm not sure if anyone is using it on production, but it 
certainly works fine on my development system.  I am going to roll it out on 
my production system now.  It can't be any worse than the old one! 
Comment 6 Rob Brown 2003-09-10 20:31:49 UTC
FYI: I just figured out how to buttwag around this bug until it can be 
repaired.  Just force everyone to put this line at the top of all the perl 
scripts: 
 
use IO::Handle; STDERR->blocking(0); 
 
Everything after the first 4096 bytes to stderr will be dropped, but at least 
the server never falls into deadlock between the httpd and the CGI script.  
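
For scripts not written in Perl, the same trade-off can be sketched in Python, assuming a Unix-like system. The helper names make_nonblocking and write_or_drop are invented for this illustration. Once the descriptor is non-blocking, a write that finds the pipe full fails immediately, and the helper discards that data instead of hanging.

```python
import os


def make_nonblocking(fd: int = 2) -> None:
    """Put a descriptor (stderr by default) into non-blocking mode."""
    os.set_blocking(fd, False)


def write_or_drop(fd: int, data: bytes) -> int:
    """Write to a non-blocking descriptor; return bytes written, 0 if dropped."""
    try:
        return os.write(fd, data)
    except BlockingIOError:
        # The pipe is full and nobody is reading: lose the message, keep running.
        return 0
```

A script would call make_nonblocking() once at startup and route its diagnostics through write_or_drop(2, ...). As with the Perl one-liner, whatever does not fit in the pipe is lost.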
 
Comment 7 Nojan Moshiri 2003-09-10 20:35:43 UTC
That's a great idea, maybe I can put that in CGI.pm or something.  But I have over 2000 perl 
scripts!! :-)

Comment 8 Jeff Trawick 2003-09-10 21:17:06 UTC
problems with ~trawick/mod_cgi.c:

1) buffers up the response, which is really uncool and breaks with cgis that
need to flush or which write huge responses

the code to parse http headers written by the cgi needs to be changed to get rid
of the buffering

handle_script_stdout() needs to know when we've seen all the headers, then
process them, then set ctx->headers_processed

2) doesn't work on the ever-lame win32

groan

3) needs the last few fixes to mod_cgi integrated

4) doesn't help mod_cgid, which is needed by threaded MPMs

5) isn't tested a whole lot

but of course you folks are helping with that

--/--

The main problem to attack is #1...  with that solved, everything else is not so
hard, other than Win32, which doesn't have to be solved.  I'll try to attack #1
now that I see some interest in it.  Alternately, somebody else play with it in
a debugger and see what I mean about needing to recognize when we've read the
entire response header from the CGI and can get into the simple mode where we
pass all output down the filter chain as soon as we read it.
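
To illustrate the "recognize the end of the headers, then switch to simple pass-through" idea, here is a small sketch in Python. It is not the mod_cgi code: the class HeaderSplitter and its feed() method are hypothetical, and only the headers_processed flag name is borrowed from the comment above. Output is held back only until the blank line that ends the CGI headers has been seen. After that, every chunk is returned unchanged as soon as it arrives.

```python
import re

# A blank line ends the CGI header block; accept both LF and CRLF line endings.
_END_OF_HEADERS = re.compile(rb"\r?\n\r?\n")


class HeaderSplitter:
    """Buffer script output until the headers are complete, then pass data through."""

    def __init__(self) -> None:
        self._buf = b""
        self.headers = None
        self.headers_processed = False

    def feed(self, chunk: bytes) -> bytes:
        """Take one chunk of script stdout; return the body bytes ready to forward."""
        if self.headers_processed:
            return chunk  # simple mode: no buffering at all
        self._buf += chunk
        match = _END_OF_HEADERS.search(self._buf)
        if match is None:
            return b""  # headers still incomplete, keep waiting
        self.headers = self._buf[:match.start()]
        body = self._buf[match.end():]
        self._buf = b""
        self.headers_processed = True
        return body
```

Feeding b"Content-type: text/plain\n" returns nothing, feeding b"\nAS" returns b"AS", and every later chunk comes straight back, so a script that flushes or writes a huge response is never held up after its headers.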
Comment 9 Jeff Trawick 2003-09-22 11:21:57 UTC
*** Bug 10515 has been marked as a duplicate of this bug. ***
Comment 10 Jeff Trawick 2003-09-29 15:05:46 UTC
*** Bug 23473 has been marked as a duplicate of this bug. ***
Comment 11 Nojan Moshiri 2003-09-30 15:49:26 UTC
This bug was issued as an Apache DOS vulnerability in a Symantec Security release yesterday. They 
cite going with the latest CVS release as a workaround.  

Jeff can you provide some guidance on whether  http://www.apache.org/~trawick/mod_cgi.c or 
the latest CVS rev will be the most stable release.  I notice several diffs between the CVS version 
and the one in your home dir.

Thanks!
Comment 12 Jeff Trawick 2003-09-30 16:37:28 UTC
There is no fix in CVS for this problem.  There is no stable mod_cgi[d] that
handles 4097+ bytes from stderr mixed in with stdout processing.  I don't
recommend using any of the code in http://www.apache.org/~trawick/ in a
production environment.

I just uploaded jcgi.tar to www.apache.org/~trawick/.  Module was renamed to
mod_jcgi so that hopefully it doesn't get confused with real code from CVS.
This has fewer big picture problems than the mod_cgi.c hacks I had before, and
of course anyone is free to play with it and comment.  See included STATUS file
for some notes.

For production users: if your CGI spews gobs of stuff to stderr, change the CGI
for now.  For folks debugging CGIs and want to have them temporarily spew gobs
of stuff to stderr, play with the hacked up version mentioned here and send me
testcases for stuff that doesn't work.

As always, anybody should feel free to make alternate changes to the real
mod_cgi[d] and submit patches to dev@httpd.apache.org.
Comment 13 Greg Stein 2003-10-09 07:12:43 UTC
I raised this bug a long while back (Sep 25, 2002, actually:
http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=103291952019514&w=2) and
suggested a new "CGI bucket" type that kept both stdout and stderr descriptors
from the CGI process. When the bucket read() function is called, it would
select() across both descriptors. Content from stdout would spawn a new bucket,
and content from stderr would be logged.

Then wrowe went off with a crazy super-solution which caused a total loss of
focus on the practical problems.

My suggestion still stands: have mod_cgi(d) inject a new CGI_BUCKET into the
filter stack which can drain both streams. No more hangs. Ever. No buffering.
Works for both cgi implementations. Works on Windows (presumably, since we're
using standard apr functions to poll across the two descriptors).
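
The core of that suggestion, polling both descriptors and reading whichever has data, can be sketched in Python. This is only an illustration of the technique, not the bucket API: run_cgi and DEMO_CGI are names invented here, and the sketch assumes a Unix-like system where pipes can be polled. It collects stderr in a list where the real module would write to the error log.

```python
import os
import selectors
import subprocess
import sys

# Stand-in for a noisy CGI: 200000 bytes of stderr before any stdout.
DEMO_CGI = [
    sys.executable, "-c",
    "import sys\n"
    "sys.stderr.write('X' * 200000)\n"
    "sys.stderr.flush()\n"
    "sys.stdout.write('Content-type: text/plain\\n\\nASDF\\n')\n",
]


def run_cgi(argv):
    """Run a script, draining stdout and stderr together so neither pipe fills up."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = [], []
    open_streams = 2
    with selectors.DefaultSelector() as sel:
        sel.register(proc.stdout, selectors.EVENT_READ, out)
        sel.register(proc.stderr, selectors.EVENT_READ, err)
        while open_streams:
            for key, _ in sel.select():
                data = os.read(key.fileobj.fileno(), 4096)
                if data:
                    key.data.append(data)  # stdout: forward; stderr: log
                else:
                    sel.unregister(key.fileobj)  # EOF on this stream
                    open_streams -= 1
    proc.stdout.close()
    proc.stderr.close()
    proc.wait()
    return b"".join(out), b"".join(err)
```

run_cgi(DEMO_CGI) returns the complete response and all of the stderr text, because the parent never commits to reading one stream while the other is full.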
Comment 14 Jeff Trawick 2003-10-09 10:31:42 UTC
mod_cgid does not have this particular hang problem because the script's
stderr refers directly to the error log.

Note that mod_cgid has some other issues with error log, but they are of lesser
significance.  The two I can think of are:

+ writing to syslog doesn't work
+ the main error log is always used, instead of the vhost-specific error log

(there are entries in this bug database already for these issues)
Comment 15 Jeff Trawick 2003-10-09 10:52:35 UTC
Regarding Greg's comments about a special CGI bucket type being produced by mod_cgi:

There is another issue to solve with mod_cgi[d] that exists in 1.3 as well:
hangs will occur if all body data isn't read first, before the script starts
producing output.  Clearly this isn't something that many scripts have
encountered, but solving this enables some interesting CGI behavior.

My own work on this problem has been to handle all three channels (script's
stdin -- request body, stdout, and stderr) right in mod_cgi.  Sending a special
CGI bucket down the filter chain to solve the stdout/stderr problem doesn't deal
with writing request body to the script as the script can handle it.  With the
I/O handled directly in mod_cgi, an extra channel doesn't need a different model.
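
The three-channel case can be illustrated in Python, where subprocess.Popen.communicate() already services stdin, stdout, and stderr together. This is a sketch of the behaviour, not of the mod_cgi design. The names run_with_body and DEMO_SCRIPT are invented, and the demo script deliberately produces a large response before it reads any of its request body.

```python
import subprocess
import sys

# Stand-in script: emits a large response first, only then reads the request body.
DEMO_SCRIPT = [
    sys.executable, "-c",
    "import sys\n"
    "sys.stdout.write('Y' * 300000)\n"
    "sys.stdout.flush()\n"
    "body = sys.stdin.buffer.read()\n"
    "sys.stderr.write(str(len(body)))\n",
]


def run_with_body(argv, body: bytes):
    """Feed a request body to a script while draining its stdout and stderr."""
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # communicate() writes stdin and reads both outputs concurrently, so a
    # script that answers before consuming its input does not wedge either side.
    out, err = proc.communicate(input=body)
    return proc.returncode, out, err
```

A parent that insisted on writing the whole body before reading anything would hang on this script as soon as both pipes filled.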

An unfortunate problem to solve regardless of where stderr is read is that APR
doesn't support polling on pipes on Win32.  In the long term hopefully some
Win32 gurus will provide a workable solution, but in the short term special
handling is required.  (See APR_FILES_AS_SOCKETS.)
Comment 16 Carl Brewer 2004-01-05 02:29:41 UTC
For what it's worth, this is proving to be a real problem for us attempting to
migrate to apache 2.0 from 1.3 on UNIX (linux).  Is anyone actively looking at
this or has it fallen off the radar?
Comment 17 Jeff Trawick 2004-01-05 21:42:25 UTC
Carl, you can try http://www.apache.org/~trawick/jcgi.tar
Comment 18 Carl Brewer 2004-01-05 22:15:22 UTC
Thanks Jeff, is this likely to make it into 2.0.49?  I'm pretty keen on sticking
to production releases on our production servers :)
Comment 19 Jeff Trawick 2004-01-07 12:39:57 UTC
definitely not going to make it to 2.0.49
Comment 20 Carl Brewer 2004-01-07 23:09:58 UTC
*nod*.  This ticket's been open for some time now (some 4 months), do you know
if/when it may be fixed in the release?  2.0.50? :)
Comment 21 Alec Edworthy 2004-03-24 14:04:29 UTC
Is there any news at all on when this bug might get fixed please? Thanks.
Comment 22 Jeff Trawick 2004-03-24 14:15:21 UTC
Scroll up from the bottom of the PR to find an alternate mod_cgi which has a
redesigned interaction with the script.  Little or no feedback on that so far.
Comment 23 Alec Edworthy 2004-03-25 10:19:32 UTC
Thanks Jeff. I gave jcgi a very quick spin a couple of months ago but didn't
manage to make it work (although I didn't try that hard at the time). I will try
again sometime soon and see if I have any more luck.

How urgent is fixing this bug viewed as by those who are actively working on
Apache? Obviously to me it seems pretty important because it breaks all my
scripts (although I'm sure that it could be argued that my scripts are at fault
for sending so much to stderr) but I don't really have that much knowledge of
the internals of Apache and what other issues are outstanding against it at the
moment. Are we likely to see a proper fix for this included in a production
release in the foreseeable future or will work arounds within scripts and fixes
like Jeff's be the norm for now?
Comment 24 Wayne Scott 2004-03-25 13:05:42 UTC
I have to agree with Jeff.   Having the server hang with no explanation
when your error output reaches some magic threshold is hopelessly broken.
It is the type of problem that won't show up in testing, but will break after
deployment.   This is a "I can't trust Apache 2.0" problem.
Comment 25 Jeff Trawick 2004-03-25 13:26:12 UTC
>How urgent is fixing this bug viewed as by those who are actively working on
>Apache?

Empirical evidence would suggest that it is not very important.

>Are we likely to see a proper fix for this included in a production
>release in the foreseeable future or will work arounds within scripts
>and fixes like Jeff's be the norm for now?

I have no idea about the first question.

The answer to the second question is, in general, no.  This particular situation
is one which requires a complete redesign of how mod_cgi interacts with scripts.
 I have made a set of code available which for Unix has a design that should
solve this problem, it works for my testcases, etc.

Another unusual example: 2.0.49 provided an overhaul of mod_include with
completely new parsing engine and a number of existing problems resolved.  For
quite a while, people with 2.0.x  mod_include problems were asked to try this
alternate implementation.  After a relatively long time it was merged into 2.0.x
for the 2.0.49 release.

If somebody has time/energy to move the ball forward they can offer their own
solution or try out what I have and offer feedback.

If somebody does not have time/energy to help move the ball forward they can
always buy commercial support for Apache or an Apache-based server and complain
to the vendor that it does not meet their requirements.

Or modify scripts to redirect stderr or not output so much stuff to stderr.
Comment 26 Dave Evans 2004-04-15 06:53:25 UTC
My lame workaround has been to start all my CGIs by re-opening STDERR to a plain
file: open(STDERR, ">>/tmp/error.log").  Yuck.
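
The same workaround for a non-Perl CGI, sketched in Python: point the stderr descriptor at a plain file so it is no longer a pipe that can fill up. The helper name reopen_fd_to_file is invented for this illustration, and the log path is whatever the script author chooses.

```python
import os


def reopen_fd_to_file(fd: int, path: str) -> None:
    """Redirect an open descriptor (e.g. 2 for stderr) to append to a plain file."""
    log_fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    if log_fd == fd:
        return  # fd was closed and the new file already took its number
    try:
        os.dup2(log_fd, fd)  # fd now refers to the file instead of the pipe
    finally:
        os.close(log_fd)
```

Calling reopen_fd_to_file(2, "/tmp/error.log") at the top of a script has the same effect as the Perl open() above: warnings go to the file and never touch the pipe to httpd.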

Without that hack, this bug is a show-stopper for me too - there's no way I
could deploy httpd2 on a system with CGIs I don't 100% trust (e.g. the shared
webserver we virtualhost all our customer's webs on).
Comment 27 Joe Orton 2004-04-15 08:40:55 UTC
How about taking the simpler "CGI bucket" approach for a lower risk change to
incorporate into 2.0 than the fundamental rewrite:

- fix just the regression since 1.3 (not the issue of handling stdin too)
- simple #if APR_FILES_AS_SOCKETS to avoid breaking Win32

I have a patch to implement this based largely on mod_jcgi and the existing
apr_buckets_pipe.c.
Comment 28 Joe Orton 2004-04-15 09:57:46 UTC
Implementation of CGI bucket:

diff against HEAD: http://www.apache.org/~jorton/mod_cgi-HEAD.diff
drop-in replacement for 2.0 mod_cgi.c: http://www.apache.org/~jorton/mod_cgi.c

one known issue: fail gracefully if script closes both stderr and stdout

Further testing welcome.
Comment 29 Jeff Trawick 2004-04-15 11:16:06 UTC
Any reason not to commit to HEAD and get more eyes on it?

(I'll try to do some detailed testing in next 36hr either way.)
Comment 30 Joe Orton 2004-04-15 12:54:34 UTC
OK can do, will resolve that last issue first though.
Comment 31 Rob Brown 2004-04-16 03:05:14 UTC
Joe, you're a total genius!  I patched my httpd.spec file as follows: 
 
---- snip ---- 
=================================================================== 
--- httpd.spec  18 Nov 2003 00:52:34 -0000      1.16 
+++ httpd.spec  16 Apr 2004 02:27:23 -0000 
@@ -33,6 +33,8 @@ 
 Source31: migration.css 
 Source32: html.xsl 
 Source33: README.confd 
+# Add Joe Orton's awesome CGI Bucket feature so large STDERR output won't choke anymore! 
+Patch0: http://www.apache.org/~jorton/mod_cgi-HEAD.diff 
 # build/scripts patches 
 Patch1: httpd-2.0.40-apctl.patch 
 Patch2: httpd-2.0.36-apxs.patch 
@@ -128,6 +130,9 @@ 
 fi 
  
 %build 
+ 
+patch modules/generators/mod_cgi.c < $RPM_SOURCE_DIR/mod_cgi-HEAD.diff 
+ 
 # update location of migration guide in apachectl 
 %{__perl} -pi -e "s:\@docdir\@:%{_docdir}/%{name}-%{version}:g" \ 
        support/apachectl.in 
---- snap ---- 
 
And then I rebuilt the package and upgraded the rpm.  (I couldn't use the 
standard rpm "%patch" because I think Joe forgot to include the 
"httpd-2.0.49/modules/generators/" prefix in the diff headers in his patch 
file.)  After restarting, all my problems immediately disappeared.  I'm 
putting this on my PRODUCTION servers right now.  (I never close STDERR in any 
of my CGIs anyway.) 
 
Thank you! 
 
Comment 32 Alec Edworthy 2004-04-20 11:02:25 UTC
For what it's worth the patch seems to work fine for me. My CGI scripts now
generate the error_log text I would expect and the output appears in the browser
as expected with no delays. Will this fix (or a patch based upon it) get worked
into a proper release sometime in the future?

Thanks Joe!

Alec
Comment 33 Joe Orton 2004-04-21 09:28:39 UTC
Thanks for testing it out.  This will go into a future 2.0 release only if
enough developers have confidence it is suitable for a 2.0 release: the more
reports of successful testing here the more confidence will be inspired.
Comment 34 Joe Orton 2004-05-05 19:40:26 UTC
The fix for this is now committed to HEAD, but needs more testers. 
http://www.apache.org/~jorton/ has:

- mod_cgi.c - a drop-in replacement for the 2.0.49 mod_cgi.c
- mod_cgi-2.0.diff - a diff against the 2.0 mod_cgi.c

Please post any additional results from testing here.
Comment 35 Nic Doye 2004-05-05 22:07:39 UTC
The trivial testing I have done so far on RHEL ES 3 with Interchange 5 (which is
CGI intensive) shows that this patch works perfectly.
Comment 36 Jeff Trawick 2004-05-06 19:20:33 UTC
*** Bug 28816 has been marked as a duplicate of this bug. ***
Comment 37 Carl Brewer 2004-05-27 00:57:07 UTC
This has been marked as closed, but is there any news on which release of httpd2
that the fix will land in?
Comment 38 Joe Orton 2004-05-27 09:11:46 UTC
It requires one more developer vote for inclusion in a future 2.0 release.  The
more people who test it, the better: there are 14 people on the CC list for this
bug but only 3 have taken the time to test out the patches so far.
Comment 39 Joe Orton 2004-06-03 22:34:02 UTC
*** Bug 28025 has been marked as a duplicate of this bug. ***
Comment 40 André Malo 2004-06-11 22:58:08 UTC
*** Bug 29533 has been marked as a duplicate of this bug. ***
Comment 42 Rob Brown 2004-06-16 23:11:31 UTC
Thank you Joe! 
I've been needing this fix for a long time. 
 
-- Rob 
Comment 43 Joe Orton 2004-07-12 15:19:40 UTC
*** Bug 28656 has been marked as a duplicate of this bug. ***
Comment 44 Joe Orton 2004-07-13 19:34:56 UTC
*** Bug 20866 has been marked as a duplicate of this bug. ***
Comment 45 Joe Orton 2004-08-24 20:41:22 UTC
*** Bug 23528 has been marked as a duplicate of this bug. ***
Comment 46 Joe Orton 2004-10-01 18:11:56 UTC
*** Bug 19315 has been marked as a duplicate of this bug. ***
Comment 47 David Trusty 2004-10-14 02:19:58 UTC
I just tried version 2.0.52 and this problem persists.  I am running 
Redhat 9.

Comment 48 Joe Orton 2004-10-14 06:24:03 UTC
David, please open a new bug describing the problems you have with 2.0.52,
include a reproduction case if possible.  The bug covered here was fixed in 2.0.50.
Comment 49 dougapache 2006-04-20 16:54:59 UTC
I have entered a new bug, 39342, that I believe is related to this. In that
case, mod_cgi is writing a large amount of data to stdout before attempting to
read from stdin, which contains a large POST.