This was very weird. Absolutely no scripts or files changed on my website in the last 4 weeks. Apache 2.0.45 was installed on April 23 (two months ago). As of 1.5 weeks ago, my logged-in users were being delivered the page successfully, but the script would hang at the end for exactly 5 minutes and then finish with this error written to the error log:

(70007)The timeout specified has expired: ap_content_length_filter: apr_bucket_read() failed

Basically the page would be delivered normally, but the web browser would appear to continue loading the page even though it was already fully delivered. After 5 minutes of this, the webserver would finally say "done" and the browser would stop.

Because my database data changes very frequently, I first thought the data was referencing a broken image or something else that would cause the browser to hang. Nope. Then I thought perhaps my perl CGI script was broken, even though it hadn't changed. But if a perl script actually pauses for minutes, I would expect ps to show it still running, and it was not - ps showed no perl scripts running while the browser was waiting.

So I focused on Apache. I read somewhere on bugzilla that apr_bucket_read fails when a script does not finish within the KeepAliveTimeout value. But I *know* my perl script never takes more than 3 seconds to run. Something in Apache was timing out after 5 minutes, because the access log shows every access to the script took 303 seconds (3 for it to run, plus 300 seconds for a timeout somewhere). But why was Apache hanging on to the script for 5 minutes??

I read in a google search that one person had a problem with Apache2 and non-buffered output. So I made just TWO changes to my entire perl script: I removed the $|++ statement (which had enabled auto-flushed, unbuffered output; removing it restores buffered output), and I removed a line that simply warns me of an error: warn "weird"; Now the problem is gone!
Hmm, I never noticed the warn output in any of my apache error logs, but this perl script was being called from within another perl script. I wonder if a warn message (or a plain STDERR message) from a perl script called within another perl script could cause this problem?

I have an example of this problem here. A page that does not have the problem: http://www.hotdlz.com/apachebug1.pl A page that DOES have the problem: http://www.hotdlz.com/apachebug/ Both scripts do a require "/path/other-script.pl" inside. The only difference between the two scripts above is that they call a different /path/other-script.pl, and the only differences between the two /path/other-script.pl's are the following: $|++; is at the top, and warn "something weird happened"; is located within a foreach loop over the data, triggered if something weird happens. The broken script has these two lines, while the fixed script has them taken out.

I hope this information helps in identifying the bug, or at least points someone else in the right direction to fixing the problem on their site.
By the way, for security reasons, the URLs shown in this bug report will cease to function in 3 days (on or after Saturday, June 21).
If the perl script is not running but Apache is still hanging on a pipe read that is supposedly output from the script, maybe another process inherited the descriptor and has not closed it? Can you try lsof to see if another process has a descriptor to that same pipe? (Yeah, this is probably an impractical thing to test on your real site :( )

A common problem is a script that writes a bunch to stderr and ends up hanging on the write because Apache hasn't started reading that pipe yet. (This may depend on the ScriptLog setting... I can't recall at the moment.) Does the script write a bunch to stderr? This contradicts your statement that ps doesn't show the script still running, but it is a fairly common problem so I have to ask.

If you can figure out a small-ish script to allow us to duplicate the problem, we're in business.
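The hang described above can be demonstrated outside Apache. The following is a minimal Python sketch (an illustration, not taken from this bug report): the parent process plays Apache's role and the child plays the CGI script. The child writes far more to stderr than a kernel pipe buffer holds, while the parent deliberately does not read the pipe at first, so the child blocks in write() and remains visible in ps, just like the hanging CGI script.

```python
import subprocess
import sys
import time

# The child writes ~1 MiB to its stderr, which is a pipe to the parent.
child_code = (
    "import sys\n"
    "for _ in range(1024):\n"
    "    sys.stderr.write('x' * 1024)\n"
    "    sys.stderr.flush()\n"
)
child = subprocess.Popen([sys.executable, "-c", child_code],
                         stderr=subprocess.PIPE)

time.sleep(2)
# The kernel pipe buffer (typically 64 KiB on Linux) filled long ago, so
# the child is now blocked in write() -- still running, like the CGI.
print("child still running:", child.poll() is None)

child.stderr.read()   # drain the pipe; the blocked write can now complete
child.wait()
print("child finished with status", child.returncode)
```

Once the parent starts reading (as Apache eventually does when it logs the stderr output), the child unblocks and exits normally.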
I think this bug is related to bug 18348 and bug 20111.
I left the two URLs mentioned above online, but I made a few changes to them. This is interesting: when I adjust the mysql SQL query to return only 5 or 10 rows, the bug goes away for the "bug" script. But when I adjust it to return 71 rows, the bug is back.

Basically, I adjusted my "bugged" script, http://www.hotdlz.com/apachebug/ - again, that is a script that does a perl "require" of another script that actually does the SQL and HTML output. That require'd script is now set to return 71 rows from the mysql query, the foreach loop has a 'warn "foo"' inside each iteration, and the bugged script has $|=1 at the top too. The other script, http://www.hotdlz.com/apachebug1.pl is identical except that its require'd perl script has no $|=1 and no warn "foo" in the foreach. Both return 71 rows. So the problem script ends up writing "foo" to STDERR over 70 times while the non-problem script writes nothing. I wonder if that has something to do with the problem? Note that I disabled the output of the SQL queries, so if you go to those URLs you won't see much, but you will see apache hang for 5 minutes on www.hotdlz.com/apachebug/

Now for some comments:

> Can you try lsof to see if another process has a
> descriptor to that same pipe?

I'm not sure how to do this, but I'll try. I ran http://www.hotdlz.com/apachebug/ let it do its pause, then ran this command as root:

lsof | grep IPv4 | grep 'my.ip.address'

and I found one line:

httpd 19860 nobody 37u IPv4 59112398 TCP server.ip.address:http->my.client.ip.address:61975 (ESTABLISHED)

In fact that lsof line stays there for the entire 5 minutes. The moment the 5-minute timeout occurs, Apache writes that ap_content_length_filter: apr_bucket_read() failed line to the error log, and the lsof line disappears.
> A common problem is a script that writes a bunch
> to stderr and ends up hanging on the write because
> Apache hasn't started reading that pipe yet.
> (This may depend on the ScriptLog setting...
> can't recall at the moment.)

I have no ScriptLog setting in my httpd.conf.

> Does the script write a bunch to stderr?

Yes, as I mentioned above, the problem script writes a bunch of non-buffered ($|=1) output to STDERR.

> This contradicts your statement that ps doesn't
> show the script still running

Actually, I believe I was mistaken: ps does indeed show the perl script still running. Obviously the first question would be whether the script in fact takes 5 minutes to run, but no - if I run it on the command line with the same environment variables as Apache, it runs and terminates normally in about 2 seconds.

> If you can figure out a small-ish script to
> allow us to duplicate the problem, we're in business.

I tried to make a small script that shows this problem, but I had no luck.
Yeah, I think the problem has to do with at least 70 or 80 lines of output to STDERR. I was able to make a small script that shows the problem: http://www.hotdlz.com/bug/ Here is the source code; there are two files, index.cgi and bug.pl:

----- index.cgi -----
#!/usr/bin/perl
print "Content-type: text/html\n";
print "Expires: Sun, 22 Jun 2003 20:27:21 GMT\n";
print "Last-Modified: Sun, 22 Jun 2003 20:27:21 GMT\n";
print "Set-Cookie: user=0\n";
print "\n";
if (1) {
    require "/my/www/path/bug/bug.pl";
}
----- bug.pl -----
#!/usr/bin/perl
$|++;  # auto-flushed (non-buffered) output; the bug still exists if you comment this line out
print "<HTML><BODY>\n";
for ($i = 1; $i <= 100; $i++) {
    warn "foo";
    print "$i<BR>";
}
----- END OF SCRIPTS -----

If you adjust the $i <= 100 to something smaller, like 50, there is no 5-minute pause. When I run these scripts through the URL, I do see them in ps and lsof for 5 minutes until apache times them out with the apr_bucket_read error in its error log. I hope this helps.
More information... With $i <= 50 I see exactly 50 "foo" messages in my apache error log. With $i <= 100, I noticed that there are exactly 200 "foo" messages in my apache error log, not the expected 100. Very weird. No error messages are written to the log during the 5-minute pause, but after the 5 minutes are up, there are double the expected messages in my apache error log. I also noticed that once the 5 minutes are up, there is no "ap_content_length_filter: apr_bucket_read() failed" right away; the message appears at odd times - sometimes a minute after I kick off the URL, sometimes a minute after the 5-minute pause is completed.

Why perl's output to STDERR is important to me: when I program I like to have debug output available, sometimes a lot of it, especially when I am trying to solve a problem with my code. Having a lot of STDERR messages written to apache's log is very important to me.
I have developed a web-based tool which uses an Oracle database and Perl DBI. For the last few days, one of the pages, where I query the database and print out the page, hangs. It keeps saying "getting data" but just times out. The same script runs fine on the development server (even when I point it at the live database from the development server). The only difference between the two servers is that the development server is running Apache 1.2.37 and the production server is running Apache 2. In the log file I am getting the apr_bucket_read() failed error.

I am using the following CGI modules:

use CGI qw(:standard);
use DBI;

my $in = new CGI;
print $in->header();

and then print the html using

print qq~ < html stuff > ~;

I have gone through the posts here and want to say: if something works fine on a lower version, then by normal convention it should work in the latest upgrade as well.
Can the original reporter verify that the script works fine when the stderr output is omitted? Rabi, is this the situation with your problem as well? A hang when the script writes enough data to stderr to fill the kernel pipe buffer is a known bug that is slowly being addressed.
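To make the "fill the kernel pipe buffer" diagnosis concrete, here is a small illustrative Python probe (not part of this bug report): it measures how many bytes a pipe accepts before a write would block. Once a CGI process has written this much unread data to its stderr pipe, its next write blocks, which is the hang described above.

```python
import os

r, w = os.pipe()
os.set_blocking(w, False)   # make a full buffer raise instead of blocking

total = 0
chunk = b"x" * 1024
try:
    while True:
        total += os.write(w, chunk)
except BlockingIOError:
    pass  # the kernel pipe buffer is now full

print("pipe held", total, "bytes before a write would block")
os.close(r)
os.close(w)
```

On a typical Linux system this prints 65536 bytes (64 KiB), which matches the observation that roughly 70-plus warn lines are needed before the script hangs.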
I got this problem when transferring a CGI website from a unix-based OS (Alpha processor) to a linux-based OS (dual Xeon processors).
I had the same problem going from 1.3.26 to 2.0.47. I have a perl CGI script that talks to MySQL - queries that returned a lot of data started hanging after moving from Apache 1.x to 2.x. I changed the shebang line at the beginning of the script from "perl -w" to "perl -X" and all problems ceased (-w enables warnings, which go to stderr; -X disables all warnings).
This bug is probably the same as bug #23528. I can confirm that the bug happens for CGI programs that do not write much to stderr and do not fork anything. The bug may be related to the size of the request. The bug is only present in Apache 2.x and has been there for many months.
Can those who can reproduce this bug please try 2.0.50?
Works for me with httpd 2.0.50 on FreeBSD 4.7. I cannot try on Solaris, but that's big progress already! Thank you.
Let's presume that the issues here are all problems related to stderr handling in mod_cgi. Please reopen if anyone can reproduce problems using 2.0.50. *** This bug has been marked as a duplicate of 22030 ***