Bug 56110 - mod_proxy_fcgi bogus "AH01068: Got bogus version 1" message
Summary: mod_proxy_fcgi bogus "AH01068: Got bogus version 1" message
Status: RESOLVED FIXED
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_proxy_fcgi (show other bugs)
Version: 2.4.7
Hardware: PC Linux
: P2 normal (vote)
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-02-05 09:19 UTC by Vedran Rodic
Modified: 2020-03-30 06:18 UTC (History)
1 user (show)



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Vedran Rodic 2014-02-05 09:19:59 UTC
We are using Apache 2.4.7 with mod_proxy_fcgi with PHP-FPM backend, using ProxyPassMatch directive in VirtualHost context. 

Under normal usage everything seems fine, but
we are sometimes getting a strange "AH01068: Got bogus version 1" error message followed by a AH01075: Error dispatching request to : error message.

I did some experimenting yesterday and I managed to reproduce it accessing a 300kb file through a PHP-FPM backend from a server that is about 150 ms away from us. 
To get this error_log message I had to cancel the running http request. I did this by quickly pressing F5 in Google Chrome, especially with unfinished loads (Network tab with 'Preserve Log upon navigation' in Chrome Devloper Tools would show 'cancelled' in the 'status' column of the unfinished requests). 

I'm not sure how to handle this situation, but having a scary looking error message in the error_log file is not very nice. 

Additionally and possibly unrelated, I took a look in the mod_proxy_fcgi.c and found that the way this error is reported doesn't make much sense to me:


            if (version != AP_FCGI_VERSION_1) {
                ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r, APLOGNO(01068)
                              "Got bogus version %d", (int) header.version);
                rv = APR_EINVAL;
                break;
            }

AP_FCGI_VERSION_1 value is defined to be 1, but the error message is "AH01068: Got bogus version 1", so this should not be an error, *but* the code is actually referring to the 'header.version' field that apparently has this value. 

Now, either this test is wrong, or the code should use the version variable instead of header.version when reporting the error.
Comment 1 Jeff Trawick 2014-02-05 13:03:56 UTC
The wrong value was logged, so we don't know that the bad number in your situation.  The logging problem is fixed in r1564756.

Since the operational problem (wrong value for protocol version) is only reproducible in an error path, there's probably nothing to do at this time but fix the error message.

You can make this small change to your source and rebuild if you want to see the actual incorrect version value:

http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/proxy/mod_proxy_fcgi.c?r1=1564756&r2=1564755&pathrev=1564756

I'll propose that for inclusion in the 2.4.x branch.
Comment 2 Vedran Rodic 2014-02-05 17:12:19 UTC
The patched version reports various values like 146, 23, 33, 244, 105 on similar requests, but with different file sizes (always more than 100kb pumped from php-fpm). 

So I guess the aborted/cancelled requests make the mod_proxy_fcgi access memory that shouldn't even contain the fcgi version. 

I don't know anything about the interaction between apache core, MPM event, mod_proxy and mod_proxy_fcgi, but it seems that it shouldn't report this error.

Should I open another bug? Do you need more information?
Comment 3 Jeff Trawick 2014-02-05 17:28:27 UTC
>The patched version reports various values like 146, 23, 33, 244, 105 on similar requests, but with different file sizes (always more than 100kb pumped from php-fpm). 

That's what I would expect with the fix.  We knew it was a bogus value but didn't know what.

>So I guess the aborted/cancelled requests make the mod_proxy_fcgi access memory that shouldn't even contain the fcgi version. 

There's no stray memory reference.  If even one byte came back from php-fpm then it must be the version (the version is the first byte in the FastCGI header).  And the code seems to ensure that we get an entire FastCGI header, not just the first byte.  The theory that it is looking in the wrong place for the version seems to depend on apr_socket_recv() being busted.

>I don't know anything about the interaction between apache core, MPM event, mod_proxy and mod_proxy_fcgi, but it seems that it shouldn't report this error.

>Should I open another bug? Do you need more information?

The expected problem is that php-fpm and/or mod_proxy_fcgi are out of sync w.r.t. the FastCGI protocol at a higher level than this particular code.  Either mod_proxy_fcgi is trying to read a FastCGI header when it should not, or php-fpm has written something else when it should write the header.  That should be tracked by a different bug, preferably with a repeatable testcase provided so that someone can reproduce fairly quickly.
Comment 4 Scott Lucas 2014-09-01 14:08:48 UTC
(In reply to Vedran Rodic from comment #2)
> The patched version reports various values like 146, 23, 33, 244, 105 on
> similar requests, but with different file sizes (always more than 100kb
> pumped from php-fpm). 
> 
> So I guess the aborted/cancelled requests make the mod_proxy_fcgi access
> memory that shouldn't even contain the fcgi version. 
> 
> I don't know anything about the interaction between apache core, MPM event,
> mod_proxy and mod_proxy_fcgi, but it seems that it shouldn't report this
> error.
> 
> Should I open another bug? Do you need more information?

Vedran, I am still experiencing similar issues using the steps you are mentioning to reproduce (cancelling the requests) on 2.4.10. The version is no longer 1 of course but it isn't clear how to quash the error. Did you ever get to the bottom of why Apache and PHP-FPM are unable to handle these cancelled requests gracefully?

My main concern is that it represents an issue with more than just the cancelled requests and perhaps normal requests are seeing a 503 too.

I am also getting a lot of "AH00485: scoreboard is full, not at MaxRequestWorkers" when using PHP-FPM also. I am not sure if this issue is related or whether there is a misconfiguration with my event MPM. I haven't actually been able to catch the scoreboard when it was full but if it were I don't see how it would not have reached MaxRequestWorkers and be queuing requests.
Comment 5 Jeff Trawick 2014-09-01 14:13:27 UTC
Please open another bug and point to it from here.  This bug was used to a fix an error in logging.
Comment 6 Luca Toscano 2016-07-08 08:32:40 UTC
For more info about the problem and how to reproduce, https://bz.apache.org/bugzilla/show_bug.cgi?id=49671