Bug 51350 - mod_deflate compresses zero length content into an invalid 20 byte body
Status: RESOLVED FIXED
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_deflate
Version: 2.4.23
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords: FixedInTrunk
Depends on:
Blocks:
 
Reported: 2011-06-09 20:42 UTC by Forest
Modified: 2017-02-04 12:37 UTC

Description Forest 2011-06-09 20:42:22 UTC
When my web application returns a response with an empty body and Content-Type: text/plain or text/html, mod_deflate replaces the body with 20 bytes that can't be decompressed.  I'm guessing this is a gzip header created when trying to deflate 0 bytes of data.
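
The 20-byte figure is consistent with what a gzip encoder emits for empty input: a 10-byte header, a 2-byte empty deflate block, and an 8-byte trailer (CRC-32 plus input length). A minimal Python sketch follows, using the standard zlib module as a stand-in for mod_deflate. That substitution is an assumption: the sketch does not reproduce httpd's exact bytes, and it does not explain the decompression failure Wireshark reported.

```python
import zlib

def gzip_bytes(payload: bytes) -> bytes:
    """Compress payload into a gzip container (wbits=31 selects gzip framing)."""
    compressor = zlib.compressobj(wbits=31)
    return compressor.compress(payload) + compressor.flush()

empty = gzip_bytes(b"")
print(len(empty))        # size of the gzip stream produced for zero bytes of input
print(empty[:2].hex())   # gzip magic number
```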

Wireshark reports "Error: Decompression Failed" on those 20 bytes, and I believe some browsers choke on them as well.

Unfortunately, since various web frameworks default to a text Content-Type for empty responses, this means the bad body is pretty common on things like OPTIONS responses.

I have verified that the Content-Type header field triggers the problem by writing special-case code to intercept it on 0-byte responses.  Removing that header field or using an alternative type (e.g. 'application/json') makes the invalid 20-byte body disappear.

I'm using a stock Ubuntu build of Apache.
Comment 1 Stefan Fritsch 2011-07-13 20:39:24 UTC
For me (on trunk and 2.2.19), httpd sends "Content-Length: 20" but no body at all. Is that what you see or do you see those 20 bytes of body data?

In any case, fixed in trunk in r1146418.
Comment 2 Stefan Fritsch 2011-07-13 21:11:37 UTC
(In reply to comment #1)
> For me (on trunk and 2.2.19), httpd sends "Content-Length: 20" but no body at
> all. Is that what you see or do you see those 20 bytes of body data?

Never mind. I just didn't look correctly.

But skipping compression for zero length files is a valid optimization, anyway.
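
The optimization mentioned here can be pictured as a guard in front of the compressor. This is an illustrative Python model, not the code committed in r1146418; `encode_body` and its return shape are hypothetical names chosen for the sketch.

```python
import zlib
from typing import Dict, Tuple

def encode_body(body: bytes, accepts_gzip: bool) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body), compressing only when there is something to compress."""
    if not accepts_gzip or len(body) == 0:
        # An empty body would only grow if it were wrapped in a gzip container.
        return {"Content-Length": str(len(body))}, body
    compressor = zlib.compressobj(wbits=31)  # wbits=31 selects gzip framing
    encoded = compressor.compress(body) + compressor.flush()
    return {"Content-Encoding": "gzip", "Content-Length": str(len(encoded))}, encoded
```

With this guard, an empty body passes through with Content-Length: 0 and no Content-Encoding header, even when the client accepts gzip.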
Comment 3 Stefan Fritsch 2012-02-26 17:11:27 UTC
fixed in 2.4.1
Comment 4 Dmitry Gres 2016-06-10 09:10:39 UTC
Still occurs.

When Apache returns a gzip-compressed response with a 204 status code and an empty body, the server sends the invalid header Content-Length: 20 instead of Content-Length: 0.

Without gzip compression (no Accept-Encoding header in the request), the server returns the valid header Content-Length: 0.

Request and response with compression:

0 % curl -v http://mta.dev/api/wtf/\?id\=09102 --compressed
* Hostname was NOT found in DNS cache
*   Trying 172.17.0.2...
* Connected to mta.dev (172.17.0.2) port 80 (#0)
> GET /api/wtf/?id=09102 HTTP/1.1
> User-Agent: curl/7.38.0
> Host: mta.dev
> Accept: */*
> Accept-Encoding: deflate, gzip
> 
< HTTP/1.1 204 No Content
< Date: Thu, 09 Jun 2016 15:44:53 GMT
* Server Apache/2.4.7 (Ubuntu) is not blacklisted
< Server: Apache/2.4.7 (Ubuntu)
< X-Powered-By: PHP/5.5.9-1ubuntu4.17
< P3P: policyref="/bitrix/p3p.xml", CP="NON DSP COR CUR ADM DEV PSA PSD OUR UNR BUS UNI COM NAV INT DEM STA"
< X-Powered-CMS: Bitrix Site Manager (d04cd2b3dbab106e7537af3767043172)
< Set-Cookie: PHPSESSID=8arlnd14t1k97bri56clb2qhh1; path=/; HttpOnly
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< Set-Cookie: BITRIX_SM_GUEST_ID=2328047; expires=Sun, 04-Jun-2017 15:44:53 GMT; Max-Age=31104000; path=/
< Set-Cookie: BITRIX_SM_LAST_VISIT=09.06.2016+18%3A44%3A53; expires=Sun, 04-Jun-2017 15:44:53 GMT; Max-Age=31104000; path=/
< Content-Encoding: gzip
< Content-Length: 20
< Content-Type: application/json
< 
* Excess found in a non pipelined read: excess = 20 url = /api/wtf/?id=09102 (zero-length body)
* Connection #0 to host mta.dev left intact


Request and response without compression:

0 % curl -v http://mta.dev/api/wtf/\?id\=09102
* Hostname was NOT found in DNS cache
*   Trying 172.17.0.2...
* Connected to mta.dev (172.17.0.2) port 80 (#0)
> GET /api/wtf/?id=09102 HTTP/1.1
> User-Agent: curl/7.38.0
> Host: mta.dev
> Accept: */*
> 
< HTTP/1.1 204 No Content
< Date: Thu, 09 Jun 2016 15:38:43 GMT
* Server Apache/2.4.7 (Ubuntu) is not blacklisted
< Server: Apache/2.4.7 (Ubuntu)
< X-Powered-By: PHP/5.5.9-1ubuntu4.17
< P3P: policyref="/bitrix/p3p.xml", CP="NON DSP COR CUR ADM DEV PSA PSD OUR UNR BUS UNI COM NAV INT DEM STA"
< X-Powered-CMS: Bitrix Site Manager (d04cd2b3dbab106e7537af3767043172)
< Set-Cookie: PHPSESSID=ceqsuv4ie3fkq497uvk6e2gki1; path=/; HttpOnly
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< Set-Cookie: BITRIX_SM_GUEST_ID=2328047; expires=Sun, 04-Jun-2017 15:38:43 GMT; Max-Age=31104000; path=/
< Set-Cookie: BITRIX_SM_LAST_VISIT=09.06.2016+18%3A38%3A43; expires=Sun, 04-Jun-2017 15:38:43 GMT; Max-Age=31104000; path=/
< Content-Length: 0
< Content-Type: application/json
< 
* Connection #0 to host mta.dev left intact

Apache version - 2.4.7
Comment 5 nickdnk 2016-11-21 14:13:35 UTC
I am observing this on 2.4.23 still

Also, I am wondering why we're returning a Content-Length header at all, when the HTTP spec says:

"A server MUST NOT send a Content-Length header field in any response
   with a status code of 1xx (Informational) or 204 (No Content)."
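
The quoted rule can be expressed as a small header filter. This is a hedged Python sketch of the rule itself, not of anything httpd does; `strip_forbidden_content_length` is a hypothetical helper name.

```python
from typing import Dict

def strip_forbidden_content_length(status: int, headers: Dict[str, str]) -> Dict[str, str]:
    """Drop Content-Length from 1xx and 204 responses; leave other responses untouched."""
    if 100 <= status < 200 or status == 204:
        # Header names are case-insensitive, so compare in lowercase.
        return {k: v for k, v in headers.items() if k.lower() != "content-length"}
    return dict(headers)
```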
Comment 6 Eric Covener 2016-11-21 14:32:56 UTC
(In reply to Dmitry Gres from comment #4)
> Still occurs.
> 

I think the reason the small response optimization doesn't work for some dynamic responses is that httpd only checks the length when the end of the stream is visible as soon as the response is first committed/flushed.

(This does not apply to bailing on a 204 which is always known before the headers are flushed)
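
A heavily simplified Python model of the behavior described above, to show why a flush before the end of the stream defeats the length check. It is not mod_deflate's code: `EOS`, `FLUSH`, and `deflate_filter` are hypothetical names, a "brigade" is modeled as a plain list, and only the empty-body case of the small-response check is modeled.

```python
import zlib
from typing import List

EOS = object()    # hypothetical end-of-stream marker
FLUSH = object()  # hypothetical flush marker

def deflate_filter(brigades: List[list]) -> bytes:
    """Decide whether to compress by looking only at the first brigade."""
    first = brigades[0]
    if first and first[-1] is EOS:
        # The whole response is visible, so its length is known up front.
        total = sum(len(b) for b in first if isinstance(b, bytes))
        if total == 0:
            return b""  # skip compression for an empty body
    # End of stream not visible yet: commit to gzip without knowing the length.
    compressor = zlib.compressobj(wbits=31)  # wbits=31 selects gzip framing
    out = b""
    for brigade in brigades:
        for bucket in brigade:
            if isinstance(bucket, bytes):
                out += compressor.compress(bucket)
    return out + compressor.flush()

print(len(deflate_filter([[EOS]])))           # end of stream seen immediately
print(len(deflate_filter([[FLUSH], [EOS]])))  # flush arrives before end of stream
```

In the second call the model has already committed to compression by the time the end of the stream arrives, so the empty body still produces a gzip container.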

I assume a simple CGI that flushes would reproduce the issue seen with PHP.
Comment 7 Luca Toscano 2016-11-25 22:59:57 UTC
Does anybody have a simple repro that people can check to test this bug?
Comment 8 Luca Toscano 2016-11-28 19:45:35 UTC
For the moment the only pseudo weird thing that I was able to reproduce is:

test.php
---------------------------
<?php
header("HTTP/1.1 204 No Content");
flush();
ob_flush();
?>
--------------------------

curl localhost/test.php -i --compressed
HTTP/1.1 204 No Content
Date: Mon, 28 Nov 2016 19:43:10 GMT
Server: Apache/2.5.0-dev (Unix)
Content-Length: 0    <<============
Content-Type: text/html; charset=UTF-8

But this one does not trigger any compression; as far as I can see, mod_deflate (the output filter) is not used. Still working on finding a repro for the 204 use case.
Comment 9 Luca Toscano 2016-11-28 22:12:28 UTC
The reason for the C-L: 0 header is the following snippet of code in protocol.c:

        if (!(r->header_only
              && !r->bytes_sent
              && (r->sent_bodyct
                  || conf->http_cl_head_zero != AP_HTTP_CL_HEAD_ZERO_ENABLE
                  || apr_table_get(r->headers_out, "Content-Length")))) {
            ap_log_rerror(APLOG_MARK, APLOG_TRACE1, 0, r, "Setting CLEN: %" APR_OFF_T_FMT, r->bytes_sent);
            ap_set_content_length(r, r->bytes_sent);
        }
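
To make the condition easier to follow, here is a Python mirror of just the boolean test in the snippet above. The function and parameter names are hypothetical, with `cl_head_zero_enabled` standing in for `conf->http_cl_head_zero == AP_HTTP_CL_HEAD_ZERO_ENABLE`. As far as the quoted condition shows, the status code plays no part, so a GET answered with 204 and zero bytes sent still gets a Content-Length set.

```python
def should_set_content_length(header_only: bool, bytes_sent: int, sent_bodyct: int,
                              cl_head_zero_enabled: bool, has_content_length: bool) -> bool:
    """Return True when the quoted snippet would call ap_set_content_length()."""
    skip = (header_only
            and not bytes_sent
            and (sent_bodyct
                 or not cl_head_zero_enabled
                 or has_content_length))
    return not skip

# GET request, nothing sent: header_only is false, so the length is set (to 0).
print(should_set_content_length(False, 0, 0, False, False))
```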
Comment 10 Luca Toscano 2016-12-05 11:56:51 UTC
(In reply to nickdnk from comment #5)
> I am observing this on 2.4.23 still

I got in touch with nickdnk@ and he'll follow up in this bugzilla ticket as soon as possible, but it seems that he is not able to reproduce the issue anymore.

> Also, I am wondering why we're returning a Content-Length header at all,
> when the HTTP spec says:
> 
> "A server MUST NOT send a Content-Length header field in any response
>    with a status code of 1xx (Informational) or 204 (No Content)."

This seems to be a bug (the only remaining one to discuss). I tried to bypass mod_proxy_fcgi with this simple Perl CGI:

#!/usr/bin/perl
print "Status: 204\n";
print "Content-length: 1000\n";
print "Content-type: text/html\n\n";
print "Hello, World.";

Response with curl --compressed and without it:

HTTP/1.1 204 No Content
Date: Mon, 05 Dec 2016 11:51:33 GMT
Server: Apache/2.5.0-dev (Unix)
Content-length: 1000
Content-Type: text/html

In this case I can get the C-L greater than zero emitted. The body is dropped by httpd as expected.
Comment 11 Luca Toscano 2016-12-06 17:37:47 UTC
(In reply to Luca Toscano from comment #10)

> In this case I can get the C-L greater than zero emitted. The body is
> dropped by httpd as expected.

This seems to be true only if I use curl, but telnet reveals another story. I am following up on the dev@ mailing list to find a permanent solution.
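
One way to check what is actually on the wire, independent of how a client interprets it, is to split a raw capture at the end of the header block and look at what follows. This is a minimal Python sketch; `split_raw_response` is a hypothetical helper, and the sample bytes are invented for illustration. They are not the telnet output referred to above, which the thread does not show.

```python
def split_raw_response(raw: bytes):
    """Split a raw HTTP/1.1 response into (status code, header block, trailing bytes)."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0]
    status = int(status_line.split(b" ")[1])
    return status, head, rest

# Hypothetical capture, invented for illustration only.
raw = (b"HTTP/1.1 204 No Content\r\n"
       b"Content-Length: 1000\r\n"
       b"Content-Type: text/html\r\n"
       b"\r\n"
       b"Hello, World.")
status, head, rest = split_raw_response(raw)
print(status, len(rest))  # a non-zero length means bytes followed the headers
```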
Comment 12 Luca Toscano 2016-12-09 10:18:56 UTC
committed http://svn.apache.org/r1773346 to trunk
Comment 13 Luca Toscano 2016-12-10 10:54:30 UTC
I tried very hard to reproduce the Content-Length: 20 + Content-Encoding: gzip issue (mod_deflate compressing an empty body) for 204 responses (and others too) without any luck. I keep hitting, correctly, the following mod_deflate statuses (LogLevel trace8 enabled):

- "Not compressing very small response of 0 bytes"
- "Not compressing (no content)"

If anybody has a way to easily repro the Content-Length: 20 please let us know :)
Comment 14 Luca Toscano 2016-12-13 20:13:55 UTC
The 204 message-body and C-L header drop change was merged into 2.4.x; it should be part of the next release (2.4.24 at the moment).

This change does not address the mod_deflate issue, so please report any useful data that could help find the bug if you encounter it (CGI repro scripts, httpd config, error logs, etc.).

Thanks!
Comment 15 Luca Toscano 2017-02-04 12:37:32 UTC
Closing this bug as fixed (in 2.4.25+); please reopen if you are still seeing the problem.