Bug 61820 - 304 headers stripped
Summary: 304 headers stripped
Status: NEW
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: Core
Version: 2.5-HEAD
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-11-26 23:19 UTC by Mark Nottingham
Modified: 2018-11-08 01:09 UTC (History)



Description Mark Nottingham 2017-11-26 23:19:24 UTC
In http_filters.c:ap_http_header_filter, 304 responses get special treatment: they are sent with only a fixed set of headers (if present):

    if (r->status == HTTP_NOT_MODIFIED) {
        apr_table_do((int (*)(void *, const char *, const char *)) form_header_field,
                     (void *) &h, r->headers_out,
                     "Connection",
                     "Keep-Alive",
                     "ETag",
                     "Content-Location",
                     "Expires",
                     "Cache-Control",
                     "Vary",
                     "Warning",
                     "WWW-Authenticate",
                     "Proxy-Authenticate",
                     "Set-Cookie",
                     "Set-Cookie2",
                     NULL);
    }

<https://svn.apache.org/viewvc/httpd/httpd/trunk/modules/http/http_filters.c?revision=1777672&view=markup#l1414>

This means that any header value that a generator (whether CGI, an upstream origin via mod_proxy, etc.) updates in a 304 will be lost.
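
A minimal sketch of that effect, assuming a simplified header array in place of r->headers_out. The struct and the function names here are hypothetical and are not httpd code; only the list of field names is taken from the excerpt above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of r->headers_out. */
struct header {
    const char *name;
    const char *value;
};

/* The fixed field names from the ap_http_header_filter excerpt. */
static const char *const allowed_304[] = {
    "Connection", "Keep-Alive", "ETag", "Content-Location", "Expires",
    "Cache-Control", "Vary", "Warning", "WWW-Authenticate",
    "Proxy-Authenticate", "Set-Cookie", "Set-Cookie2", NULL
};

/* Case-insensitive comparison of two header field names. */
static int names_equal(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* Returns 1 if the field is on the fixed list, 0 if it would be dropped. */
static int survives_304_filter(const char *name)
{
    for (size_t i = 0; allowed_304[i] != NULL; i++) {
        if (names_equal(name, allowed_304[i]))
            return 1;
    }
    return 0;
}

/* Copies the surviving headers from in[] to out[]; returns how many were kept. */
static size_t filter_304(const struct header *in, size_t n, struct header *out)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (survives_304_filter(in[i].name))
            out[kept++] = in[i];
    }
    return kept;
}
```

With this list, a Last-Modified value that a CGI script or upstream origin sets on its 304 is dropped, while ETag and Cache-Control pass through.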

RFC 7234 specifies how headers on a 304 are supposed to be handled:
  http://httpwg.org/specs/rfc7234.html#freshening.responses

This has caused interoperability problems in the wild, e.g.:
  https://github.com/hueniverse/hawk/issues/224
Comment 1 Eric Covener 2017-11-28 03:35:57 UTC
Hi Mark, I think some of the lack of action on this is related to this text:

https://tools.ietf.org/html/rfc7232#section-4.1

The server generating a 304 response MUST generate any of the
   following header fields that would have been sent in a 200 (OK)
   response to the same request: Cache-Control, Content-Location, Date,
   ETag, Expires, and Vary.
...

Since the goal of a 304 response is to minimize information transfer
   when the recipient already has one or more cached representations, a
   sender SHOULD NOT generate representation metadata other than the
   above listed fields unless said metadata exists for the purpose of
   guiding cache updates (e.g., Last-Modified might be useful if the
   response does not have an ETag field).


RFC 7231 lists a few "representation metadata" fields, but I'm not sure the list is meant to be exhaustive, so it's difficult to whitelist or blacklist them.  Any advice on how we differentiate here?

3.1.  Representation Metadata

   Representation header fields provide metadata about the
   representation.  When a message includes a payload body, the
   representation header fields describe how to interpret the
   representation data enclosed in the payload body.  In a response to a
   HEAD request, the representation header fields describe the
   representation data that would have been enclosed in the payload body
   if the same request had been a GET.

   The following header fields convey representation metadata:

   +-------------------+-----------------+
   | Header Field Name | Defined in...   |
   +-------------------+-----------------+
   | Content-Type      | Section 3.1.1.5 |
   | Content-Encoding  | Section 3.1.2.2 |
   | Content-Language  | Section 3.1.3.2 |
   | Content-Location  | Section 3.1.4.2 |
   +-------------------+-----------------+
Comment 2 Mark Nottingham 2017-11-28 07:02:18 UTC
The key word here is "generate." The problem is that when the 304 is being generated elsewhere -- whether by a CGI handler or by an upstream server -- it has authority to decide what headers to include.

I think this could be addressed by removing the code here and imposing this kind of filtering (if necessary) where Apache actually generates a 304 as part of handling a conditional request itself (e.g., from cache, from disk, but not from CGI, etc.).
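
A minimal sketch of that distinction, assuming the server could tag each response with its origin. The enum and the function are hypothetical; httpd has no such tag, and the sketch only shows the decision rule.

```c
#include <assert.h>

/* Hypothetical tag for who produced the response; httpd has no such enum. */
enum response_origin {
    ORIGIN_CORE,   /* httpd answered the conditional request itself */
    ORIGIN_CGI,    /* a CGI script produced the status and headers */
    ORIGIN_PROXY   /* an upstream server produced them via mod_proxy */
};

/* Returns 1 when header trimming should apply to this response. */
static int should_trim_304_headers(int status, enum response_origin origin)
{
    assert(status >= 100 && status <= 599);
    /* Trim only a 304 that the server generated itself. */
    return status == 304 && origin == ORIGIN_CORE;
}
```

Under this rule a 304 relayed from CGI or an upstream origin keeps every header its generator chose to send.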
Comment 3 Roy T. Fielding 2018-10-26 20:42:42 UTC
I can confirm that this is a bug.

The original code that I wrote (in 1997) for the canned response generator (ap_send_error_response) was moved in 2000 to the generic http filter in a mistaken attempt to solve an unrelated bug in 2.0a streams.

RFC 2616 did not distinguish between sending and generating, so this used to be a reasonable attempt to adhere to a SHOULD requirement only on responses that the core generated itself. It has no business being applied to all 304 responses.

I think the best solution right now is to simply remove the entire conditional that attempts to reduce 304 responses. If that results in too much data being sent, we should have a configurable list of fields to exclude on 304 responses rather than a fixed table in the source code.
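
A minimal sketch of what a configurable exclusion list could look like, assuming a hypothetical per-server setting that holds the field names to omit. The struct and the functions are illustrative only; no such directive or API exists in httpd today.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical setting: NULL-terminated list of field names to omit on 304. */
struct exclude_304_conf {
    const char *const *fields;   /* NULL means "exclude nothing" */
};

/* Case-insensitive comparison of two header field names. */
static int field_name_eq(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* Returns 1 if the field should be sent on a 304, 0 if it is excluded. */
static int send_field_on_304(const struct exclude_304_conf *conf,
                             const char *name)
{
    assert(conf != NULL && name != NULL);
    if (conf->fields == NULL)
        return 1;                /* default: send every header */
    for (size_t i = 0; conf->fields[i] != NULL; i++) {
        if (field_name_eq(name, conf->fields[i]))
            return 0;
    }
    return 1;
}
```

The default of an empty list matches the first step proposed here (remove the conditional entirely); an administrator would add names only if 304 responses turn out to carry too much data.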
Comment 4 William A. Rowe Jr. 2018-11-07 18:15:01 UTC
Eliminating the filter may leave us exposed...

For all r->header_only and AP_STATUS_HAS_BODY(r->status) requests, reading
RFC 7230 3.3.1 and 3.3.2, it would be best to still drop Transfer-Encoding
if we encounter it. It is actively disallowed in 1xx and 204 responses, and
of no apparent benefit to 304 responses. Content-Length can be preserved.
Comment 5 William A. Rowe Jr. 2018-11-07 19:36:12 UTC
(In reply to William A. Rowe Jr. from comment #4)
> Eliminating the filter may leave us exposed...
> 
> For all r->header_only and AP_STATUS_HAS_BODY(r->status) requests, reading
> RFC 7230 3.3.1 and 3.3.2, it would be best to still drop Transfer-Encoding
> if we encounter it. It is actively disallowed in 1xx and 204 responses, and
> of no apparent benefit to 304 responses. Content-Length can be preserved.

Specifically, I think we want to apply this test to drop Transfer-Encoding when it is encountered, either here where Roy identified the foolish filtering, or at some later point prior to transmission:

  if (r->header_only || AP_STATUS_IS_HEADER_ONLY(r->status))
     apr_table_unset(r->headers_out, "Transfer-Encoding"); /* drop T-E */
Comment 6 Mark Nottingham 2018-11-08 01:09:39 UTC
The HTTP WG is discussing the exact list of headers to strip here:
  https://github.com/httpwg/http-core/issues/165

Please join in.