Bug 61820 - 304 headers stripped
Summary: 304 headers stripped
Status: NEW
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: Core (show other bugs)
Version: 2.5-HEAD
Hardware: All All
: P2 normal (vote)
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-11-26 23:19 UTC by Mark Nottingham
Modified: 2020-07-01 13:22 UTC (History)
6 users (show)



Attachments
Strip only unwanted headers (2.11 KB, patch)
2020-03-12 18:40 UTC, Giovanni Bechis
Details | Diff
Strip unwanted headers (1.89 KB, patch)
2020-05-27 16:38 UTC, Giovanni Bechis
Details | Diff
Strip unwanted headers without subprocess_env envvar (1.65 KB, patch)
2020-07-01 13:22 UTC, Giovanni Bechis
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Nottingham 2017-11-26 23:19:24 UTC
In http_filters:ap_http_header_filter, 304 responses get special treatment, in that they're sent with only a fixed set of headers (if present):

1414	    if (r->status == HTTP_NOT_MODIFIED) {
1415	        apr_table_do((int (*)(void *, const char *, const char *)) form_header_field,
1416	                     (void *) &h, r->headers_out,
1417	                     "Connection",
1418	                     "Keep-Alive",
1419	                     "ETag",
1420	                     "Content-Location",
1421	                     "Expires",
1422	                     "Cache-Control",
1423	                     "Vary",
1424	                     "Warning",
1425	                     "WWW-Authenticate",
1426	                     "Proxy-Authenticate",
1427	                     "Set-Cookie",
1428	                     "Set-Cookie2",
1429	                     NULL);
1430	    }

<https://svn.apache.org/viewvc/httpd/httpd/trunk/modules/http/http_filters.c?revision=1777672&view=markup#l1414>

This means that any header value that a generator (whether CGI, an upstream origin via mod_proxy, etc.) updates in a 304 will be lost.

RFC7234 specifies how headers on a 304 are supposed to be handled:
  http://httpwg.org/specs/rfc7234.html#freshening.responses

This has caused interoperability problems in the wild, e.g.,:
  https://github.com/hueniverse/hawk/issues/224
Comment 1 Eric Covener 2017-11-28 03:35:57 UTC
Hi Mark, I think some of the lack of action on this is related to this text:

https://tools.ietf.org/html/rfc7232#section-4.1

The server generating a 304 response MUST generate any of the
   following header fields that would have been sent in a 200 (OK)
   response to the same request: Cache-Control, Content-Location, Date,
   ETag, Expires, and Vary.
...

Since the goal of a 304 response is to minimize information transfer
   when the recipient already has one or more cached representations, a
   sender SHOULD NOT generate representation metadata other than the
   above listed fields unless said metadata exists for the purpose of
   guiding cache updates (e.g., Last-Modified might be useful if the
   response does not have an ETag field).


7230 lists a few "representation metadata" but I'm not sure it's meant to be exhaustive, so it's difficult to whitelist or blacklist it.  Any advice on how we differentiate here?

3.1.  Representation Metadata

   Representation header fields provide metadata about the
   representation.  When a message includes a payload body, the
   representation header fields describe how to interpret the
   representation data enclosed in the payload body.  In a response to a
   HEAD request, the representation header fields describe the
   representation data that would have been enclosed in the payload body
   if the same request had been a GET.

   The following header fields convey representation metadata:

   +-------------------+-----------------+
   | Header Field Name | Defined in...   |
   +-------------------+-----------------+
   | Content-Type      | Section 3.1.1.5 |
   | Content-Encoding  | Section 3.1.2.2 |
   | Content-Language  | Section 3.1.3.2 |
   | Content-Location  | Section 3.1.4.2 |
   +-------------------+-----------------+
Comment 2 Mark Nottingham 2017-11-28 07:02:18 UTC
The key word here is "generate." The problem is that when the 304 is being generated elsewhere -- whether by a CGI handler or by an upstream server -- it has authority to decide what headers to include.

I think this could be addressed by removing the code here and imposing this kind of filtering (if necessary) where Apache actually generates a 304 as part of handling a conditional request itself (e.g., from cache, from disk, but not from CGI, etc.).
Comment 3 Roy T. Fielding 2018-10-26 20:42:42 UTC
I can confirm that this is a bug.

The original code that I wrote (in 1997) for the canned response generator (ap_send_error_response) was moved in 2000 to the generic http filter in a mistaken attempt to solve an unrelated bug in 2.0a streams.

RFC2616 did not distinguish between sending and generating, so this used to be a reasonable attempt to adhere to a SHOULD requirement only on responses that the core generated itself. It has no business being applied to all 304 responses.

I think the best solution right now is to simply remove the entire conditional that attempts to reduce 304 responses. If that results in too much data being sent, we should have a configurable list of fields to exclude on 304 responses rather than a fixed table in the source code.
Comment 4 William A. Rowe Jr. 2018-11-07 18:15:01 UTC
Eliminating the filter may leave us exposed...

for all r->header_only and AP_STATUS_HAS_BODY(r->status) requests, reading
rfc7230 3.3.1 and 3.3.2, it would be best to still drop Transfer-Encoding
if we encounter it, it is actively disallowed in 1xx, 204, and of no apparent
benefit to 304 requests. Content-Length can be preserved.
Comment 5 William A. Rowe Jr. 2018-11-07 19:36:12 UTC
(In reply to William A. Rowe Jr. from comment #4)
> Eliminating the filter may leave us exposed...
> 
> for all r->header_only and AP_STATUS_HAS_BODY(r->status) requests, reading
> rfc7230 3.3.1 and 3.3.2, it would be best to still drop Transfer-Encoding
> if we encounter it, it is actively disallowed in 1xx, 204, and of no apparent
> benefit to 304 requests. Content-Length can be preserved.

Specifically, I think we want to apply this test to drop Transfer-Encoding when it is encountered, either here where Roy identified the foolish filtering, or at some later point prior to transmission;

  if (r->header_only || AP_STATUS_IS_HEADER_ONLY(r->status))
     drop T-E;
Comment 6 Mark Nottingham 2018-11-08 01:09:39 UTC
The HTTP WG is discussing the exact list of headers to strip here:
  https://github.com/httpwg/http-core/issues/165

Please join in.
Comment 7 andreas.luik 2020-03-03 15:31:32 UTC
Any progress on this?  We have the problem that Apache httpd (version 2.4.x) used as a reverse proxy removes CORS headers (with a 304 repsonse), resulting in browser access errors.  This makes Apache https useless as a reverse proxy with modern web applications.  Am I missing something?  (We are switching to nginx, which does pass the headers as expected ...)
Comment 8 Giovanni Bechis 2020-03-12 18:40:47 UTC
Created attachment 37097 [details]
Strip only unwanted headers

Following the list of headers Gecko doesn't update on a 304,
the following patch strips only unwanted headers.
Comment 9 Eric Covener 2020-03-12 18:46:19 UTC
Comment on attachment 37097 [details]
Strip only unwanted headers

wdyt about an subprocess_env envvar that allows the old behavior? Even if default behavior changes this allows a downstream user to back out just this bit if it causes some unexpected problem.
Comment 10 Mark Nottingham 2020-03-12 22:50:50 UTC
> Following the list of headers Gecko doesn't update on a 304

Please don't do this; it follows and reinforces demonstrably bad practice based upon misinterpretation of the specification.

Please participate in the standards discussion and implement the resolution of that.
Comment 11 Ruediger Pluem 2020-03-13 07:37:42 UTC
(In reply to Mark Nottingham from comment #10)
> > Following the list of headers Gecko doesn't update on a 304
> 
> Please don't do this; it follows and reinforces demonstrably bad practice
> based upon misinterpretation of the specification.
> 
> Please participate in the standards discussion and implement the resolution
> of that.

So your advice is that we shouldn't change any code until the standards discussion, where we can / should take part, comes to a conclusion what is the correct thing to do?
Comment 12 Mark Nottingham 2020-03-16 23:00:31 UTC
Yes. Hopefully, that will be soon.
Comment 13 Sebastian Kippe 2020-05-26 07:36:53 UTC
Looks like the discussion has been resolved:

https://github.com/httpwg/http-core/issues/165
Comment 14 Ruediger Pluem 2020-05-26 09:13:12 UTC
So removing

TE
Trailer
Transfer-Encoding
Upgrade
Content-Encoding
Content-Length
Content-MD5
Content-Range

should be enough?
mod_proxy_http already removes all headers that are in the connection header of the backend response and we set our own connection and keep-alive header.
Comment 15 Giovanni Bechis 2020-05-26 09:27:29 UTC
Shouldn't ETag be removed as well ?
Comment 16 Ruediger Pluem 2020-05-26 09:41:58 UTC
(In reply to Giovanni Bechis from comment #15)
> Shouldn't ETag be removed as well ?

ETag is not listed here:

https://github.com/httpwg/http-core/pull/337/files
Comment 17 Giovanni Bechis 2020-05-26 11:56:15 UTC
Maybe I am missing something but this is part of the code that has been committed:

Caches MUST-NOT update the following header fields:
Content-Encoding,
Content-Length,
Content-MD5 ("RFC2616#14.15),
Content-Range,
ETag.
Comment 18 Ruediger Pluem 2020-05-26 13:14:47 UTC
(In reply to Giovanni Bechis from comment #17)
> Maybe I am missing something but this is part of the code that has been
> committed:
> 
> Caches MUST-NOT update the following header fields:
> Content-Encoding,
> Content-Length,
> Content-MD5 ("RFC2616#14.15),
> Content-Range,
> ETag.

My bad :-(. I missed ETag. You are correct.
So remove

TE
Trailer
Transfer-Encoding
Upgrade
Content-Encoding
Content-Length
Content-MD5
Content-Range
ETag
Comment 19 Giovanni Bechis 2020-05-27 16:38:58 UTC
Created attachment 37269 [details]
Strip unwanted headers

Add a subprocess_env to restore old behaviour, uncertain about correct naming and about where this should be documented.
Comment 20 Ruediger Pluem 2020-05-28 07:06:52 UTC
I am not sure if we need to be able to provide the old behavior and if yes I think the name of the environment variable should be "better" (whatever that means, yes sweet naming discussions :-)).
Comment 21 Giovanni Bechis 2020-05-28 08:42:27 UTC
Honestly I do not think it makes sense to provide backward compatibility in this case because it would deviate from the standard.
I am all for committing just the portion of the code that strips the correct headers.
Comment 22 Giovanni Bechis 2020-07-01 13:22:35 UTC
Created attachment 37345 [details]
Strip unwanted headers without subprocess_env envvar

Strip unwanted headers without using a subprocess_env var to avoid this behaviour