Bug 56473

Summary: ETags don't change when headers do
Product: Apache httpd-2 Reporter: Mark Nottingham <mnot>
Component: Core Assignee: Apache HTTPD Bugs Mailing List <bugs>
Status: NEW ---    
Severity: normal CC: julian.reschke
Priority: P2    
Version: 2.5-HEAD   
Target Milestone: ---   
Hardware: All   
OS: All   

Description Mark Nottingham 2014-04-30 10:04:32 UTC
For example:

$> telnet localhost 80
Connected to localhost.
Escape character is '^]'.
GET /foo HTTP/1.1
Host: example.com

HTTP/1.1 200 OK
Date: Wed, 30 Apr 2014 09:32:20 GMT
Server: Apache
Last-Modified: Wed, 30 Apr 2014 09:32:14 GMT
ETag: "0-4f83f39411430"
Accept-Ranges: bytes
Content-Length: 0

$> telnet localhost 80
Connected to localhost.
Escape character is '^]'.
GET /foo HTTP/1.1
Host: example.com

HTTP/1.1 200 OK
Date: Wed, 30 Apr 2014 09:32:53 GMT
Server: Apache
Last-Modified: Wed, 30 Apr 2014 09:32:14 GMT
ETag: "0-4f83f39411430"
Accept-Ranges: bytes
Content-Length: 0
Foo: bar

Here, a new header (Foo) has been inserted with mod_headers (but AFAICT this isn't limited to mod_headers). Note that the ETags are the same.

This causes problems when headers change, but old responses are cached in browsers and/or intermediaries; even when the cached response is stale, Apache will 304 the conditional request, effectively persisting it in-cache forever (until it gets little enough traffic to evict it, which may be never, or until the next change to the body).
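
To make the failure mode concrete, here is a toy simulation of one revalidation round trip. It is only a sketch: `origin_respond` and `revalidate` are hypothetical names, the origin is assumed to return nothing but the validator on a 304, and the cache is assumed to merge the 304's fields into the stored response. Real servers and caches are more involved.

```python
def origin_respond(current_headers, etag, if_none_match=None):
    """Toy origin (assumption): answers 304 whenever the validator matches."""
    if if_none_match == etag:
        # Only the validator comes back; the newly configured Foo header does not.
        return 304, {"ETag": etag}
    return 200, dict(current_headers, ETag=etag)


def revalidate(cached_headers, current_headers, etag):
    """Toy cache (assumption): on 304, keep stored headers and merge in the 304's."""
    status, headers = origin_respond(
        current_headers, etag, cached_headers.get("ETag")
    )
    if status == 304:
        merged = dict(cached_headers)
        merged.update(headers)
        return merged
    return headers


etag = '"0-4f83f39411430"'
cached = {"ETag": etag, "Content-Length": "0"}  # stored before Foo was added
current = {"Content-Length": "0", "Foo": "bar"}  # what the server sends now
refreshed = revalidate(cached, current, etag)
```

Because the ETag did not change, `refreshed` still lacks `Foo`, and it will keep lacking it on every later revalidation.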

For example, a site might introduce or change the Content-Security-Policy response header; however, because of this bug, that change will not be apparent until the end user clears the cache or actually changes the content that the policy is attached to.

As more headers are used to communicate policy like this (as is the trend these days, especially regarding security policy), I think the impact of this bug will become even more problematic.

I realise that fixing this may be difficult; the most obvious thing I can think of would be to take a hash of the response headers (with a blacklist for things like Date, ETag and similar) and use that as input to the ETag content.
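
A rough sketch of that idea, in Python rather than httpd's C, just to show the shape. The function name `header_aware_etag`, the contents of the exclusion list, and the choice of SHA-256 truncated to 8 hex digits are all assumptions for illustration, not anything httpd does today.

```python
import hashlib

# Assumed exclusion list: fields that change on every response or would be circular.
EXCLUDED = frozenset(
    {"date", "etag", "age", "connection", "keep-alive", "transfer-encoding"}
)


def header_aware_etag(base_etag, headers):
    """Return an ETag built from base_etag plus a digest of the response headers."""
    # Lowercase names and sort so header order and case do not affect the result.
    fields = sorted(
        (name.lower(), value.strip())
        for name, value in headers
        if name.lower() not in EXCLUDED
    )
    digest = hashlib.sha256()
    for name, value in fields:
        digest.update(f"{name}:{value}\n".encode("utf-8"))
    base = base_etag.strip('"')
    return f'"{base}-{digest.hexdigest()[:8]}"'


first = header_aware_etag('"0-4f83f39411430"', [
    ("Date", "Wed, 30 Apr 2014 09:32:20 GMT"),
    ("Server", "Apache"),
    ("Last-Modified", "Wed, 30 Apr 2014 09:32:14 GMT"),
])
second = header_aware_etag('"0-4f83f39411430"', [
    ("Date", "Wed, 30 Apr 2014 09:32:53 GMT"),
    ("Server", "Apache"),
    ("Last-Modified", "Wed, 30 Apr 2014 09:32:14 GMT"),
    ("Foo", "bar"),
])
```

With the two responses from the example above, `first` and `second` differ because of `Foo`, while the differing `Date` values alone would not change the result.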

The other way to go about fixing it would be to always send "extra" headers (set by mod_headers or otherwise) in a 304 response; however, that would violate a SHOULD NOT in <http://tools.ietf.org/html/draft-ietf-httpbis-p4-conditional-26#section-4.1>, and I suspect support for updating cached headers in 304s is far from universal in clients.

The same issue comes up with Last-Modified, but I don't see any practical way to mitigate that; as long as the browser is doing If-None-Match correctly, it shouldn't matter anyway.
Comment 1 Jim Jagielski 2014-04-30 13:38:10 UTC
As I understand it, the ETag is designed to represent the actual resource of the URL, and as such should not change w/ a change in the headers per se. Adding or changing a header doesn't change the actual resource, does it?
Comment 2 Julian Reschke 2014-04-30 13:47:15 UTC
It's per "representation", and some header fields are part of the representation. See <http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p2-semantics-26.html#rfc.section.3>