|Summary:||ETag must differ between compressed and uncompressed resource versions|
|Product:||Tomcat 6||Reporter:||Oliver Schoett <oliver.schoett>|
|Component:||Catalina||Assignee:||Tomcat Developers Mailing List <dev>|
Patch to correct ETags and Vary headers for compression
Disable sending and interpreting ETags (needs to be made into an option)
Description Oliver Schoett 2009-01-15 05:19:32 UTC
The Apache folks are about to fix the problem that ETags are the same for compressed and uncompressed versions of a resource: https://issues.apache.org/bugzilla/show_bug.cgi?id=39727

Tomcat 6.0.18 suffers from the same problem. The effect is that if a caching proxy holds a gzipped version of a resource and is asked by a client for an unzipped version, it requests one from the server with the ETag of the cached version. The server sees that the ETag of the version it would send out is the same as that of the version the cache already holds and tells the cache that its version is OK (response status code 304). In the case of a Squid cache, this results in a gzipped version being sent to the client, and this breaks in IE6 and IE7 when they are configured to use the HTTP 1.0 protocol.

Squid has been provided with a work-around option for this problem: http://www.squid-cache.org/Versions/v2/2.6/cfgman/broken_vary_encoding.html but we should not rely on caches world-wide to provide a work-around for a Tomcat bug.
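To make the failure mode concrete, here is a minimal sketch of the revalidation step described above. It is not Tomcat code: the class `SharedETagDemo`, its method names, and the length-plus-timestamp ETag format are assumptions chosen for illustration only.

```java
/** Hypothetical illustration only; class and method names are not from Tomcat. */
final class SharedETagDemo {

    /**
     * Builds a validator from length and modification time only (an assumed
     * format), so it is identical whether the body is later gzipped or not.
     */
    static String resourceETag(long contentLength, long lastModifiedMillis) {
        return "W/\"" + contentLength + "-" + lastModifiedMillis + "\"";
    }

    /**
     * Status the origin server returns for a conditional request:
     * 304 if the validator sent by the cache matches the current one, else 200.
     */
    static int revalidate(String ifNoneMatch, String currentETag) {
        return currentETag.equals(ifNoneMatch) ? 304 : 200;
    }
}
```

In this sketch the cache holds the gzipped body under the shared ETag, asks for the uncompressed variant with that ETag in `If-None-Match`, and `revalidate` answers 304, so the cache goes on serving the gzipped body.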
Comment 1 Oliver Schoett 2009-01-29 03:42:56 UTC
Created attachment 23190
Patch to correct ETags and Vary headers for compression

Here is a patch that corrects the ETag and Vary behaviour:
- ETags differ for gzipped and ungzipped output
- Vary: Accept-Encoding is sent whenever a gzipped version is available

The latter change makes it possible for users of differently capable browsers to receive gzipped and ungzipped responses through the same proxy cache. Previously, an ungzipped cached version would also be delivered to compression-capable browsers, because the cache could not know that a gzipped version was available.

This patch will be put in production on an e-commerce website shortly.
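The two behaviours listed in this comment can be sketched roughly as follows. This is not the attached patch: the class `VariantHeaders`, the `-gzip` suffix, and the method signatures are assumptions made purely to illustrate the idea.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not taken from attachment 23190. */
final class VariantHeaders {

    /** Inserts "-<encoding>" before the closing quote so each variant has its own validator. */
    static String variantETag(String baseETag, String contentEncoding) {
        if (contentEncoding == null || contentEncoding.isEmpty() || !baseETag.endsWith("\"")) {
            return baseETag;
        }
        return baseETag.substring(0, baseETag.length() - 1) + "-" + contentEncoding + "\"";
    }

    /** Builds the headers relevant to caching for one response. */
    static Map<String, String> build(String baseETag, boolean gzipAvailable, boolean clientAcceptsGzip) {
        Map<String, String> headers = new LinkedHashMap<>();
        boolean sendGzip = gzipAvailable && clientAcceptsGzip;
        headers.put("ETag", sendGzip ? variantETag(baseETag, "gzip") : baseETag);
        if (gzipAvailable) {
            // Sent even on uncompressed responses, so caches know another variant exists.
            headers.put("Vary", "Accept-Encoding");
        }
        if (sendGzip) {
            headers.put("Content-Encoding", "gzip");
        }
        return headers;
    }
}
```

Note that for a client without gzip support this sketch produces `Vary: Accept-Encoding` with no `Content-Encoding` header.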
Comment 2 Oliver Schoett 2009-01-29 03:56:52 UTC
Comment 3 Oliver Schoett 2009-01-29 09:05:04 UTC
Warning: the patch I submitted does not work well in connection with the Akamai CDN. First, the Akamai edge servers transparently decompress content without changing the ETag (so that compressed and uncompressed versions are sent with the same ETag). Second, the Akamai servers treat responses with a Vary: Accept-Encoding header but without a Content-Encoding header as uncacheable (ESConfigGuide-Customer, p. 54, Note: TTL and the Vary Header). My patch triggers this for responses that the server would be willing to compress but sends uncompressed because the client lacks the capability.
Comment 4 Mark Thomas 2009-04-16 12:34:10 UTC
Thanks for the patch. I have applied a modified version of it to trunk that also extends it to the NIO and APR connectors. The extended patch has been proposed for 6.0.x.
Comment 5 Remy Maucherat 2009-04-16 13:57:58 UTC
I disagree with this. Regardless of what happens with the transport, the entity does not change once it is decoded. -1 for this "fix".
Comment 6 Mark Thomas 2009-04-16 14:16:56 UTC
Then I suggest you read section 14.19 of RFC 2616, which makes it quite clear that ETags are per variant, not per resource.
Comment 7 Remy Maucherat 2009-04-16 14:35:15 UTC
Well, that does not sound very smart (and I had read that on the httpd bug, sigh ...). But overall, I do think the patch is bad (see status file).
Comment 8 Mark Thomas 2009-04-17 01:44:38 UTC
I've reverted the fix from trunk and withdrawn the backport proposal because, whilst it fixed this issue, it introduced others.
Comment 9 Oliver Schoett 2009-04-17 07:59:02 UTC
Comment 10 Remy Maucherat 2009-04-17 08:17:37 UTC
Yes, that's my point: the only solution I see for this in Tomcat 6 is an option to remove the ETag if compression is active for the request. And about your spec quoting: it is great to adhere to specs and stuff, but it appears that clients only really support content-encoding, which is not supposed to be used for on-the-fly compression. It is used that way anyway, since I am not at all sure about support for transfer-encoding, which is the proper way to do that. Originally, I had planned all sorts of filters which would be added according to the T-E header, but in the end, the only thing which was workable then was a hardcoded gzip output filter which used the content-encoding header. You have to do things which work ...
Comment 11 Mark Thomas 2009-04-29 03:00:51 UTC
The current state of T-E support in the browsers is:
- Opera advertises T-E support, works with T-E
- Mozilla doesn't advertise T-E support, works with T-E
- IE doesn't advertise T-E support, doesn't work with T-E

My reading of the C-E discussion above is that any solution is a hack that will have an issue somewhere. T-E is the right solution. Moving away from the current status quo is as likely or more likely to cause issues compared to the current behaviour which, while wrong, is at least understood. We could provide a handful of options to allow users to configure the various hacks, but this would add a lot of code (and possibly complexity) to the critical path.

I would like to use T-E by default and fall back to C-E if T-E is not supported. However, the patchy browser support means that another set of options would be required to give folks a reasonable chance of configuring a 'good' behaviour for most clients.

My inclination is to mark this issue as WONTFIX, with the longer term plan being to implement T-E and switch to it once the browser support is reasonable.
Comment 12 Remy Maucherat 2009-04-29 05:39:05 UTC
I think I used mostly IE when I tried it back then. Did you test with IE 7 and 8? I agree: with this kind of browser support, it is still not doable to use T-E :(
Comment 13 Mark Thomas 2009-04-29 06:01:47 UTC
IE7 and IE8 - no joy with T-E
Comment 14 Remy Maucherat 2009-04-29 10:16:59 UTC
Maybe something could be done when the client advertises the T-E, and drop to C-E if it does not ?
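A rough sketch of that fallback idea, for illustration: use transfer-coding when the client lists gzip in its TE request header, otherwise fall back to content-coding based on Accept-Encoding. The class `CodingNegotiation` and its methods are hypothetical, nothing here is existing Tomcat code, and the header parsing is deliberately simplified.

```java
import java.util.Locale;

/** Hypothetical illustration only; not an existing Tomcat class. */
final class CodingNegotiation {

    enum Choice { TRANSFER_ENCODING, CONTENT_ENCODING, IDENTITY }

    /** Returns true if the comma-separated header value lists gzip with a non-zero q-value. */
    static boolean listsGzip(String headerValue) {
        if (headerValue == null) {
            return false;
        }
        for (String element : headerValue.split(",")) {
            String[] pieces = element.trim().split(";");
            if (!pieces[0].trim().equalsIgnoreCase("gzip")) {
                continue;
            }
            for (int i = 1; i < pieces.length; i++) {
                String param = pieces[i].trim().toLowerCase(Locale.ROOT);
                if (param.startsWith("q=")) {
                    try {
                        return Double.parseDouble(param.substring(2).trim()) > 0.0;
                    } catch (NumberFormatException e) {
                        return false; // malformed q-value: treat gzip as not accepted
                    }
                }
            }
            return true; // gzip listed without a q-value
        }
        return false;
    }

    /** Prefers T-E when advertised, drops to C-E otherwise, else sends the body uncompressed. */
    static Choice choose(String teHeader, String acceptEncodingHeader) {
        if (listsGzip(teHeader)) {
            return Choice.TRANSFER_ENCODING;
        }
        if (listsGzip(acceptEncodingHeader)) {
            return Choice.CONTENT_ENCODING;
        }
        return Choice.IDENTITY;
    }
}
```

With the browser behaviour reported in comment 11, only a client that advertises T-E (Opera in that list) would take the first branch; clients that support T-E without advertising it would still get C-E.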
Comment 15 Mark Thomas 2009-09-09 10:00:36 UTC
As suggested in comment 11 I am going to resolve this as WONTFIX. My reasons are:
- any change to use T-E is as likely or more likely to cause breakage
- patchy browser support means another handful of options would be required to give sys admins a reasonable chance of configuring a working configuration compatible with users and their combination of proxies and/or caches
- I believe the complexity this would add to the critical path isn't worth the benefit

In my view the tipping point will be when IE supports T-E, whether or not it advertises support for it. At that point I would be all for switching to the spec compliant way of doing compression.
Comment 16 Oliver Schoett 2009-09-10 07:17:08 UTC
Created attachment 24245
Disable sending and interpreting ETags (needs to be made into an option)

Not fixing this bug makes it impossible to enable gzip compression on public web sites, because IE6 users behind Squid 2.6 and 2.7 proxies will receive broken content: IE6 by default does not allow compression behind a proxy, but Squid 2.6+ will deliver gzipped content that it already has in the cache, and which is not accepted by IE.

Squid has implemented the option broken_vary_encoding to work around this, which by default is enabled for servers whose header begins with "Apache". However, this option is buggy (http://www.squid-cache.org/bugs/show_bug.cgi?id=2574), and Tomcat should not require work-arounds by others for its broken behavior.

Thus, an option is needed to disable ETags to make public sites work reliably. What needs to be done is contained in the patch, which disables sending and interpreting ETags. This patch (against 6.0.18) has been used successfully in production since February on a German e-commerce site (90 million page views per month). There is no performance impact, because 304 responses are still generated according to the "If-Modified-Since" logic.

Unfortunately, I do not know Tomcat well enough to make this a configurable option.
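As a rough illustration of why 304 responses survive when ETags are switched off, the sketch below decides the status of a conditional GET with an ETag on/off flag. It is not the attached patch: the class `ConditionalGet`, the `useETags` flag, and the whole-second comparison are assumptions; `-1` stands for an absent If-Modified-Since header, mirroring what `HttpServletRequest.getDateHeader` returns in that case.

```java
/** Hypothetical illustration only; not taken from attachment 24245. */
final class ConditionalGet {

    /**
     * Returns 304 if the cached copy is still valid, else 200.
     * When useETags is false, If-None-Match is ignored entirely and only
     * the If-Modified-Since comparison can produce a 304.
     */
    static int status(boolean useETags, String ifNoneMatch, String currentETag,
                      long ifModifiedSinceMillis, long lastModifiedMillis) {
        if (useETags && ifNoneMatch != null) {
            return ifNoneMatch.equals(currentETag) ? 304 : 200;
        }
        if (ifModifiedSinceMillis != -1L) {
            // HTTP dates have one-second resolution, so compare whole seconds.
            boolean unchanged = lastModifiedMillis / 1000L <= ifModifiedSinceMillis / 1000L;
            return unchanged ? 304 : 200;
        }
        return 200;
    }
}
```

Because a response sent with ETags disabled carries no ETag header, a cache has no validator that it could wrongly reuse across the gzipped and ungzipped variants.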
Comment 17 Mark Thomas 2009-09-10 14:43:14 UTC
Since ETag handling is wholly within the DefaultServlet, just add an option to that servlet. You can use the DefaultServlet's readOnly option as a template.