Bug 63282 - Cache keys incorrect on rewrite
Status: NEW
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_cache_disk / mod_disk_cache
Version: 2.4.38
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-03-24 18:33 UTC by Adam
Modified: 2021-05-03 19:20 UTC



Description Adam 2019-03-24 18:33:31 UTC
Hello all,

I've been working on this for a while but have been unable to find a proper solution in the docs, through trial and error, or on the mailing list. I believe this may be a bug, so I thought I would raise it here and see what you all think. I also searched past reports and could not find a previous mention of this, so I apologize if I missed something and it has already been reported.

The main issue is that the Apache disk cache (I haven't tried other caching backends) stores a request that was rewritten by mod_rewrite under the rewritten URL rather than the originally requested one. This is most evident on CMSs that use so-called "pretty URLs" or "permalinks", where all requests are rewritten to "index.php" for processing. So, if I request, say:

http://domain.com/page/1

This gets rewritten internally to http://domain.com/index.php and served normally. As a result, the cache key shows the rewritten URI rather than the original one, for example (when viewed with htcacheclean):

http://domain.com/index.php? 942 20436 200 0....etc....

This means that if I then request, say, http://domain.com/page/2, it is also rewritten to index.php, so the cache erroneously concludes that it already holds my request and serves the cached entry instead of recognizing that the request is different.

I would expect the cache to store the response under a key based on the originally requested resource rather than whatever it gets rewritten to internally (redirects and such notwithstanding), so that the above would instead appear as:

http://domain.com/page/1? 942 20436 200 0....etc....
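
To make the difference concrete, here is a minimal Python sketch of the two keying strategies. It is a toy model, not mod_cache's actual code: the function names, the in-memory dict standing in for the disk cache, and the "every path goes to index.php" rewrite are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def rewrite(url: str) -> str:
    """Toy front-controller rewrite (assumed): every path maps to /index.php."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/index.php?{parts.query}"


def key_from_rewritten(url: str) -> str:
    """Key built from the rewritten URL, as observed in this report."""
    return rewrite(url)


def key_from_original(url: str) -> str:
    """Key built from the originally requested URL, as expected."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}{parts.path}?{parts.query}"


def backend(url: str) -> str:
    """Stand-in for index.php: the content depends on the original path."""
    return f"content for {urlsplit(url).path}"


def serve(url: str, cache: dict, key_fn) -> str:
    """Return the cached response for url, filling the cache on a miss."""
    key = key_fn(url)
    if key not in cache:
        cache[key] = backend(url)
    return cache[key]


if __name__ == "__main__":
    for key_fn in (key_from_rewritten, key_from_original):
        cache = {}
        serve("http://domain.com/page/1", cache, key_fn)
        second = serve("http://domain.com/page/2", cache, key_fn)
        print(key_fn.__name__, "->", second, sorted(cache))
```

With the rewritten-URL key, both requests share the single entry `http://domain.com/index.php?`, so the second request receives the first page's content. With the original-URL key, each page gets its own entry.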

I hope this makes sense?

Thank you in advance for your help!

- Adam
Comment 1 Raphaël Droz 2020-06-15 21:26:28 UTC
We can think of this as the opposite of https://bz.apache.org/bugzilla/show_bug.cgi?id=21935

While #21935 is about using the rewritten query (in order to use one deduplicated cache entry for multiple query strings), this one is about not merging distinct client URIs that are rewritten to an identical final path.

Someone provided a neat workaround at http://mrclay.org/2013/09/13/apache-mod_cache-and-mod_rewrite-danger/ but it's not always suitable or desirable.
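
One way a workaround of this kind can operate is to carry the original path into the rewritten URL's query string, so that distinct client URIs no longer collapse to the same key. This is an assumption for illustration and may differ in detail from the linked post. A rough Python sketch of the idea, where the parameter name `__path` and the function name are hypothetical:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def rewrite_with_path(url: str, param: str = "__path") -> str:
    """Toy rewrite to /index.php that appends the original path as a query
    parameter (hypothetical name), keeping any existing query string."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, parts.path.lstrip("/")))
    return f"{parts.scheme}://{parts.netloc}/index.php?{urlencode(query)}"


if __name__ == "__main__":
    # The rewritten URLs, and therefore any keys derived from them, now differ.
    print(rewrite_with_path("http://domain.com/page/1"))
    print(rewrite_with_path("http://domain.com/page/2"))
```

The cost of this approach is that the application has to read the path from the query parameter, which is one reason it may not suit every setup.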

Having better control over what the final cache key looks like, with respect to both URI and query string, would definitely be nice to have!
Comment 2 asf 2021-05-03 19:20:03 UTC
Just a suggestion -- maybe instead of finding one silver-bullet solution that fits all possible cases, just let admins decide? A (vhost-level?) setting roughly similar to "UseCanonicalName" that would make the cache indexed by either the rewritten or the original URL?
In my case, an overgrown cache may be inconvenient, but a cache that serves the same content for different requests is of no use...