This is undoubtedly a corner case, but I've run into some flexibility issues configuring mod_cache to work well with a combination of mod_proxy and mod_rewrite. Specifically, mod_cache cannot be enabled or disabled on anything more than the leading portion of a URI path (without resorting to cumbersome URI rewriting kludges). Additionally, because mod_cache uses the quick_handler hook, it interrupts most downstream modules whenever it decides to return cached content, so they cannot make decisions about whether or not to cache. post_read_handler is the obvious exception, but given the nearly unconditional way in which mod_cache intercepts requests, resorting to intercept-and-avoidance trickery in post_read_handler is rather inelegant.

I do understand that using quick_handler is a performance win for the majority of minimal-configuration caching needs. This could certainly be worked around with subrequests, but I would prefer not to deal with the overhead of a subrequest on every transaction (which is what would be necessary in _my_ particular case; others may have better solutions).

That said, the attached patch to 2.1-HEAD was my solution to this issue. The following is a list of changes, some of which may go beyond what was necessary and violate various development API integrity rules. If that is the case, I would be happy to remove or alter certain portions (and I'll mention some discomforts of my own below as well).

Changes:

1. Added two optional hooks, cache_check_enabled and cache_check_disabled:

   A. cache_check_enabled is run from ap_cache_get_providers to determine whether a particular URI (or other condition) is cause to enable caching. The default handler for this hook implements the original functionality by iterating the cacheenable list and adding each entry that matches the left-most portion of the URI path.

   B. cache_check_disabled is run from ap_cache_get_providers to determine whether caching should be disabled. The first hook to return DECLINED causes mod_cache to stop trying to find a provider. Again, the default handler performs the original functionality by iterating the cachedisable list. In addition, a check_disabled hook may return CACHE_DEFER, which makes mod_cache refuse to return cached content while in the quick_handler hook. Instead it tries again from a regular content hook (see below).

2. New optional function: ap_cache_request_enable_provider. It is intended for those who hook check_enabled, to add a provider name ("type" seems to be the parlance in mod_cache at that level) and an optional version number to the list of providers that ap_cache_get_providers() will try to look up. Calling ap_cache_request_enable_provider is a module's way of telling mod_cache to attempt caching. The function name is a tad cumbersome; the "request" is only in there to indicate that it is a per-request call, not a general-use function for enabling providers. Perhaps this shouldn't be an optional function, because it's functionally identical to a normal API call. If that is the case, then check_enabled and check_disabled shouldn't be optional hooks either.

3. Added a content handler to mod_cache so that it (or others) can choose, selectively, to handle a request _after_ other modules have taken their turn. This is particularly useful for mod_rewrite. Note that the request handler must be set to "cache-server", which is done automatically if a check_disabled handler returns CACHE_DEFER inside the context of cache_url_handler. mod_rewrite can also enable caching this way by setting the content handler in a rewrite rule. In my case, this is useful for enabling both reverse proxying and caching for requests that meet certain header constraints.

4. Added a new directive, "CacheDefer", which when toggled on forces the above behavior (handling from the content handler) to be the default. This was completely arbitrary, but it provided the functionality I needed and was useful for testing. Obviously, with the above changes this could be done from anywhere. I'm not crazy about the name either; it is non-intuitive for those unfamiliar with the code.

5. The majority of the mod_cache.h internals were moved to cache_private.h, because there are now some intentionally public exports. The now minimal mod_cache.h was added to $top_srcdir/Makefile.in for the install-include target. All mod_cache-related sources that previously referenced mod_cache.h were changed to reference cache_private.h. This might need some dependency fixups; I didn't go that far.

Thank you for your time. I hope this will be of some use. If there are any questions or requested changes, please feel free to let me know (or just have bugzilla do it =P)

Jesse Sipprell
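
For context, the prefix-only matching described in this report can be sketched with the stock CacheEnable and CacheDisable directives. This is only an illustration under assumptions: the "disk" provider is taken to be loaded, and the /app paths are hypothetical, not from the report.

```apache
# Hypothetical paths, for illustration only.
# Caching is enabled for any URL whose path starts with /app ...
CacheEnable disk /app
# ... and disabled for any URL whose path starts with /app/private.
CacheDisable /app/private
# Both decisions key off the leading portion of the URI path only;
# conditions such as request headers cannot be expressed here.
```

Anything that cannot be phrased as a path prefix is what the check_enabled and check_disabled hooks in the patch are meant to cover.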
Created attachment 13375: fine-grained enable/disable enhancements to mod_cache
httpd-trunk supports the CacheQuickHandler directive, which lets you run the cache as a normal handler, so most of the cases described below should now work. Assuming this problem still exists, can you verify that CacheQuickHandler helps with the issues below?

The attached patch seems to attempt a number of things at the same time, which makes it difficult to review.
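
A minimal sketch of what trying this might look like, assuming a build that includes the CacheQuickHandler directive and a disk cache provider. The cache root path is a hypothetical example, not something taken from this report.

```apache
# Run mod_cache as a normal handler instead of from the quick handler,
# so that other modules (e.g. mod_rewrite) get their turn first.
CacheQuickHandler off

# Hypothetical cache location; adjust to the local setup.
CacheRoot /var/cache/apache2/mod_cache
CacheEnable disk /
```

With the quick handler off, cached responses are served later in request processing, which trades some of the performance benefit for the ability of other modules to act on the request.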
Is this going to be backported, or made available as a single patch I could try to apply to the 2.2.14 sources?

We have a problem when using mod_cache and mod_rewrite together with the Expires header enabled in a PHP application: mod_cache always returns the same content on subsequent requests, no matter which URL was requested from Apache. mod_cache seems to see only index.php in its cache, while the rewrite rules route all URLs through this file. Example setup:

RewriteEngine on
RewriteRule ^(favicon\.ico|robots\.txt) - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-l
RewriteRule .* index.php

On the first request (cold cache), when we request /real/url.html, the server returns the right content. On the next request it serves the content from the cache, as intended. Next we request /another/url.html, and the cache simply serves the content of the first URL without going through index.php (which uses PATH_INFO to extract the URL data).

I think this is because mod_cache only hashed "index.php" as the URL, which in both cases had no query string, and so it treats the two different requests as equal. As far as I understand, this should be solvable by using "CacheQuickHandler off".
(In reply to comment #3)
> Is this going to be backported or available as a single patch I could try to
> apply to 2.2.14 sources?
>
> We have a problem where using mod_cache and mod_rewrite together and enabling
> Expires header in a php application, mod_cache always returns the same content
> on subsequent requests no matter which URL was requested from apache. mod_cache
> seems to only see the index.php in its cache while rewrite rules route all
> URLs through this file. Example setup:
>
> RewriteEngine on
> RewriteRule ^(favicon\.ico|robots\.txt) - [L]
> RewriteCond %{REQUEST_FILENAME} !-f
> RewriteCond %{REQUEST_FILENAME} !-d
> RewriteCond %{REQUEST_FILENAME} !-l
> RewriteRule .* index.php
>
> On the first request (cold cache) when we request /real/url.html from the
> server it returns the right content. On the next request it serves the content
> from cache as intended. Next we request /another/url.html and the cache simply
> serves the content of the first url without going through the index.php (which
> uses PATH_INFO to extract the URL data).
>
> I think this is because mod_cache only hashed "index.php" as the URL which had
> in both cases no query string and it thus handles both different requests as
> equal. As far as I understood this should be solvable by using
> "CacheQuickHandler off".

This sounds strange and IMHO should not happen with 2.2.14. Please set your LogLevel to debug and provide the error log output for:

1. Startup with a cold cache
2. Request /real/url.html
3. Request /real/url.html (from cache)
4. Request /another/url.html
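
The logging change requested here could look roughly like the following in the main server configuration. This is a sketch only; the ErrorLog path is a hypothetical example and may already be set differently in the existing configuration.

```apache
# Emit debug-level messages, including mod_cache's, to the error log.
LogLevel debug
# Hypothetical log location, relative to ServerRoot.
ErrorLog logs/error_log
```

After restarting the server, the four requests listed above can be issued in order and the matching error log lines attached to the report.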
No reply to info request; closing.