Summary: | mod_cache sometimes serves old content despite having fetched new content | ||
---|---|---|---|
Product: | Apache httpd-2 | Reporter: | moog <moog> |
Component: | mod_cache | Assignee: | Apache HTTPD Bugs Mailing List <bugs> |
Status: | RESOLVED LATER | ||
Severity: | normal | Keywords: | MassUpdate |
Priority: | P2 | ||
Version: | 2.2.8 | ||
Target Milestone: | --- | ||
Hardware: | PC | ||
OS: | Linux | ||
Attachments: |
Patch to store body before headers
patch to allow multiple passes better comments iterate over buckets in a brigade to find eos |
Description
moog
2008-03-27 14:25:28 UTC
Created attachment 22429 [details]
Patch to store body before headers
The race condition happens when there are multiple http workers,
After the first request detects that the cache has expired, it tries to update the cache in cache_save_filter.
Here, the storing of the new data takes place in two steps, first the headers are stored, and then the body.
If a request comes to a second http worker when the headers are already written but the body is not, the second http-worker assumes that the cached body is valid (since the headers are valid). This causes the second worker to return the stale cache.
The fix is the switch the order of writing the cache so that body is written first. With this, if there is a second request before headers are stored, the second http-worker only assumes that the cache is stale and fetches it again,
and overwrites the cache again.
The attached patch does not fix all cases: 1. store_body is no atomic operation. The content of the entity may be split across several brigades. So the race would be only fixed if the headers would be stored *after* a brigade containing an EOS bucket was passed to store_body. Keep in mind that store_body is called at 2 different positions in the CACHE_SAVE filter: One is used during the first pass of the filter and the other one is used during the following passes. But it is not guaranteed that more than one pass ever happens (e.g. if the first brigade already contains the EOS bucket). 2. There is still a second race: Lets assume the entity is still fresh if a request does not contain additional conditions. Now we have a request that requests a fresh response e.g. via max-age=0. Even if the patch works as above there is the possibility that a second client got the old headers decides the entity is fresh enough for him and then tries to fetch the body which is now the new one. Created attachment 22433 [details]
patch to allow multiple passes
Thanks for pointing these out,
for the first one, I have modified the patch to look for APR_BUCKET_EOS before
commiting the headers. (at both points)
I do not understand the second issue,
if we have a fresh entitiy 1:head/url 1:body/url in our cache,
and Request A comes in with max-age = 0,
as it proceeds, the body gets written first, so we now have 1:head/url 2:body/url
Now assuming that at this point Request B comes in, and reads the headers
1:head/url but since it does not have max-age=0, it decides it can make do
with the current body, but the body is 2:body/url
However, there is no harm in serving a newer response? we only have a problem
with serving stale content, and 2:body/url is not stale.
Created attachment 22434 [details]
better comments
(In reply to comment #3) > Created an attachment (id=22433) [details] > patch to allow multiple passes > > Thanks for pointing these out, > for the first one, I have modified the patch to look for APR_BUCKET_EOS before > commiting the headers. (at both points) Thanks for the patch. Some comments: 1. I know that mod_disk_cache does the same, but I am a bit little worried that there might be brigades that contain an EOS bucket which is *not* the last bucket of the brigade. Yes this means we would need to iterate over the whole brigade each time to find out if there is an EOS bucket in the brigade or not. This increases effort and lowers performance. 2. If you move cache->info = info; before the first if (cache->stale_handle) { you can merge both if (cache->stale_handle) { together in one block. > > I do not understand the second issue, > > if we have a fresh entitiy 1:head/url 1:body/url in our cache, > and Request A comes in with max-age = 0, > as it proceeds, the body gets written first, so we now have 1:head/url > 2:body/url > > Now assuming that at this point Request B comes in, and reads the headers > 1:head/url but since it does not have max-age=0, it decides it can make do > with the current body, but the body is 2:body/url > > However, there is no harm in serving a newer response? we only have a problem > with serving stale content, and 2:body/url is not stale. > 2:body is not stale, but 1:headers and 2:body might not match together, e.g. regarding the Etag or Content-MD5 headers. Also Range requests could deliver false results in this case. Created attachment 22441 [details]
iterate over buckets in a brigade to find eos
For the second race condition, Using remove_url the first time in
cache_store_content should work, but I am concerned that remove_url is not
light weight enough. (It deletes the headerfile, datafile and the directories
if empty, while what I require is to just remove headers before the caching is
started.)
What would you suggest? Is there a better way? or would it be better to add a 'remove_headers' method to the cache provider interface?
(In reply to comment #6) > Created an attachment (id=22441) [details] > iterate over buckets in a brigade to find eos Thanks. > > For the second race condition, Using remove_url the first time in > cache_store_content should work, but I am concerned that remove_url is not > light weight enough. (It deletes the headerfile, datafile and the directories > if empty, while what I require is to just remove headers before the caching is > started.) > > What would you suggest? Is there a better way? or would it be better to add a > 'remove_headers' method to the cache provider interface? > I guess I was partly wrong with this race condition. It is not that large as I anticipated first, because during opening the entity in open_entity where the headers are read the fd for the datafile is also opened. As we do a move later on to move the new body from the temporary file to the new datafile location this should not harm. The open fd still references to the old body datafile and the old body datafile should be gone after all fds to it should be closed. But of course there is still a possibility that the body file switches when the old headers are read in open_entity and before opening the old datafile. I currently have no good idea how to avoid this race. Calling remove_url the first time in cache_store_content is IMHO a bad idea as this can have severe performance implication if the delivery of the new content takes a long time for whatever reason (slow client, slow backend). So this is no way to go. Undo spam change Please help us to refine our list of open and current defects; this is a mass update of old and inactive Bugzilla reports which reflect user error, already resolved defects, and still-existing defects in httpd. As repeatedly announced, the Apache HTTP Server Project has discontinued all development and patch review of the 2.2.x series of releases. The final release 2.2.34 was published in July 2017, and no further evaluation of bug reports or security risks will be considered or published for 2.2.x releases. All reports older than 2.4.x have been updated to status RESOLVED/LATER; no further action is expected unless the report still applies to a current version of httpd. If your report represented a question or confusion about how to use an httpd feature, an unexpected server behavior, problems building or installing httpd, or working with an external component (a third party module, browser etc.) we ask you to start by bringing your question to the User Support and Discussion mailing list, see [https://httpd.apache.org/lists.html#http-users] for details. Include a link to this Bugzilla report for completeness with your question. If your report was clearly a defect in httpd or a feature request, we ask that you retest using a modern httpd release (2.4.33 or later) released in the past year. If it can be reproduced, please reopen this bug and change the Version field above to the httpd version you have reconfirmed with. Your help in identifying defects or enhancements still applicable to the current httpd server software release is greatly appreciated. |