Bug 49391 - mod_disk_cache needs to register a cleanup to kill failed responses
Summary: mod_disk_cache needs to register a cleanup to kill failed responses
Status: NEW
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_cache_disk / mod_disk_cache (show other bugs)
Version: 2.5-HEAD
Hardware: All All
: P2 normal (vote)
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords: PatchAvailable
Depends on:
Blocks:
 
Reported: 2010-06-05 12:36 UTC by Graham Leggett
Modified: 2019-03-29 06:05 UTC (History)
0 users



Attachments
Patch to use a pool cleanup to clean up after failure (6.42 KB, patch)
2010-06-05 12:36 UTC, Graham Leggett
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Graham Leggett 2010-06-05 12:36:54 UTC
Created attachment 25531 [details]
Patch to use a pool cleanup to clean up after failure

Currently, mod_disk_cache responds to errors during reading from backend, but is not given the opportunity to clean up after itself when errors occur to the frontend.

Make the cache cleanup use a proper pool cleanup.
Comment 1 Graham Leggett 2010-06-05 13:08:43 UTC
This is probably a little more involved.

Currently, the original code, and the new cleanup code, invalidates the current cached entry on a backend cache failure (and now frontend cache failure), which is too extreme.

The headers and body should be switched in at the same time, not headers first then body. This removes the need to invalidate a previous entry on error.
Comment 2 M8R-cxm5si 2010-07-01 03:52:29 UTC
See many aptmp* files on Solaris webservers too.

Looking at mod_disk_cache.c it appears that:
  1. a tmp file is created, headers written, and tmp file renamed to header file
      i.e. store_headers()
  2. a tmp file is created, some data written
      i.e. store_body() for each bucket brigade
  3.   ... more data written to tmp file
      i.e. store_body()
  4. file_cache_el_final() is called to rename tmp file to data file

If a client terminates a connection (presses STOP in their browser, quite a common thing to happen that should be correctly handled) then it is likely that store_body() will not be called with a APR_BUCKET_IS_EOS() final bucket. Hence file_cache_el_final() will never be called.

It would be undesirable to call file_cache_el_final() in the event the connection was broken because we wouldn't want a partial (invalid) cached object persisting.

It would be undesirable to use a pool call back to call file_cache_errorcleanup() in a pool cleanup if the cache store was successful.

So, suggest add a new member to disk_cache_object_t* dobj, an ENUM with the following states:
  MOD_DISK_CACHE_NOCACHE
  MOD_DISK_CACHE_HEADERCACHED
  MOD_DISK_CACHE_DATAPARTIALCACHED
  MOD_DISK_CACHE_CACHED

Then register a cleanup function with request pool (r->pool). If state is NOCACHE or CACHED then merely return. If state is HEADERCACHED then remove the header. If state is DATAPARTIALCACHED then remove header and tmpfile.

Considering that an aborted connection is common place on any serious server this should have a high priority.
Comment 3 Graham Leggett 2010-07-01 20:52:46 UTC
It's even more complex still.

We need cache updates to be atomic, right now they are not. The idea of having a well-known filename for the header, and a well-known filename for the data is currently bogus, as you can't update both at the same time.

What we need to do is give a unique name to the data, and then store that unique name within the headers file. We are then in a position to write the data body in place at leisure, and when done, we insert that data filename into the well-known header file, and write the header file atomically.

If the connection is aborted for whatever reason before the header is written, we simply blow away the data file in the cleanup, and we're done. After the header is written the cache's work is done, so there is no need for a cleanup.

The order that the hooks are called is not important from mod_disk_cache's perspective, mod_disk_cache is not obligated to write the headers when it receives them. Instead it can set aside the headers and delay the writing of the headers to the point at which we receive the final data and EOS is detected.