Bug 56977

Summary: segfaults when using mod_mem_cache + mod_disk_cache on a reverse proxy with mod_proxy + mod_proxy_http
Product: Apache httpd-2 Reporter: bpkroth
Component: mod_mem_cache Assignee: Apache HTTPD Bugs Mailing List <bugs>
Status: RESOLVED LATER    
Severity: major CC: bpkroth
Priority: P2 Keywords: MassUpdate
Version: 2.2.22   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Attachments: A test PHP file to help in reproducing the crash.
The coredump backtrace output from the reverse proxy.

Description bpkroth 2014-09-12 16:52:31 UTC
Created attachment 32013 [details]
A test PHP file to help in reproducing the crash.

We have an application (Moodle 2.6) that switches from sending "Cache-Control: public" to "Cache-Control: private" headers for the same content depending upon whether or not the user is logged in.

That application is served via a reverse proxy using mod_proxy/mod_proxy_http to backends running mod_php5/mod_xsendfile (among others).

Traditionally we've also allowed the use of mod_disk_cache on the reverse proxy.

Recently we also added mod_mem_cache *before* the mod_disk_cache confs so that highly requested content could be served more rapidly from the in memory cache:
<IfModule mod_mem_cache.c>
CacheEnable mem /
</IfModule>
<IfModule mod_disk_cache.c>
CacheEnable disk /
</IfModule>

We discovered that, for content that switches between public and private Cache-Control headers, the Apache process segfaults when a request is made for content that is in the cache (originally stored via public Cache-Control headers) but has expired, so that a revalidation request is required.

Based on the backtrace from the coredump, the segfault appears to occur while mod_mem_cache is attempting to remove the now-stale entry (stale because the response now carries Cache-Control: private) from the cache.
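
To make the triggering sequence concrete, here is a minimal Python sketch of the decision a shared cache faces in this situation. It is a simplified model only, not mod_cache or mod_mem_cache code; the names CachedEntry, directives and next_action are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class CachedEntry:
    """Hypothetical stand-in for a stored response; not an httpd structure."""
    cache_control: str   # Cache-Control value seen when the entry was stored
    age: int             # seconds since the entry was stored
    max_age: int         # freshness lifetime in seconds


def directives(cache_control: str) -> set:
    """Split a Cache-Control value into lowercase directive names."""
    return {part.split("=", 1)[0].strip().lower()
            for part in cache_control.split(",") if part.strip()}


def next_action(entry: CachedEntry, backend_cache_control: str) -> str:
    """Decide what a shared cache should do with an entry on the next request."""
    if entry.age < entry.max_age:
        return "serve-from-cache"      # still fresh, backend is not contacted
    # Entry is stale: the backend is asked again and its new headers decide.
    if directives(backend_cache_control) & {"private", "no-store"}:
        return "remove-entry"          # no longer storable in a shared cache
    return "refresh-entry"


# Stored as public, now expired, and the backend has switched to private.
entry = CachedEntry(cache_control="public, max-age=60", age=120, max_age=60)
print(next_action(entry, "private"))
```

The "remove-entry" branch corresponds to the step where, judging by the backtrace, the crash happens.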

If we remove mod_mem_cache from the mix, then this does not occur.  mod_disk_cache appropriately stores, serves, and then removes the cache entry as necessary.

Attached are a test php script (proxycaching-index.php) and proxy coredump backtrace output (reverse-proxy-backtrace).

It was also reported to me, though I could not reproduce this other error case, that aside from segfaults, sometimes the mod_mem_cache configuration would just return "garbage data".  My guess would be that it returned partial data or data from an alternative Vary.

Also, I wasn't able to reliably reproduce this bug with content < MCacheMaxObjectSize or without the "Accept-Ranges: bytes" header.  I suspect that those aspects are also tied up in the issue.

So, to test this I did the following:
- set up a reverse proxy with mod_mem_cache and mod_disk_cache (in that order).
- place proxycaching-index.php in /proxycaching/ of a vhost.  Make sure $private = 0 at the top of the file
- add "XSendFile On" in that backend vhost's .htaccess (just a convenience for sending the file)
- copy /usr/share/cups/data/default-testpage.pdf (or some other pdf) into /proxycaching/default-testpage.pdf.  It may or may not be important that the size of that file is > MCacheMaxObjectSize
- run the following a few times to prime the cache in each worker mpm process with "public content":
# curl -v -s -o /tmp/curl.out http://vhostaddress/proxycaching/index.php; file /tmp/curl.out; ls -l /tmp/curl.out
- switch $private = 1 in the /proxycaching/index.php file
- rerun the curl command a few times
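
The repeated curl runs above can also be sketched as a small Python probe, which makes it easier to see which attempts fail. This is only an illustration: the function names fetch and probe are hypothetical, and it assumes that a crashed worker shows up on the client side as a reset or truncated response rather than as a normal HTTP reply.

```python
import http.client
import urllib.request


def fetch(url: str) -> int:
    """Fetch the URL once and return the body size in bytes."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return len(response.read())


def probe(url: str, attempts: int, fetch_fn=fetch) -> list:
    """Request the URL repeatedly; record the body size or the error type."""
    results = []
    for _ in range(attempts):
        try:
            results.append(("ok", fetch_fn(url)))
        except (OSError, http.client.HTTPException) as exc:
            # Assumption: a segfaulting worker is seen here as a connection
            # reset, an empty reply, or an incomplete read.
            results.append(("error", type(exc).__name__))
    return results
```

Calling probe("http://vhostaddress/proxycaching/index.php", 5) before and after switching $private would mirror the two curl phases; vhostaddress is the same placeholder used in the curl command.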

Let me know if you need any more details.

Thanks,
Brian
Comment 1 bpkroth 2014-09-12 16:53:00 UTC
Created attachment 32014 [details]
The coredump backtrace output from the reverse proxy.
Comment 2 Ruediger Pluem 2014-09-12 17:47:05 UTC
Honest opinion: Don't use mod_mem_cache. It does not speed things up compared to mod_disk_cache. mod_mem_cache's cache is not shared between different httpd processes, so you waste more memory and get less performance. Given that you have enough memory in your server, mod_disk_cache content is kept in the buffer caches by the OS. If you don't use SSL, content is sent via sendfile, which moves data from the buffer caches to the socket directly inside the kernel. If you are using SSL, content will need to be MMAPed, which is still very fast.
Comment 3 bpkroth 2014-09-12 18:15:02 UTC
(In reply to Ruediger Pluem from comment #2)
> Honest opinion: Don't use mod_mem_cache. It does not speed up things
> compared to mod_disk_cache. mod_mem_cache's cache is not shared between
> different httpd processes. So you waste more memory for getting less
> performance. Given that you have enough memory in your server mod_disk_cache
> content is kept in the buffer caches by the OS. If you don't use SSL stuff
> is send via sendfile which moves stuff from the buffer caches to the socket
> directly inside the kernel. If you are using SSL stuff will need to be MMAP
> which is still very fast.

Yeah, I don't disagree.  Under very high load there is a difference between mod_mem_cache and mod_disk_cache in so far as the latter requires some extra syscalls to the OS for file permissions and handles and the like, which can be particularly expensive in a VM environment, but that is kind of an edge case.
Comment 4 Eric Covener 2014-09-12 18:34:41 UTC
(In reply to bpkroth from comment #1)
> Created attachment 32014 [details]
> The coredump backtrace output from the reverse proxy.

I think this crash is the same as my recent question in dev@httpd thread "mod_cache/mod_mem_cache questions" about the difference between remove_url and remove_entity.
Comment 5 Yann Ylavic 2014-09-12 18:59:57 UTC
(In reply to bpkroth from comment #3)
> (In reply to Ruediger Pluem from comment #2)
> > Honest opinion: Don't use mod_mem_cache. It does not speed up things
> > compared to mod_disk_cache. mod_mem_cache's cache is not shared between
> > different httpd processes. So you waste more memory for getting less
> > performance. Given that you have enough memory in your server mod_disk_cache
> > content is kept in the buffer caches by the OS. If you don't use SSL stuff
> > is send via sendfile which moves stuff from the buffer caches to the socket
> > directly inside the kernel. If you are using SSL stuff will need to be MMAP
> > which is still very fast.
> 
> Yeah, I don't disagree.  Under very high load there is a difference between
> mod_mem_cache and mod_disk_cache in so far as the latter requires some extra
> syscalls to the OS for file permissions and handles and the like, which can
> be particularly expensive in a VM environment, but that is kind of an edge
> case.

A good alternative is also to use mod_disk_cache on a directory which is a (mounted) ramdisk.
Comment 6 bpkroth 2014-09-12 19:22:15 UTC
(In reply to Yann Ylavic from comment #5)
> (In reply to bpkroth from comment #3)
> > (In reply to Ruediger Pluem from comment #2)
> > > Honest opinion: Don't use mod_mem_cache. It does not speed up things
> > > compared to mod_disk_cache. mod_mem_cache's cache is not shared between
> > > different httpd processes. So you waste more memory for getting less
> > > performance. Given that you have enough memory in your server mod_disk_cache
> > > content is kept in the buffer caches by the OS. If you don't use SSL stuff
> > > is send via sendfile which moves stuff from the buffer caches to the socket
> > > directly inside the kernel. If you are using SSL stuff will need to be MMAP
> > > which is still very fast.
> > 
> > Yeah, I don't disagree.  Under very high load there is a difference between
> > mod_mem_cache and mod_disk_cache in so far as the latter requires some extra
> > syscalls to the OS for file permissions and handles and the like, which can
> > be particularly expensive in a VM environment, but that is kind of an edge
> > case.
> 
> A good alternative is also to use mod_disk_cache on a directory which a
> (mounted) ramdisk cache.

This is a little off topic from what the bug was actually about (errors in mod_mem_cache), but I'll bite.

Someone correct me if I'm wrong, but using tmpfs/ramdisk won't avoid the file open(), read()/sendfile(), write(), etc. syscalls: those still go through the OS to access the cache files, whereas mod_mem_cache just stays in the Apache process space when doing cache lookups.  I believe that context switch is what accounts for the performance difference between mod_mem_cache and mod_disk_cache.

As Ruediger pointed out, if you have enough memory free, then the OS is already going to do a good job of caching the dirents, inode and data blocks in the page cache anyways, so you shouldn't be seeing any major read performance differences between mod_disk_cache and mod_mem_cache aside from those calls to do the lookups and get handles on the file.  Even write performance to the cache shouldn't be too bad given the OS will probably buffer that too and write it out to disk in the background.

But I guess the only way to know for sure would be to test it :)
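
As a rough illustration of how one might measure that difference, here is a small Python micro-benchmark comparing an in-process lookup with a file-backed read. It is only a sketch: the 64 KiB object size and the names memory_cache, read_from_memory and read_from_file are arbitrary assumptions, and the numbers say nothing about httpd itself, only about the cost of crossing into the kernel on a given machine.

```python
import os
import tempfile
import timeit

payload = os.urandom(64 * 1024)          # arbitrary 64 KiB object
memory_cache = {"/object": payload}      # stays inside the process

# Write the same object to a temporary file to act as the "disk" cache.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(payload)
    cache_path = handle.name


def read_from_memory() -> bytes:
    """Look the object up without leaving the process."""
    return memory_cache["/object"]


def read_from_file() -> bytes:
    """Read the object through open()/read()/close(), each a kernel call,
    even when the data is already in the page cache or on tmpfs."""
    with open(cache_path, "rb") as cached:
        return cached.read()


rounds = 2000
in_memory = timeit.timeit(read_from_memory, number=rounds)
via_file = timeit.timeit(read_from_file, number=rounds)
print(f"in-process lookup: {in_memory / rounds * 1e6:.2f} us per read")
print(f"file-backed read:  {via_file / rounds * 1e6:.2f} us per read")
os.unlink(cache_path)                    # clean up the temporary file
```

Pointing the temporary file at a tmpfs mount versus a regular filesystem would be one way to check the ramdisk suggestion as well.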


All that said, you don't have to convince me not to use mod_mem_cache anymore.  Consider this just a heads up that it's broken in some more edge cases.  Perhaps a warning to all future users who run across it :)

Cheers,
Brian
Comment 7 William A. Rowe Jr. 2018-11-07 21:09:53 UTC
Please help us to refine our list of open and current defects; this is a mass update of old and inactive Bugzilla reports which reflect user error, already resolved defects, and still-existing defects in httpd.

As repeatedly announced, the Apache HTTP Server Project has discontinued all development and patch review of the 2.2.x series of releases. The final release 2.2.34 was published in July 2017, and no further evaluation of bug reports or security risks will be considered or published for 2.2.x releases. All reports older than 2.4.x have been updated to status RESOLVED/LATER; no further action is expected unless the report still applies to a current version of httpd.

If your report represented a question or confusion about how to use an httpd feature, an unexpected server behavior, problems building or installing httpd, or working with an external component (a third party module, browser etc.) we ask you to start by bringing your question to the User Support and Discussion mailing list, see [https://httpd.apache.org/lists.html#http-users] for details. Include a link to this Bugzilla report for completeness with your question.

If your report was clearly a defect in httpd or a feature request, we ask that you retest using a modern httpd release (2.4.33 or later) released in the past year. If it can be reproduced, please reopen this bug and change the Version field above to the httpd version you have reconfirmed with.

Your help in identifying defects or enhancements still applicable to the current httpd server software release is greatly appreciated.