Summary: New option for filters: run only if there are spare CPU cycles
Product: Apache httpd-2
Component: mod_deflate
Reporter: Dan Harkless <apache-issues>
Assignee: Apache HTTPD Bugs Mailing List <bugs>
Description Dan Harkless 2006-12-05 19:08:35 UTC
It would be nice if mod_deflate had a directive you could use to tell it to only do compression if there are spare CPU cycles available. I'm concerned about turning on mod_deflate because of the increased load on my server, especially if I were to be hit by badly-behaved robots, parallel downloaders, Slashdotting, etc.

Defining "spare CPU cycles available" could be a bit tricky, of course. It'd be great to be able to just tell it not to compress if doing so would peg the CPU at 100% (or a user-definable threshold), but implementing that would be tricky since different content can be more or less CPU-intensive to attempt to compress. An average-case or worst-case guess could be used, but how much of a dent that overhead would make in available CPU cycles would still vary by CPU type and speed. I suppose it'd have to benchmark itself to have good predictive ability, which would be starting to get pretty complex.

A simpler approach could be to have the directive be called, e.g., DeflateIfCPUUsageBelow. If the user specified 'DeflateIfCPUUsageBelow 75', mod_deflate would only compress the given content if current CPU usage were below 75%. The user would be left to do their own measurements to see how much overhead mod_deflate adds for compression, and thus where to set that threshold.

And perhaps instead of basing the decision on instantaneous CPU usage for the current CPU, it'd make more sense for the directive to work in terms of load averages, although I've always found those somewhat fuzzy and hard to use as a basis for decision-making. In any case, I think this would be a nice option because machines whose bandwidth is constrained enough to make mod_deflate highly desirable are often not going to be the *fastest* machines in the world either.
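To make the suggestion concrete, a hypothetical configuration might look like the sketch below. AddOutputFilterByType is a real mod_deflate setup directive; DeflateIfCPUUsageBelow does not exist in httpd and is only the directive name being proposed in this report:

```apache
<IfModule mod_deflate.c>
    AddOutputFilterByType DEFLATE text/html text/plain text/css
    # Hypothetical directive proposed in this report: only compress
    # when recent CPU usage is below 75%.
    DeflateIfCPUUsageBelow 75
</IfModule>
```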
Comment 1 Nick Kew 2006-12-06 04:52:13 UTC
An interesting suggestion. Are you aware of mod_load_average, which does a similar job for handlers? Your comment about mod_deflate could apply to other filters in a similar manner, and a load_average check could apply in mod_filter. Why isn't there a bugzilla entry for mod_filter? The difficulty here (as in mod_load_average) is a cross-platform way to define load. Also, bear in mind mod_cache for your own purposes.
Comment 2 Ruediger Pluem 2006-12-06 11:36:56 UTC
I agree with Nick. We should aim for a general solution here that could be used via mod_filter. In my experience, CPU usage is only usable if you take an average value of CPU usage over a reasonable amount of time; otherwise you just get unreasonable flip-flops. That's why I would regard load average as more reliable. But as Nick pointed out, there is a problem defining and measuring load platform-independently. BTW: shouldn't we move this discussion to dev@httpd? I think continuing this discussion here is somewhat pointless.
Comment 3 Dan Harkless 2006-12-06 12:20:35 UTC
Ah yes, I think I did see a reference to mod_load_average some time back when I was researching Apache throttling options, but I'd forgotten about it. It seems to be undocumented and not supported by its author, though; it's not featured with his other modules on http://www.outoforder.cc/, for instance. Sounds good to make this general-purpose for filters.

Thanks for the pointer to mod_cache. It might be good to add a note about it to the mod_deflate documentation, as it wouldn't necessarily be obvious to people who aren't experts on Apache internals that it would cache output from mod_deflate so it wouldn't have to be recompressed next time. I was confused as to how caching would work with mod_deflate (without having a separate caching proxy instance) -- for instance, a lot of what I read online (outside of httpd.apache.org) claimed it did its *own* caching, but I couldn't find any evidence of that in the documentation.

Yes, good point about "instantaneous CPU usage", a la 'top', being a misnomer, since clearly it has to average over some time period to be meaningful -- it's just a shorter time period than with load averages, without any extra factors thrown into the calculation, and it's expressed as an easy-to-understand 0-100% rather than as load averages, which can go arbitrarily high. And yeah, I hadn't really thought about this on Windows -- load averages would be even harder to understand for Windows jockeys.
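For reference, the caching Nick suggested is configured via mod_cache plus a storage provider rather than by mod_deflate itself. A minimal sketch, with an illustrative cache path, might look like this (the disk provider was mod_disk_cache in the 2.2-era httpd under discussion; it was later renamed mod_cache_disk in 2.4):

```apache
# Cache responses (including mod_deflate output) so repeat requests
# need not be recompressed. Requires mod_cache and a disk provider.
CacheEnable disk /
# Illustrative path; must be writable by the httpd user.
CacheRoot /var/cache/apache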