Created attachment 25472 [details] Prevent filter_init in mod_filter from destroying it's own harness context Currently, mod_filter can't be used as a harness for mod_deflate when a subrequest may be initiated, specifically from mod_include. Attempting to do so causes data in the response to be lost and/or sent uncompressed even though the Content-Encoding header specifies a "gzip" encoding. I originally thought this was related to bug 17629, and comments on that bug which describe similar symptoms indicate that many others do, too. However, the solution I proposed for this issue failed to address the original problem described in that bug, so I no longer think they are related. I don't believe this is related to bug 43939 either, as mod_filter wasn't being used in that case. A repeatable example of this issue follows. The httpd configuration loads mod_filter and mod_include and contains these directives: AddType text/html .shtml AddOutputFilter INCLUDES .shtml FilterDeclare smart_compress CONTENT_SET FilterProvider smart_compress DEFLATE resp=Content-Type $text/ FilterChain smart_compress Contents of example.shtml: <HTML> <BODY> Included text below line <HR /> <!--#include virtual="/includeme.shtml"--> </BODY> </HTML> Contents of includeme.shtml: This text was included Command: ( echo 'GET /example.shtml HTTP/1.1' ; \ echo 'Host: localhost' ; \ echo 'Accept-Encoding: gzip' ; \ echo '' ) | nc localhost 3080 Output: HTTP/1.1 200 OK Date: Sun, 23 May 2010 03:01:39 GMT Server: Apache/2.2.14 (Unix) Accept-Ranges: bytes Vary: Accept-Encoding Content-Encoding: gzip Transfer-Encoding: chunked Content-Type: text/html 17 This text was included 11 </BODY> </HTML> 0 Output should be (as viewed in less): HTTP/1.1 200 OK Date: Sun, 23 May 2010 03:05:12 GMT Server: Apache/2.2.14 (Unix) Last-Modified: Sun, 23 May 2010 02:51:50 GMT ETag: "2cb08c0-68-4873a01821d80" Accept-Ranges: bytes Vary: Accept-Encoding Content-Encoding: gzip Content-Length: 105 Content-Type: text/html ^_<8B>^H^@^@^@^@^@^@^C<B3><F1>^H<F1><F5><B1><E3><B2>q<F2>w<89><B4><E3><F2><CC>K<CE>)MIMQ(I<AD>(QH<CA><C9>/W<C8><C9><CC>K<E5><B2> <F1>^HR<D0>^G<AA>R<D4><D5>U΄<A8>Q(<CB>,*)M̱U<U+0487><8A><E4><A6><EA>^Ug<94><E4><E6>(<E9><EA>^B<95><EA>CL<B4>чX^@^@<B5>r<8E>Gh^@^@^@ mod_filter appears to be the only module that uses the filter_init_func field of the ap_filter_rec_t structure. This member is a pointer to function that is to be run right after the insert_filter hooks are run and right before the content handler is invoked. As most modules do their initialization at the time they are called to process data, this feature is largely unused. However, mod_filter chooses to do its initialization here so that it can correctly call the filter_init_func of each provider if it exists. mod_filter sets filter_init_func to be a pointer to its filter_init function. In filter_init, it first creates the context for the harness, before calling filter_init_func of each provider. The initial context for each provider and the ongoing context of the subsequently invoked provider are all stored in the harness context. The problem arises in that mod_filter intuitively assumes that filter_init will only ever be called once during the lifetime of the filter harness. However, when mod_include creates a subrequest, all of the filters from the main request get copied into the subrequest. Before the content handler is invoked in the subrequest, the filter_init_func of every filter in the subrequest gets called. This means that the filter_init_func of the filter harness gets called again, one addition time for each subrequest. The first thing that mod_filter does in filter_init is to assign the context pointer to freshly allocated memory. When called more than once, this essentially destroys the existing harness context and any provider contexts that it contains. In the case of mod_deflate, it loses any data waiting in the compression buffer. Furthermore, when mod_deflate is invoked again without a context, it assumes it is being called for the first time in that request. When it sees that Content-Encoding is already set to gzip, it removes itself from the filter chain, leaving the remaining output uncompressed. I don't know if it's considered correct or ideal operation that filter_init_func can be called more than once, but as it stands, mod_filter can't assume that it doesn't. I am attaching a small patch that will make filter_init first check for an existing harness context, simply returning if it finds one already exists. I tested this patch with 2.2.14.
I'm sorry. The correct output should look more like: HTTP/1.1 200 OK Date: Sun, 23 May 2010 11:29:29 GMT Server: Apache/2.2.14 (Unix) Accept-Ranges: bytes Vary: Accept-Encoding Content-Encoding: gzip Content-Length: 89 Content-Type: text/html ^_<8B>^H^@^@^@^@^@^@^C<B3><F1>^H<F1><F5><B1><E3><B2>q<F2>w<89><B4><E3><F2><CC>K<CE>)MIMQ(I<AD>(QHJ<CD><C9>/W<C8><C9><CC>K<E5><B2> <F1>^HRз<E3> <C9><C8>,<86>ȕ'^V+dBUsq<D9><E8>C<F4><DB><E8>C<8C>^C^@<A9>W^L<D9>V^@^@^@
I don't think that your patch is correct. What about filters that expect their init function to be called a second time? The following patch should fix that. Could you please give it a try? Index: modules/filters/mod_filter.c =================================================================== --- modules/filters/mod_filter.c (Revision 947412) +++ modules/filters/mod_filter.c (Arbeitskopie) @@ -100,10 +100,16 @@ { ap_filter_provider_t *p; provider_ctx *pctx; + provider_ctx *pctx_head; int err; ap_filter_rec_t *filter = f->frec; + harness_ctx *fctx = f->ctx; - harness_ctx *fctx = apr_pcalloc(f->r->pool, sizeof(harness_ctx)); + /* Only allocate if not done previously */ + if (!fctx) { + fctx = apr_pcalloc(f->r->pool, sizeof(harness_ctx)); + } + pctx_head = fctx->init_ctx; for (p = filter->providers; p; p = p->next) { if (p->frec->filter_init_func == filter_init) { ap_log_cerror(APLOG_MARK, APLOG_ERR, 0, f->c, @@ -112,6 +118,15 @@ } else if (p->frec->filter_init_func) { f->ctx = NULL; + /* + * Check if we already had a context for this filter and if + * yes restore it. + */ + for (pctx = pctx_head; pctx; pctx = pctx->next) { + if (pctx->provider == p) { + f->ctx = pctx->ctx ; + } + } if ((err = p->frec->filter_init_func(f)) != OK) { ap_log_cerror(APLOG_MARK, APLOG_ERR, 0, f->c, "filter_init for %s failed", p->frec->name);
(In reply to comment #2) > I don't think that your patch is correct. What about filters that expect their > init function to be called a second time? The following patch should fix that. > Could you please give it a try? I guess the question is whether filter_init_func _should_ be called more than once. It's description in util_filter.h is: /** The function to call before the handlers are invoked. Notice * that this function is called only for filters participating in * the http protocol. Filters for other protocols are to be * initialized by the protocols themselves. */ I read "before the handlers are invoked" to mean that I should never expect the filter_init_func to be called after the handler has been invoked. I think the best fix to this problem would be to make it so that Apache never calls filter_init_func more than once. However, that might require a change to the request_rec, and thus can only be incorporated into 2.3 or later. So, I would argue: 1) If it's decided that the behavior of calling filter_init_func a second time in a subrequest needs to be preserved for compatibility with existing 2.2 modules, then I would only apply your patch to the 2.2 branch while in trunk eliminating multiple calls to filter_init_func instead. 2) Any filter that relies on it's context being preserved in a subrequest is already broken when used with mod_filter. After fixing mod_filter's context handling, failing to call filter_init_func a second time shouldn't introduce new negative behavior. 3) I can't think of a case where one might actually expect filter_init_func to be called twice. Modules in the wild today that use filter_init_func most likely break in subrequests.
I've added a test case for this, http://svn.apache.org/viewvc?rev=947593&view=rev (will look at the patches shortly)
Created attachment 25550 [details] patch which fixes ap_invoke_filter_init (Foreword: great analysis of both this and bug 17629, Alex - thanks) 1) I don't think it makes any sense for a filter to expect a second "filter_init_func" invocation here merely because a subrequest has been created and changed some independent part of the filter chain. 2) Although there is a vague caveat regarding this in the header, the code will AFAICT re-run the "init" functions for any protocol/connection-level filters for each request and each sub-request therein, which would be kind of insane if it were the intended API. So I would argue we should simply "clarify" the API description here and fix the bug, rather than attempt to work around it. Fixing the bug is fairly simple, patch is attached and fixes the test case without causing any other regressions in the test suite, for what that's worth. I was hoping for an all-in-one solution for both this and bug 17629, by more cleanly separating the subreq's filter chain from the main request_rec, as mentioned in dev@ thread today, but couldn't get that working without regressions.
Created attachment 25554 [details] updated patch to fix ap_invoke_filter_init Finished off version of patch to fix the filter_init process. I'd hope this is uncontroversial for trunk, and will commit soonish unless I hear otherwise. I'll propose for 2.2.x but we could go with either Alex's or Ruediger's patch there instead if there is concern; I guess if there's concern it must be to retain the multi-init "feature" so Ruediger's patch is the way.
(In reply to comment #6) > Created an attachment (id=25554) [details] > updated patch to fix ap_invoke_filter_init > > Finished off version of patch to fix the filter_init process. > > I'd hope this is uncontroversial for trunk, and will commit soonish unless I > hear otherwise. +1 > > I'll propose for 2.2.x but we could go with either Alex's or Ruediger's patch > there instead if there is concern; I guess if there's concern it must be to > retain the multi-init "feature" so Ruediger's patch is the way. I think this is even fine for backporting to 2.2.x as I am now convinced that calling it on a subrequest does not make sense.
Thanks for reviewing, Ruediger. trunk: http://svn.apache.org/viewvc?view=revision&revision=953311
I just wanted to say that we've validated that this patch resolves our issue and are moving to production with 2.2.16. As far as I'm concerned, this bug can be closed.
Alex, given that it's your bug and you're happy with the conclusion, it would've made sense for you to close it. Don't be shy if there's a next time!
Sorry, I disagree. I have verified that the patch is part of 2.2.16, but the bug is not fixed nevertheless: Situation: error.shtml contains: <h1>TEST</h1><!--#include virtual="error.php" --> error.php contains: HELLO Relevant lines from httpd.conf: ScriptAlias /sys-bin/ "/www/sys-bin/" Action php5-script /sys-bin/php5/php AddHandler php5-script php AddType text/html .shtml AddOutputFilter includes;deflate .shtml (When removing ";deflate" from the AddOutputFilter line, the problem is circumvented and the output delivered uncompressed to the browser.) Request: GET /WWWTest/error.shtml HTTP/1.1 Host: www User-Agent: Dummy/1.0 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 Accept-Language: de-de,de;q=0.8,en-us;q=0.5,en;q=0.3 Accept-Encoding: gzip,deflate Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 Keep-Alive: 115 Connection: keep-alive Response (via Cut&Paste, thus not exactly): HTTP/1.1 200 OK Date: Wed, 13 Oct 2010 13:40:51 GMT Server: Apache Last-Modified: Wed, 13 Oct 2010 13:20:21 GMT Accept-Ranges: bytes Vary: Accept-Encoding Content-Encoding: gzip Content-Type: text/html Keep-Alive: timeout=1, max=500 Connection: Keep-Alive Transfer-Encoding: chunked 6 HELLO 20 ��� ��2��3� 0 Obviously the output of the PHP handler does not get compressed and thus the output is broken.
With 2.2.17 my test case does not produce the bug any longer. Can it be that the fix for Bug 43939 by accident fixed my test case, too?
(In reply to comment #12) > With 2.2.17 my test case does not produce the bug any longer. Can it be that > the fix for Bug 43939 by accident fixed my test case, too? No. It's because you hijacked a *different* bug report that just happened to have similar symptoms to 43939 (and is in fact a much older issue).