We're experiencing a problem with SSIs in both Apache 2.0.44 and 2.0.46. I have posted this issue to the apache users group twice to see if anyone else has run into this same problem but I never heard any response. I also searched through bugzilla for any bugs that might be similar. Initially, we noticed that, on certain pages on our site, SSI directives would stop being processed about halfway through the page. I began looking into the problem by checking our syntax, looking for any wierd characters in a hex editor and then finally by making small changes to the file to see if it affected the service of the page. The first oddity that I came across was that if I opened up the problematic file and inserted a single space at the beginning, the SSI directives would all be correctly processed. If I again removed the space, the problem came back..about halfway down the page, after a block of SSI conditionals, the rest of the SSIs were not processed and the SSI tags were sent to the browser. After a little more investigation, I noticed that the problem seemed to always manifest itself when the last character of an SSI conditional block (e.g. the final > in <!--#endif-->) occurred at the 8000th byte in the file . To test this out, I created a completely new test file, ensuring that the final character in a conditional SSI block was the 8000th byte and I placed another include immediately after the conditional block. Sure enough, the directives in the conditional block were processed successfully but the include after the conditionals was sent to the browser. I then tested this same issue out using a virtual include directive instead of a conditional block and the result was the same. I tried my hand with the mod_include source and added some debug statements but, honestly, I'm not much of a C coder and I'm not terribly familliar with the internals of Apache. I have attached two versions of my test file, one that works (test_working.html) and one that doesnt (test_broken.html). The only difference between these two files is a space at the very beginning. Additional platform and version information is listed at the bottom of this description. Please let me know if you need any additional information. Thanks. Kevin OS: Debian Linux 3.0, 2.4.18 kernel Platform: Intel x86 Apache version: 2.0.44, 2.0.46
Created attachment 6976 [details] working SSI file
Created attachment 6977 [details] problematic SSI file
Thanks for your detailed description! Will dig into the code.
Hmm. That is really strange. I cannot reproduce the situation. I'm sure I saw such a situation earlier (but never found enough common issues). I've tested now 2.1.0-dev and 2.0.47-dev. Do you have the possibility to compile directly from CVS and check it with your data? Does debian use any specific patches? (Thom?)
I'll rebuild from cvs today.
I just grabbed 2.0.47-dev and built it. Still seeing the same problem.
hmm. what says `httpd -l`? Can you post your complete config (or mail me privately if you don't want to publish it)?
Nope, no additional patches for SSI for the Debian packages. I'll see if I can replicate with the current packages and also with the .47 tag next week.
In order to spare you the pain of wading through our rewrite-and-redirect- laden config file, I worked on replicating the bug from a vanilla apache install. I did a default install and made changes to the config file only to enable SSI. The bug did not manifest itself in this situation. So I began looking at the settings in our custom config and experimenting by making similar changes to the vanilla install. One thing that I failed to mention when I posted this bug is that our document root is an NFS volume. Thus in our config, as recommended, we turned off sendfile and mmap support. So I tried turning memory mapping off (EnableMMap off) but left sendfile support on and was then able to successfully replicate the erroneous SSI behavior. To further narrow the scope of this error, I tried the same config but switched the document root to a non-NFS volume and I still encountered the same problem. So it would appear that this error occurs only when memory mapping is turned off. Andre: I will be mailing you the config I have been working with in a few minutes.
Well, thanks again for your deep investigation. Received your config and will re-try to reproduce now ;-)
While debugging a similar problem, the following patch solved that problem for me: Index: modules/filters/mod_include.c =================================================================== RCS file: /home/cvs/httpd-2.0/modules/filters/mod_include.c,v retrieving revision 1.233 diff -u -r1.233 mod_include.c --- modules/filters/mod_include.c +++ modules/filters/mod_include.c @@ -477,7 +477,7 @@ if (len) { - pos = bndm(str, slen, c, len, ctx->start_seq_pat); + pos = bndm(str, slen, buf, len, ctx->start_seq_pat); if (pos != len) { ctx->head_start_bucket = dptr; Can you apply it and see if it changes something for you? Thanks.
I applied that patch but I'm still getting the same error.
Well. It seems, we've got it now. The following patch should solve the problem for you. Yet another test with your data would be appreciated :-) Index: modules/filters/mod_include.c =================================================================== RCS file: /home/cvs/httpd-2.0/modules/filters/mod_include.c,v retrieving revision 1.233 diff -u -r1.233 mod_include.c --- modules/filters/mod_include.c +++ modules/filters/mod_include.c @@ -429,7 +429,7 @@ } if (len == 0) { /* end of pipe? */ - break; + continue; } /* Set our buffer to use. */ @@ -600,7 +600,7 @@ } if (len == 0) { /* end of pipe? */ - break; + continue; } if (dptr == ctx->tag_start_bucket) { c = buf + ctx->tag_start_index;
argh! DO NOT USE the previous code. It produces a loop. Sorry. Here comes the right patch: Index: modules/filters/mod_include.c =================================================================== RCS file: /home/cvs/httpd-2.0/modules/filters/mod_include.c,v retrieving revision 1.233 diff -u -r1.233 mod_include.c --- modules/filters/mod_include.c +++ modules/filters/mod_include.c @@ -429,7 +429,8 @@ } if (len == 0) { /* end of pipe? */ - break; + dptr = APR_BUCKET_NEXT(dptr); + continue; } /* Set our buffer to use. */ @@ -600,7 +601,8 @@ } if (len == 0) { /* end of pipe? */ - break; + dptr = APR_BUCKET_NEXT(dptr); + continue; } if (dptr == ctx->tag_start_bucket) { c = buf + ctx->tag_start_index;
Just applied that patch and the problem appears to be solved! Thank you so much for your help. Do you think that this patch would be safe to apply to the 2.0.46 source (our production setup)? And if it's possible, could you give me a short description of the solution? It would be much appreciated. Thanks again!
Andre: you read my mind about the bucket_next() thing. I was just about to point that out. :) An alternative would be to do: if (len == 0) { /* end of pipe? */ - break; + apr_bucket *next_dptr = APR_BUCKET_NEXT(dptr); + apr_bucket_delete(dptr); + dptr = next_dptr; + continue; } OH! But hang on. There's a bug in both cases: if (dptr) becomes the sentinel we must not continue; we have to break. Otherwise we'll segfault the next time through the loop when we try to call apr_bucket_read() on the sentinel. That is, no doubt, why it was break; before -- a pipe bucket was probably always the last bucket in the brigade when we hit that case during the testing. The code was wrong, but it would have worked in that instance. Try this patch instead: Index: mod_include.c =================================================================== RCS file: /home/cvs/httpd-2.0/modules/filters/mod_include.c,v retrieving revision 1.233 diff -u -d -r1.233 mod_include.c --- mod_include.c 3 Feb 2003 17:53:01 -0000 1.233 +++ mod_include.c 11 Jul 2003 03:03:25 -0000 @@ -429,7 +429,13 @@ } if (len == 0) { /* end of pipe? */ - break; + apr_bucket *next_dptr = APR_BUCKET_NEXT(dptr); + apr_bucket_delete(dptr); + dptr = next_dptr; + if (dptr == APR_BRIGADE_SENTINEL(bb)) { + break; + } + continue; } /* Set our buffer to use. */ @@ -600,7 +606,13 @@ } if (len == 0) { /* end of pipe? */ - break; + apr_bucket *next_dptr = APR_BUCKET_NEXT(dptr); + apr_bucket_delete(dptr); + dptr = next_dptr; + if (dptr == APR_BRIGADE_SENTINEL(bb)) { + break; + } + continue; } if (dptr == ctx->tag_start_bucket) { c = buf + ctx->tag_start_index;
Sigh. It's always something. After hours of debugging, I figured out that deleting the bucket at this point turns out to be a bad idea because it gets deleted again later (or at least will once another mod_include patch we're working on gets committed). So, we compromise. The following version of the patch does what I'd expect it to do given the most rigorous and torturous test case I could contrive for mod_include. So assuming it still fixes your bug (which I can't tell from the attachments you gave whether it does or not for some reason), which it should, then I feel this patch is safe for production use. As soon as I get the confirmation from you, I'll commit it to httpd-2.1-dev and propose it for inclusion in 2.0.48. Index: mod_include.c =================================================================== RCS file: /home/cvs/httpd-2.0/modules/filters/mod_include.c,v retrieving revision 1.233 diff -u -d -r1.233 mod_include.c --- mod_include.c 3 Feb 2003 17:53:01 -0000 1.233 +++ mod_include.c 11 Jul 2003 09:15:58 -0000 @@ -429,7 +429,11 @@ } if (len == 0) { /* end of pipe? */ - break; + dptr = APR_BUCKET_NEXT(dptr); + if (dptr == APR_BRIGADE_SENTINEL(bb)) { + break; + } + continue; } /* Set our buffer to use. */ @@ -600,7 +604,11 @@ } if (len == 0) { /* end of pipe? */ - break; + dptr = APR_BUCKET_NEXT(dptr); + if (dptr == APR_BRIGADE_SENTINEL(bb)) { + break; + } + continue; } if (dptr == ctx->tag_start_bucket) { c = buf + ctx->tag_start_index;
Kevin: the actual error is that a bucket is splitted at the end (i.e. the '>' char at the 8000th byte), leaving a zero byte bucket around that made the len == 0 test success (and thus the matching fail) in the next loop. Cliff: The continue is just a goto to the while condition which breaks the loop already if the next bucket is the sentinel, isn't it?
So my last question would be: why does this bug only show itself if memory mapping is disabled?
Andre: you win. I guess I thought continue; went back to the top of the loop and thus would not catch the condition on a do{}while rather than a while(){}. But K&R tells me it will work your way. Good call. :) Kevin: the reason is that with mmaping enabled, the entire file is slurped in as one big mmap bucket. with mmap disabled, it actually calls apr_file_read() on the file and pulls the data into memory 8000 bytes at a time, one buffer per heap bucket. +1 to the version of the patch that said: if (len == 0) { /* end of pipe? */ - break; + dptr = APR_BUCKET_NEXT(dptr); + continue; } I believe it safe for production.
Thanks for the explanation. So just to be sure: I should be using the initial working patch from Andre for this bug?
Yep! That's the one we'll use.