Bug 9673 - Conditional GET requests not handled properly with filtered content
Status: CLOSED FIXED
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: Core
Version: 2.0.43
Hardware: All All
Importance: P1 major with 3 votes
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
Reported: 2002-06-06 19:22 UTC by Christian Kohlsch
Modified: 2004-11-16 19:05 UTC

Description Christian Kohlsch 2002-06-06 19:22:08 UTC
Apache2 does not take into account that PHP scripts configured via
SetInputFilter/SetOutputFilter PHP produce dynamic content; it still checks
the modification date of the plain .php file.

This leads to caching problems when the user's browser sends an
"If-Modified-Since" request header. Apache returns "304 Not Modified" because
the PHP file itself has not changed, although the dynamic content may have
changed!
 
The problem is in modules/http/http_protocol.c, line 386+ 
Hotfix: I have commented out line 393 (return HTTP_NOT_MODIFIED), so that this 
feature is totally disabled. 
 
Is there a chance to check for an Input/OutputFilter set? That way, we could 
deactivate the feature for PHP pages only.
Comment 1 Joshua Slive 2002-06-06 19:29:54 UTC
I'm fairly sure that it is the filter itself that should be removing the
last-modified.  There are many filters that do not affect caching, so
apache should not be removing last-modified for every filter.

So, the punch line is, you should file this bug report with PHP.
Comment 2 Christian Kohlsch 2002-06-06 20:50:26 UTC
The problem is that the PHP filter is not called. Here is a dump of the 
client-server communication: 
 
---------> from Client to Server 
GET / HTTP/1.1 
Accept: */* 
If-Modified-Since: Thu, 06 Jun 2002 17:47:44 GMT; length=17947 
Host: www.newsclub.de 
Connection: Keep-Alive 
 
<-------- Reply from Server to Client 
 
HTTP/1.1 304 Not Modified 
Date: Thu, 06 Jun 2002 17:50:08 GMT 
Server: Apache/2.0.36 (Unix) DAV/2 PHP/4.2.1 
Connection: Keep-Alive 
Keep-Alive: timeout=15, max=100 
ETag: "66ab2-4ef-63a14340" 
 
--------- End of reply, that's it - no content! 
 
So there is neither PHP output nor the PHP source code at all. Apache seems
to block further processing of the request.
 
Comment 3 Joshua Slive 2002-06-06 20:56:09 UTC
Still a PHP problem.

Content is never returned on a 304 response.  That is the point of the response.
What is happening is that Apache is running the request, seeing the 
last-modified information, and deciding that the client already has
up-to-date content, so it doesn't need to send it again.

It is the responsibility of the PHP filter to
remove the last-modified information so that apache always serves
fresh content.
Comment 4 Joshua Slive 2002-06-06 21:04:05 UTC
Oops.  I take it all back.

I just tried with mod_include's INCLUDES filter, and even though it strips
Last-Modified, Apache still answers conditional GET requests with a 304.  Ouch.

Here's an example:
ab26[joshua]59% telnet httpd.apache.org 80
Trying 63.251.56.142...
Connected to httpd.apache.org.
Escape character is '^]'.
GET /docs/vhosts/index.html HTTP/1.1
Host: httpd.apache.org
If-Modified-Since: Thu, 06 Jun 2002 20:58:16 GMT

HTTP/1.1 304 Not Modified
Date: Thu, 06 Jun 2002 21:01:29 GMT
Server: Apache/2.0.37-dev (Unix)
ETag: "ea72a-b3f-f3b22680;2434f440"
Content-Location: index.html.en
Vary: negotiate,accept-language,accept-charset

Comment 5 Joshua Slive 2002-06-06 21:12:42 UTC
And what the heck is that ETag doing in the 304 response?  It is not in
there for a normal response:
HEAD /docs/vhosts/index.html HTTP/1.1
Host: httpd.apache.org

HTTP/1.1 200 OK
Date: Thu, 06 Jun 2002 21:09:12 GMT
Server: Apache/2.0.37-dev (Unix)
Content-Location: index.html.en
Vary: negotiate,accept-language,accept-charset
TCN: choice
Accept-Ranges: bytes
Content-Length: 3158
Content-Type: text/html
Content-Language: en

And what the heck is the Content-Length doing in the normal response?

Both Content-Length and ETags are explicitly unset by mod_include.

I'm afraid I'm out of my league here, so I'll need to leave this to others.
Comment 6 Cliff Woolley 2002-06-06 21:13:58 UTC
The If-Modified-Since and If-Unmodified-Since logic uses r->mtime as the time of the entity being served, not the Last-Modified header.  I assume this is an optimization to avoid reparsing Last-Modified.  So either stripping Last-Modified is not enough (also set r->mtime to 0), or we need to change the logic on line 316 of http_protocol.c.  Note though that this is also how Apache 1.3 deals with it, so I'm not sure I see what the problem is.  Sounds like a bug (misunderstanding?) in mod_include and mod_php4.

--Cliff
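
The behaviour described in this comment can be modelled in a few lines of C. This is a minimal sketch, not the code in http_protocol.c: `struct sketch_request`, `check_if_modified_since()` and the `SKETCH_*` macros are hypothetical names, times are plain `time_t` values rather than parsed HTTP dates, and the sketch assumes that an mtime of 0 means "unknown" and therefore never produces a 304.

```c
#include <time.h>

#define SKETCH_OK            200
#define SKETCH_NOT_MODIFIED  304

/* Hypothetical, stripped-down stand-in for the request structure. */
struct sketch_request {
    time_t mtime;              /* mtime of the file on disk; 0 = unknown (assumption) */
    time_t request_time;       /* when the request arrived */
    time_t if_modified_since;  /* parsed If-Modified-Since; 0 = header absent */
};

/*
 * The decision is driven by the stored mtime, not by whatever
 * Last-Modified header a filter may have removed from the response.
 */
static int check_if_modified_since(const struct sketch_request *r)
{
    /* No basis for comparison: serve the full response. */
    if (r->mtime == 0 || r->if_modified_since == 0) {
        return SKETCH_OK;
    }

    /* The client's copy is at least as new as the file, and the
     * date it sent is not in the future: answer 304. */
    if (r->if_modified_since >= r->mtime
        && r->if_modified_since <= r->request_time) {
        return SKETCH_NOT_MODIFIED;
    }

    return SKETCH_OK;
}
```

In this model, a script file with an old mtime yields a 304 no matter what the script would have printed; only resetting the mtime (or changing where the check happens) lets the request through to the filter.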
Comment 7 Cliff Woolley 2002-06-06 21:18:10 UTC
The Content-Length is probably being generated by the content length filter 
(that's its job).  Not sure about the etag, but I do remember some discussion 
about that recently I think... 
Comment 8 Christian Kohlsch 2002-06-06 21:25:15 UTC
As I understand it, the problem is the filter concept itself.
Filters are applied to data that has already been processed by http_protocol.c.

What http_protocol.c checks is the modification time of a file, e.g.
"index.php". The script may have been written in 2001, for example, but it
produces new content from a database every minute.

So the PHP code only gets executed if Apache decides to pass the data to the
filter. Here Apache decides not to, because it thinks that the file has not
changed.

The ETag is probably generated because Apache looks at the program code of
index.php itself, not at the content the PHP script produces.

I suggest adding an Apache directive to disable If-Modified-Since processing
for filtered files, for example:
   
<FilesMatch "\.php$">   
    SetInputFilter PHP   
    SetOutputFilter PHP   
    CheckIfModifiedSince off   
</FilesMatch>   
   
PHP itself could then automatically set this directive.   
  
Comment 9 Joshua Slive 2002-06-06 21:28:49 UTC
I disagree.  It should be the filter's responsibility to decide if it
modifies the data, not the administrator.  If the filter wants to
delegate that job to the administrator (like mod_include does with
XBitHack full) then it can do so.
Comment 10 Christian Kohlsch 2002-06-06 21:35:38 UTC
But somehow, http_protocol.c must get that information. 
 
Is there a way to check for a certain filter in http_protocol.c? 
That would enable me to write a quick hack for PHP. 
 
Currently, I have disabled "not modified" replies altogether, which is
not a good idea...
 
Comment 11 Justin Erenkrantz 2002-06-06 21:45:21 UTC
This is due to a bug in PHP and in the httpd-2.0 core.

default_handler shouldn't be calling ap_meets_conditions().  As I posted to
dev@httpd, this decision should most likely be delayed until the
ap_http_header_filter() is called.

However, PHP's use of filters is completely and totally broken in many ways.  =)
First off, it doesn't unset Last-Modified, which it needs to do.  There's a lot
of bogosity in how it deals with buckets.  I've posted before to php-dev@ about
this.
Comment 12 Christian Kohlsch 2002-06-06 21:58:25 UTC
Can someone give me a hint how to write a quick fix for it? 
I would just need to check if PHP is in the filter chain. 
 
Comment 13 Cliff Woolley 2002-06-06 22:02:04 UTC
That's easier said than done.  Easier is the seemingly "right" fix that Justin 
proposed: move the ap_meets_conditions() call from the default_handler to the 
ap_http_header_filter. 
 
--Cliff  
Comment 14 Daniel Eckl 2002-07-20 15:14:56 UTC
I'd like to post a workaround without patching apache or PHP....

Just edit your script(s) to send
'header("Last-Modified: Mon, 26 Jul 1997 05:00:00 GMT");' or some other
date older than the modification date of your script file. This solves
the problem.

Reason:
The bug causes Apache2 to look at the modification date of the .php file
to determine whether the content has been modified.
If the browser first gets a header like the one above, the next time it
asks for the page it sends
'If-Modified-Since: Mon, 26 Jul 1997 05:00:00 GMT'. The httpd then looks
at the modification date of your script, which is always newer, and
says: Yes, it has been modified, "200 OK". The script is served and
responds again with the header line from above. Round and round
the story goes. :))

Greets, and have fun!

Daniel
Comment 15 Christian Kohlsch 2002-07-21 11:34:20 UTC
Daniel: Your workaround solves the problem, but creates another one.
Search engines that spider the pages would see 'old' Last-Modified headers
and therefore will not index them.

At NewsClub.de, the pages are updated at least every 30 minutes, so this
fix does not work for me.
 
Maybe the bug is even fixed in the current CVS versions of apache2 and php4? 
 
Comment 16 Daniel Eckl 2002-07-23 13:14:16 UTC
Hmm, the current changelog contains:

  *) Add a filter_init parameter to the filter registration functions
     so that a filter can execute arbitrary code before the handlers
     are invoked.  This resolves a problem where mod_include requests
     would incorrectly return a 304.  [Justin Erenkrantz]

Is this the solution to our problem?
I'm compiling httpd-2.0_20020723101312 at the moment and will try it.
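
The changelog entry describes an ordering change: a filter can supply an init callback that runs before the handler makes its conditional-GET decision. The following is a minimal sketch of that ordering only. It is a model, not httpd's API: `struct init_request`, `struct init_filter`, `dynamic_filter_init()`, `handle_request()` and the `content_is_dynamic` flag are all hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <time.h>

#define STATUS_OK            200
#define STATUS_NOT_MODIFIED  304

/* Hypothetical stand-ins; not httpd's real request or filter types. */
struct init_request {
    time_t mtime;               /* mtime of the file on disk */
    time_t if_modified_since;   /* parsed If-Modified-Since; 0 = header absent */
    int    content_is_dynamic;  /* set by a filter's init callback */
};

struct init_filter {
    const char *name;
    void (*init)(struct init_request *r);  /* may be NULL */
};

/* Init callback of a filter that rewrites the response body. */
static void dynamic_filter_init(struct init_request *r)
{
    r->content_is_dynamic = 1;
}

static int handle_request(struct init_request *r,
                          const struct init_filter *filters,
                          size_t nfilters)
{
    size_t i;

    /* Step 1: every filter's init callback runs before the handler. */
    for (i = 0; i < nfilters; i++) {
        if (filters[i].init != NULL) {
            filters[i].init(r);
        }
    }

    /* Step 2: the handler's conditional check sees what the filters
     * declared and skips the 304 shortcut for dynamic content. */
    if (!r->content_is_dynamic
        && r->if_modified_since != 0
        && r->if_modified_since >= r->mtime) {
        return STATUS_NOT_MODIFIED;
    }

    return STATUS_OK;
}
```

With a filter that has no init callback, a request whose If-Modified-Since date is newer than the file's mtime still gets a 304 in this model; with `dynamic_filter_init` in the chain, the same request gets a full 200 response.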
Comment 17 Daniel Eckl 2002-07-24 05:20:26 UTC
I'm sorry, problem not solved yet using the latest unstable cvs code of both 
php4 and httpd-2.0.

...but they are compiling and running fine :))))
Comment 18 Justin Erenkrantz 2002-09-29 16:30:34 UTC
This should be fixed in Apache 2.0.42 and later.

It is unknown when the exact fix was committed (probably much earlier than
2.0.42), but the behavior works as expected for filtered content now.
Comment 19 Erlend Stromsvik 2002-12-20 10:02:54 UTC
I'm still having this bug, even with the 'short fix' Daniel Eckl posted. :-/

Running Apache 2.0.43 for Windows with php-4.2.3-Win32.
Comment 20 Sascha Kulawik 2002-12-20 10:12:09 UTC
Try PHP 4.3.0 BETA, it seems to be working now.
Comment 21 stephen fox 2002-12-23 13:23:08 UTC
This bug has been closed but appears to still persist.  I'm running Apache 
2.0.43 with PHP 4.3 and I still have this issue.  Has it been decided whether 
this is legitimately a bug in Apache or PHP yet?  I have heard it thrown around 
as a PHP bug by Apache and an Apache bug by PHP and was wondering if either 
side has given in and said where the bug really is.

Thank you,
Steve
Comment 22 Daniel Eckl 2002-12-25 23:50:05 UTC
The problem can be worked around in php.

See http://bugs.php.net/bug.php?id=17098

Daniel
Comment 23 Justin Erenkrantz 2003-02-16 23:48:59 UTC
Yes, the filter_init hook resolved this.  See how mod_include and mod_php (in their CVS tree) do it.  It should be included in the next PHP release if it isn't already, but there isn't anything more we can do with this issue.  The API is there.

Thanks for using Apache HTTP Server!