While troubleshooting a crash that occurs while shutting down Aache 2.0.47 on Windows XP and Win 2000 I came up with a check that prevents the crash, though it doesn't by any means solve the bigger issue described in issue 20462. Since my change involves APR I thought I should create an issue for it showing the component as APR. My description of problem before seeing issue 20462: While apache is shutting down a write to a file fails because the mutex value has been cleaned up. It then tries to log this error to an error log file but the file handle has been invalidated by this time. The fix adds a simple check for a valid file handle before trying to call the function apr_thread_mutex_lock () which needs a valid file handle and mutex or it crashes. File: srclib/apr/file_io/win32/readwrite.c Function: apr_file_write Starting at line 307 I added this check and modifed existing function call to check the return code. if (thefile->filehand == (HANDLE)0 || thefile->filehand == (HANDLE)-1) { return (APR_OS_START_SYSERR + ERROR_INVALID_HANDLE); } /* apr_file_lock will mutex the file across processes. * The call to apr_thread_mutex_lock is added to avoid * a race condition between LockFile and WriteFile * that occasionally leads to deadlocked threads. */ rc = apr_thread_mutex_lock(thefile->mutex); if (rc != APR_SUCCESS) { return rc; } If you think this is useful then I'd be happy to help out in any way I can. David Blake dblake@hp.com
Created attachment 9629 [details] Patch file for apr_file_write() fix for bug 25575
I attached the fix in <patch> file format. I'm still reading and learning how to submit fixes and discuss issues here and hopefully have this right.
So apr_file_write() is being called by Apache after the file has been cleaned up? The normal apr philosophy would be to let it segfault and Apache should fix its bug (using an apr_file_t after it has been cleaned up). What's the path (backtrace) through Apache to get to the apr_file_write()? (or is there another bug filed for that issue?)
The problem I was addressing sounds exactly like what is described in issue 20462 with the behavior for me only happening on shutdown, never at startup. It was difficult to troubleshoot because since it happened during shutdown it was difficult to break into before the offending process had died. I had read the information about Apr and knew that it was preferred to not catch things like this in apr but I just wasn't able to troubleshoot the apache httpd code for this issue and hoped that catching the error in Apr and returning an error code might cause the issue in httpd to be more clear. I'll try to back out my patch and test again as soon as I can to see if I can get a stack trace.
Per Jeff's comments, this is a WONTFIX thing: using the file after it's cleaned up is not supposed to work. The last comment in bug 20462 claims that the cause is actually an OpenSSL bug, it would be good to get that corroborated. *** This bug has been marked as a duplicate of 20462 ***