Bug 26265

Summary: [emerg] (28)No space left on device: Couldn't create accept lock
Product: Apache httpd-2 Reporter: Kevin Hawkins <jedihawk>
Component: Core Assignee: Apache HTTPD Bugs Mailing List <bugs>
Status: CLOSED INVALID    
Severity: normal CC: atsharma9
Priority: P3    
Version: 2.0.48   
Target Milestone: ---   
Hardware: PC   
OS: Linux   

Description Kevin Hawkins 2004-01-20 00:03:30 UTC
I just compiled 2.0.48 with mod_proxy included.  I have two httpd's already
running, listening to different ports, and all config files test Syntax OK.  The
first two are running fine:

# cat /usr/local/apache2/logs/httpd.hawk.pid
4157
# ps ax | grep 'httpd' | grep 'hawk'
 4157 ?        S      0:00 bin/httpd -f conf/httpd.conf.hawk
 4158 ?        S      0:00 bin/httpd -f conf/httpd.conf.hawk
 4159 ?        S      0:00 bin/httpd -f conf/httpd.conf.hawk
 4160 ?        S      0:00 bin/httpd -f conf/httpd.conf.hawk
 4191 ?        S      0:00 bin/httpd -f conf/httpd.conf.hawk
 4192 ?        S      0:00 bin/httpd -f conf/httpd.conf.hawk

# cat /usr/local/apache2/logs/httpd.immuno.pid 
4143
# ps ax | grep 'httpd' | grep 'immuno'
 4143 ?        S      0:00 bin/httpd -f conf/httpd.conf.immuno
 4144 ?        S      0:00 bin/httpd -f conf/httpd.conf.immuno
 4249 ?        S      0:00 bin/httpd -f conf/httpd.conf.immuno

Now trying to run a third so that I can proxypass to it from the first one.
When I run httpd with my custom config file, I do not get any errors on the
command line, but this shows up in the error log:

[Mon Jan 19 15:19:22 2004] [emerg] (28)No space left on device: Couldn't create
accept lock

Apache did create the lock file (which is specified in the config file and is on
a local filesystem), and it's exactly the same as the other two:

# dir /usr/local/apache2/logs/
total 24
drwxr-sr-x    2 root     root         4096 Jan 19 15:19 ./
drwxr-sr-x   15 root     root         4096 Jan 19 14:00 ../
-rw-r--r--    1 root     root            5 Jan 19 14:59 httpd.hawk.pid
-rw-r--r--    1 root     root            5 Jan 19 14:58 httpd.immuno.pid
-rw-r--r--    1 root     root            5 Jan 19 15:19 httpd.wss.pid

But this instance of Apache is not running:

# cat /usr/local/apache2/logs/httpd.wss.pid    
4396
# ps ax | grep 'httpd' | grep 'wss'

This problem may be in the mod_proxy component; I'm no expert in this.

I've plenty of space in this filesystem:
# df -h .
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda10             37G   27G  9.0G  75% /

My kernel:
# uname -a
Linux badass.jedihawk.net 2.4.16 #1 SMP Tue Feb 19 09:53:17 PST 2002 i686 unknown
# uptime
  3:51pm  up 477 days,  6:39, 34 users,  load average: 0.14, 0.12, 0.21

Hope this helps.
Please email me if I've left out anything: jedihawk@mail.com
Comment 1 André Malo 2004-01-28 18:53:15 UTC
The default accept mutex mechanism on Linux is sysvsem. It seems you're running
out of semaphores.
You should consider using another mechanism; see the AcceptMutex directive.

I'm marking it invalid for now. Feel free to reopen the report if there are
further issues.
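
For anyone wanting to check this diagnosis: the "(28)" in the log is the errno value, and on Linux semget() fails with ENOSPC when creating a new set would exceed the system-wide semaphore limits (SEMMNI or SEMMNS), which is unrelated to disk space. The following is a minimal sketch, assuming a Linux host that exposes those limits in /proc/sys/kernel/sem. The function names are hypothetical and chosen only for illustration.

```python
import errno
import os

# Message the C library associates with ENOSPC (errno 28 on Linux).
ENOSPC_MESSAGE = os.strerror(errno.ENOSPC)


def parse_sem_limits(text):
    """Parse the four fields of /proc/sys/kernel/sem.

    The kernel lists them in the order SEMMSL, SEMMNS, SEMOPM, SEMMNI.
    """
    semmsl, semmns, semopm, semmni = (int(field) for field in text.split())
    return {
        "SEMMSL": semmsl,  # max semaphores per set
        "SEMMNS": semmns,  # max semaphores system-wide
        "SEMOPM": semopm,  # max operations per semop() call
        "SEMMNI": semmni,  # max semaphore sets system-wide
    }


def read_sem_limits(path="/proc/sys/kernel/sem"):
    """Read and parse the live limits (Linux only; path is an assumption)."""
    with open(path) as handle:
        return parse_sem_limits(handle.read())
```

Comparing SEMMNI with the number of sets listed by `ipcs -s` should show whether the limit has been reached.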
Comment 2 Martin Mokrejs 2005-07-03 03:15:01 UTC
I just had a similar problem on Gentoo Linux. I had been facing weird problems
with mod_python over the last few weeks: from time to time mod_python would run
in 'restricted mode'. Eventually I realized that the problem appears,
fortunately reproducibly, with some tiny test code I had. The code was in Python
and used the cElementTree module.

The error generated by python was:

IOError: file() constructor not accessible in restricted mode 

I've emailed Fredrik Lundh, and for the sake of the archive I'm attaching what
I wrote. I was using apache-2.0.54-r7 (and 2.0.54-r11), mod_python-3.1.3-r1 (and
3.1.4), celementtree-1.0 (and celementtree-1.0.2), elementtree-1.2:

Hi Fredrik,
 the problem is definitely with cElementTree. When I use
ElementTree instead, the problem goes away. It manifests as
apache holding lots of semaphores, and I believe that is why, using
strace, I saw one of the apache processes not willing to exit on apache
shutdown, doing this:

Process 6424 attached - interrupt to quit
select(0, NULL, NULL, NULL, {0, 74000}) = 0 (Timeout)
waitpid(-1, 0xbffffb30, WNOHANG|WSTOPPED) = 0
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
waitpid(-1, 0xbffffb30, WNOHANG|WSTOPPED) = 0
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)
waitpid(-1, 0xbffffb30, WNOHANG|WSTOPPED) = 0
select(0, NULL, NULL, NULL, {1, 0})     = 0 (Timeout)


I recompiled apache to make sure the problem is not in there
and turned on debug log, but I couldn't restart apache any longer.
Actually, I could not shut it down. I killed the threads with -9
but I couldn't start it either, and found these errors:

[Sun Jul 03 00:47:09 2005] [info] mod_unique_id: using ip addr 195.113.57.20
[Sun Jul 03 00:47:10 2005] [info] Init: Initializing OpenSSL library
[Sun Jul 03 00:47:10 2005] [info] Init: Seeding PRNG with 512 bytes of entropy
[Sun Jul 03 00:47:10 2005] [info] Init: Generating temporary RSA private keys
(512/1024 bits)
[Sun Jul 03 00:47:10 2005] [info] Init: Generating temporary DH parameters
(512/1024 bits)
[Sun Jul 03 00:47:10 2005] [info] Init: Initializing (virtual) servers for SSL
[Sun Jul 03 00:47:10 2005] [info] Server: Apache/2.0.54, Interface:
mod_ssl/2.0.54, Library: OpenSSL/0.9.7g
[Sun Jul 03 00:47:10 2005] [info] mod_unique_id: using ip addr 195.113.57.20
[Sun Jul 03 00:47:11 2005] [notice] mod_python: Creating 32 session mutexes
based on 16 max processes and 2 max threads.
[Sun Jul 03 00:47:11 2005] [error] (28)No space left on device: mod_python:
Failed to create global mutex 0 of 32 (/tmp/mpmtx293580).
Configuration Failed

The machine had gigabytes of free disk space, so that wasn't the problem.

Luckily I found this page http://www2.goldfisch.at/knowledge/224
so I managed to remove the leftover semaphores without rebooting the
machine.
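
The linked page is not reproduced here, so the following is only a sketch of the general approach, not necessarily the procedure it describes. It assumes the usual Linux `ipcs -s` table layout (columns key, semid, owner, perms, nsems) and uses a hypothetical helper name. It collects the IDs of semaphore sets owned by a given user so they can be reviewed before removal.

```python
def semaphore_ids_owned_by(ipcs_output, owner):
    """Return the semid of every semaphore set in `ipcs -s` output owned by `owner`."""
    ids = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Data rows have five columns and a hexadecimal key in the first one;
        # the banner and column-header lines do not match this shape.
        if len(fields) == 5 and fields[0].startswith("0x") and fields[2] == owner:
            ids.append(int(fields[1]))
    return ids
```

Each returned ID could then be passed to `ipcrm` (for example `ipcrm -s <semid>` on current util-linux; older versions use `ipcrm sem <semid>`) once apache is fully stopped. The owner name to filter on, such as "apache", depends on the User directive in the local configuration.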

I could start apache again and replaced all cElementTree uses with
ElementTree in my program. And then it was clear. Just to be sure,
I compiled and installed cElementTree-1.0.2, but that did not help,
nor did downgrading back to cElementTree-1.0. Based on that "goldfisch"
webpage I think the problem might be some memory leak that keeps apache,
the parent process, from exiting? Is this scenario possible?
Martin