Bug 3532 - Number of spamd processes grows without any control and system hangs
Summary: Number of spamd processes grows without any control and system hangs
Status: RESOLVED WONTFIX
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: spamc/spamd
Version: 2.63
Hardware: HP Linux
Importance: P3 blocker
Target Milestone: 3.1.0
Assignee: SpamAssassin Developer Mailing List
Reported: 2004-06-22 09:46 UTC by G.Perricone
Modified: 2004-07-05 11:26 UTC
Description G.Perricone 2004-06-22 09:46:04 UTC
The system is an Intel(R) Pentium(R) 4 CPU 3.06GHz with 626.63 MB of memory and
2.0 GB of swap, running Red Hat Enterprise Linux ES release 2.1 and qmail with
concurrencyincoming set to 5, spamassassin-2.63-1, clamav-0.70, and
qmail-scanner-1.22.

During normal operation there are only one or two spamd processes, see below:
------------------------- 
  6:28pm  up 12 days,  8:48,  1 user,  load average: 0,35, 0,14, 0,10
73 processes: 71 sleeping, 2 running, 0 zombie, 0 stopped
CPU states:  4,0% user,  0,4% system,  0,0% nice, 95,6% idle
Mem:   641664K av,  547336K used,   94328K free,      56K shrd,  331816K buff
Swap: 2096440K av,   12984K used, 2083456K free                   45824K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
21290 spamd     15   0 89128  86M  2108 S     0,0 13,8   0:04 spamd
21267 qscand    21   0 13356  13M   784 S     0,0  2,0   0:00 clamd
---------------------

But sometimes, without any warning, the number of spamd processes grows out of
control to 10 or more, RAM and swap fill up, and the system hangs.

The situation returns to normal after spamd is restarted.

Below: 
- /var/log/messages
Jun 22 01:34:52 smtp kernel: Out of Memory: Killed process 19481 (spamd)

- /var/log/maillog
Jun 22 08:20:06 smtp X-Qmail-Scanner-1.22: [smtp.comune.reggio-calabria.it108788503347920447] corrupt or unknown Fsecure scanner error or memory/resource/perms problem - exit status 1
Jun 22 08:20:09 smtp spamd[9530]: server killed by SIGTERM, shutting down
Jun 22 08:20:33 smtp spamd[19713]: logmsg: server killed by SIGTERM, shutting down

- spamassassin options
SPAMDOPTIONS="-x -d -u spamd -H /home/spamd -D"
Comment 1 Matt Kettler 2004-06-22 10:29:56 UTC
If you need to limit the number of processes, look at using the -m parameter.
Without it, you'll get exactly the behavior you describe.

If that does not fix your problem, please follow up. Otherwise I'd suggest the
devs mark the bug INVALID due to misconfiguration.
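
For illustration only, the reporter's options line with a child limit added might look like the sketch below. The value 5 is an arbitrary example, not a setting recommended anywhere in this report, and the line goes wherever SPAMDOPTIONS is already defined on the system.

```shell
# Sketch: the spamd options from the description, plus a limit on children.
# "-m 5" is an illustrative value (assumption), not a tested recommendation.
SPAMDOPTIONS="-x -d -u spamd -H /home/spamd -D -m 5"
```

spamd has to be restarted for a changed options line to take effect.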
Comment 2 Malte S. Stretz 2004-06-22 11:15:19 UTC
This will change in 3.0 anyway, as you'll have a couple of pre-forked
children instead of forks on demand. There will also be a default for -m; IIRC
it is 5.
Comment 3 G.Perricone 2004-06-23 08:33:25 UTC
I have already tried to limit the number of processes with the -m parameter,
without any result.
Comment 4 Theo Van Dinter 2004-06-23 09:07:48 UTC
As has been stated, this is done completely differently in 3.0, so closing as WFM.
Comment 5 Carlo Wood 2004-07-01 11:01:21 UTC
You can reopen this bug - it is *definitely* broken in 2.63
and 3.0 is not released yet!

My machine completely halts SEVERAL times a day because it
runs out of memory - this is a very very serious bug.
I'd appreciate a fix and a release of 2.64.

$ ps --forest aux
[...]
nobody    9380  0.2  3.3 29380 4292 ?        S    16:50   0:13 /usr/bin/spamd --daemonize --max-children 8 --username=nobody
nobody   13552  0.0 10.0 30256 12748 ?       D    18:15   0:05  \_ /usr/bin/spamd --daemonize --max-children 8 --username=nob
nobody   13556  0.0  9.8 30256 12504 ?       D    18:15   0:05  \_ /usr/bin/spamd --daemonize --max-children 8 --username=nob
nobody   13580  0.0  9.5 30252 12156 ?       D    18:15   0:05  \_ /usr/bin/spamd --daemonize --max-children 8 --username=nob
nobody   13588  0.0  9.0 30252 11472 ?       D    18:15   0:05  \_ /usr/bin/spamd --daemonize --max-children 8 --username=nob
nobody   13592  0.0  9.4 30252 11964 ?       D    18:15   0:05  \_ /usr/bin/spamd --daemonize --max-children 8 --username=nob
nobody   13623  0.0  6.6 30804 8488 ?        D    18:16   0:03  \_ /usr/bin/spamd --daemonize --max-children 8 --username=nob
nobody   13635  0.0  7.1 30804 9084 ?        D    18:16   0:03  \_ /usr/bin/spamd --daemonize --max-children 8 --username=nob
nobody   13672  0.0  6.6 29736 8484 ?        D    18:16   0:00  \_ /usr/bin/spamd --daemonize --max-children 8 --username=nob
nobody   13678  0.0  6.3 29736 8064 ?        D    18:16   0:00  \_ /usr/bin/spamd --daemonize --max-children 8 --username=nob
nobody   13763  0.0  5.8 29512 7424 ?        D    18:17   0:00  \_ /usr/bin/spamd --daemonize --max-children 8 --username=nob
nobody   13766  0.0  5.8 29512 7448 ?        D    18:17   0:00  \_ /usr/bin/spamd --daemonize --max-children 8 --username=nob
nobody   13769  0.0  5.9 29512 7500 ?        D    18:17   0:00  \_ /usr/bin/spamd --daemonize --max-children 8 --username=nob
nobody   13773  0.0  6.0 29512 7648 ?        D    18:17   0:00  \_ /usr/bin/spamd --daemonize --max-children 8 --username=nob
etc. etc. (up to 36 processes, until my machine can no longer even open
a new socket).

That is significantly more than 8.
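
A rough way to check for this condition automatically is sketched below. It assumes the processes show up in ps under the bare command name "spamd", as in the top output in the description; count_spamd and MAX_CHILDREN are hypothetical names used only for this example.

```shell
#!/bin/sh
# Sketch: warn when more spamd processes exist than --max-children allows.
MAX_CHILDREN=8   # the --max-children value from the ps listing above

# Count lines on stdin whose first field is exactly "spamd".
count_spamd() {
    awk '$1 == "spamd" { n++ } END { print n + 0 }'
}

# "ps -e -o comm=" prints one bare command name per process, no header.
running=$(ps -e -o comm= | count_spamd)

# The parent daemon is itself one process, so allow MAX_CHILDREN + 1 in total.
if [ "$running" -gt $((MAX_CHILDREN + 1)) ]; then
    echo "spamd: $running processes, expected at most $((MAX_CHILDREN + 1))" >&2
fi
```

Run from cron or by hand, this only reports the overshoot; it does not kill or restart anything.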

Comment 6 Theo Van Dinter 2004-07-01 11:30:12 UTC
I'm sorry, but there are no future 2.6x releases planned.  All of our efforts have been going into getting 
a 3.0 release finished.
Comment 7 Tom Schulz 2004-07-05 19:26:00 UTC
There is something that you can do to help. Shut down the automatic Bayes
rebuild and expire, and then do these with sa-learn from a cron job. To do
that, put the following in your local.cf file:
bayes_auto_expire 0
bayes_journal_max_size 0
and then run 'sa-learn --rebuild' from cron (we do it every 10 minutes) and run
'sa-learn --force-expire' from cron (we do it once a day, off hours).  What I
think is happening (my analysis, I am not a developer) is that after spamd is
done with a message, it may decide to work on the Bayes database.  If it does
that, it is no longer counted by the -m parameter.  Also, if another spamd
is run before the first one finishes, it too will decide to work on the Bayes
database, and now you start to stack them up.
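
The cron side of this could be wired up roughly as in the sketch below. The script path, the function name bayes_maint and the 03:30 expiry time are assumptions made for the example; only the two sa-learn commands and the "every 10 minutes" and "once a day" schedule come from the text above.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. saved as /usr/local/bin/bayes-maint (assumed path).
# Example crontab entries that would call it (times are illustrative):
#   */10 * * * *  /usr/local/bin/bayes-maint rebuild
#   30 3 * * *    /usr/local/bin/bayes-maint expire

bayes_maint() {
    case "$1" in
        rebuild) sa-learn --rebuild ;;        # the rebuild step named above
        expire)  sa-learn --force-expire ;;   # the expire step named above
        *)       echo "usage: bayes-maint rebuild|expire" >&2; return 2 ;;
    esac
}

# Only act when called with an argument, as cron would do.
if [ $# -gt 0 ]; then
    bayes_maint "$@"
fi
```

The jobs should run as the same user that owns the Bayes database, otherwise sa-learn may work on a different database than spamd does.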