Bug 5149 - not recovering from "prefork: select returned -1!"
Status: RESOLVED DUPLICATE of bug 5313
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: spamc/spamd
Version: 3.1.4
Hardware: All Linux
Importance: P5 normal
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-10-24 17:10 UTC by Kai Bolay
Modified: 2007-03-05 12:18 UTC
CC List: 2 users

Attachment Type Modified Status Actions Submitter/CLA Status
spamd logfile text/plain None Kai Bolay [NoCLA]

Description Kai Bolay 2006-10-24 17:10:33 UTC
Under heavy load, spamd goes into a tailspin every couple of days and enters an
endless loop reporting

warn: prefork: select returned -1! recovering: Bad file descriptor

I'm running spamd on eight machines: two under moderate load, three under heavy
load, and three under light load. All run the same version of SpamAssassin (a
slightly outdated version 3.1.4 running on Perl version 5.8.5) and the same
operating system (CentOS release 4.4 with kernel 2.6.9-11.ELsmp).

The machines under heavy load experience this problem every few days and the
machines under moderate load only every couple of weeks, while the machines
under light load haven't experienced it (yet?).

I'm running a monitoring script which checks the logfile every five minutes and
restarts spamd whenever "warn: prefork: select returned -1! recovering: Bad file
descriptor" shows up more than ten times in a row. That band-aid works OK, but
I'm wondering about the "recovering" in the warning message. Maybe spamd
shouldn't try to recover the way it does currently (it obviously fails), but
instead take more drastic measures (like restarting/reinitializing) itself.
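
The monitoring check described above could look roughly like the following
Python sketch. It is not the actual script used here: the function names are
made up for illustration, and the log path and restart command are
site-specific assumptions that the caller has to supply. It counts how many
times the warning appears in a row at the end of the logfile and restarts only
when that run exceeds ten.

```python
import subprocess

# The warning text is taken verbatim from the spamd log.
MARKER = "prefork: select returned -1! recovering: Bad file descriptor"
THRESHOLD = 10  # restart when the warning repeats more than this many times


def trailing_run(lines, marker=MARKER):
    """Count consecutive lines at the end of the log that contain marker."""
    count = 0
    for line in reversed(lines):
        if marker not in line:
            break
        count += 1
    return count


def needs_restart(lines, threshold=THRESHOLD):
    """True if the log currently ends in more than threshold warnings."""
    return trailing_run(lines) > threshold


def check_and_restart(log_path, restart_cmd):
    """Read the log and run restart_cmd if spamd looks stuck.

    log_path and restart_cmd are assumptions supplied by the caller,
    for example a syslog file and an init-script invocation.
    """
    with open(log_path, errors="replace") as fh:
        lines = fh.readlines()
    if needs_restart(lines):
        subprocess.run(restart_cmd, check=True)
        return True
    return False
```

Only the run at the end of the log is counted, so an old burst of warnings
from before the last restart does not trigger another restart once spamd has
logged anything else. A cron entry would call check_and_restart() every five
minutes with the local logfile and restart command.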

PS: I have the feeling this is related to bug 4590...
Comment 1 Kai Bolay 2006-10-24 17:16:54 UTC
Created attachment 3727
spamd logfile

this log excerpt (more context in attachment) shows how the problem develops:

[...]
info: prefork: child states: BBIIBIBIBBIBI
info: spamd: handled cleanup of child pid 19135 due to SIGCHLD
warn: Use of uninitialized value in numeric eq (==) at SpamdForkScaling.pm line
662.
warn: Use of uninitialized value in numeric eq (==) at SpamdForkScaling.pm line
662.
warn: Use of uninitialized value in numeric eq (==) at SpamdForkScaling.pm line
662.
warn: Use of uninitialized value in numeric eq (==) at SpamdForkScaling.pm line
662.
warn: Use of uninitialized value in numeric eq (==) at SpamdForkScaling.pm line
662.
info: prefork: child states: BBIIBIBIBBIBS
[...]
warn: prefork: retrying syswrite(): Resource temporarily unavailable at
SpamdForkScaling.pm line 623.
warn: prefork: syswrite(8) to 745 failed on try 2 at SpamdForkScaling.pm line
600.
[...]
warn: prefork: retrying syswrite(): Resource temporarily unavailable at
SpamdForkScaling.pm line 623.
warn: prefork: syswrite(8) to 745 failed on try 3 at SpamdForkScaling.pm line
600.
[...]
warn: prefork: retrying syswrite(): Resource temporarily unavailable at
SpamdForkScaling.pm line 623.
warn: prefork: syswrite(8) to 745 failed on try 4 at SpamdForkScaling.pm line
600.
warn: prefork: giving up at SpamdForkScaling.pm line 602.
warn: prefork: write of ping failed to 745 fd=8: Resource temporarily
unavailable at SpamdForkScaling.pm line 333.
warn: prefork: killing failed child 745 fd=8 at SpamdForkScaling.pm line 127.
warn: prefork: killed child 745 at SpamdForkScaling.pm line 141.
warn: prefork: select returned -1! recovering: Bad file descriptor
info: spamd: handled cleanup of child pid 745 due to SIGCHLD
warn: prefork: select returned -1! recovering: Bad file descriptor
[...]
warn: prefork: select returned -1! recovering: Bad file descriptor
[...]
warn: prefork: select returned -1! recovering: Bad file descriptor
warn: prefork: select returned -1! recovering: Bad file descriptor
warn: prefork: select returned -1! recovering: Bad file descriptor
warn: prefork: select returned -1! recovering: Bad file descriptor
warn: prefork: select returned -1! recovering: Bad file descriptor
warn: prefork: select returned -1! recovering: Bad file descriptor
[...]
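
For illustration only: the log shows the warning starting right after child
745 (fd=8) is killed. This report does not show what SpamdForkScaling.pm does
internally, so the following Python sketch is not spamd's code. It only
demonstrates the general behaviour of select(): if a closed descriptor stays
in the watched set, every retry with the same set fails with "Bad file
descriptor", and the call succeeds again only after the stale descriptor is
removed. The pipes and helper names are hypothetical stand-ins.

```python
import errno
import os
import select


def select_errno(fds):
    """Call select() once on fds; return None on success, else the errno."""
    try:
        select.select(fds, [], [], 0)
        return None
    except OSError as exc:
        return exc.errno


def is_open(fd):
    """Return True if fd refers to an open descriptor."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False


# Two pipes stand in for parent-to-child channels (hypothetical setup).
r_live, w_live = os.pipe()
r_dead, w_dead = os.pipe()
watched = [r_live, r_dead]

before = select_errno(watched)  # both descriptors open: no error

# One "child" goes away and its descriptors are closed,
# but the watched set is left unchanged.
os.close(r_dead)
os.close(w_dead)

# Retrying with the same set fails the same way every time.
retries = [select_errno(watched) for _ in range(3)]

# Dropping closed descriptors from the set lets select() succeed again.
watched = [fd for fd in watched if is_open(fd)]
after = select_errno(watched)

os.close(r_live)
os.close(w_live)
```

If something like this is happening in the parent, retrying select() without
rebuilding the descriptor set would never recover, which would match the
endless repetition in the log.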
Comment 2 Helge Oldach 2006-11-24 01:59:24 UTC
Same issue here, under FreeBSD. Actually it turned up after the commit from bug 
5494 was applied. This very much sounds like this bug was introduced by the fix.

The specific error messages are different, but IMHO they point to the same root cause:

Nov 24 02:34:02 merak spamd[72062]: prefork: sysread(7) failed after 300 secs 
at /usr/local/lib/perl5/site_perl/5.8.8/Mail/SpamAssassin/SpamdForkScaling.pm 
line 575.

etc.

Note this is Perl 5.8.8, so it doesn't appear to be Perl-version specific.
Comment 3 Helge Oldach 2006-11-24 02:01:25 UTC
Sorry that was bug 4594.
Comment 4 Helge Oldach 2006-11-24 02:17:55 UTC
And similar to bug 4594, --round-robin solves the issue.
Comment 5 Daryl C. W. O'Shea 2006-12-08 10:16:54 UTC
(In reply to comment #2)
> Same issue here, under FreeBSD. Actually it turned up after the commit from bug 
> 4594 was applied. This very much sounds like this bug was introduced by the fix.

3.1.4 doesn't have the fix from bug 4594, so no.
Comment 6 Daryl C. W. O'Shea 2007-03-05 12:18:41 UTC
Marking as duplicate of bug 5313 since it's the same thing but with error output
of a newer code version.

*** This bug has been marked as a duplicate of 5313 ***