Bug 2578 - Spamd closes connection too early
Summary: Spamd closes connection too early
Status: RESOLVED INVALID
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: spamc/spamd (show other bugs)
Version: 2.60
Hardware: Other other
: P5 normal
Target Milestone: 2.70
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-10-10 06:15 UTC by Kurt Huwig
Modified: 2003-10-10 23:05 UTC (History)
0 users



Attachment Type Modified Status Actions Submitter/CLA Status
TCPDUMP dump of 2 correct and one failed connection (the last) application/octet-stream None Kurt Huwig [NoCLA]

Note You need to log in before you can comment on or make changes to this bug.
Description Kurt Huwig 2003-10-10 06:15:37 UTC
I'm running

||/ Name           Version        Description
+++-==============-==============-============================================
ii  spamassassin   2.60-1         Perl-based spam filter using text analysis

on Debian 3.0. We do not use spamc, but our own software to access spamd using
the spamc protocol. Unfortunately, spamd seems to close the connection too
early. About one out of 3-5 connection is terminated too early. I did an tcpdump
of the connections, that I will attach to this bug. It shows that the 'normal'
connections answer with:

SPAMD/1.1 0 EX_OK
Spam: True ; 15.0 / 5.0

While the failing connections only do

SPAMD/1.1 0 EX_OK

and then close the connection. Both lines are sent in separate TCP/IP-packages,
probably due to 'client->autoflush(1)'. I had a look at the source, but did not
find where the connection is closed. IMHO, this looks like a race condition
where our code is simply to slow to read fast enough, so that the close is done
before the rest of the buffer has been sent.
Comment 1 Kurt Huwig 2003-10-10 06:20:09 UTC
Created attachment 1477 [details]
TCPDUMP dump of 2 correct and one failed connection (the last)

Open this with ethereal, select the first package and do 'Tools - Follow TCP
Stream'. This is what it should look like (scroll to the end to see the
response of spamd).

Now close the window with the stream and click on 'Reset' in the status bar of
ethereal to see all packages again. Then select the last package (time 1548.xx)
and follow TCP Stream again. This connection failed, closing the connection
before sending the rating.
Comment 2 Kurt Huwig 2003-10-10 07:17:50 UTC
I just changed the socket-close code to:

    if (defined $client) {
        logmsg "socket was open";
        $client->close; }      # paranoia vs fd leaks

Which gives this log:

Oct 10 16:02:48 mail spamd[10733]: connection from mail.rk4.huwig.de
[212.88.142.131] at port 51491
Oct 10 16:02:48 mail spamd[10733]: socket was open
Oct 10 16:02:48 mail spamd[11626]: Still running as root: user not specified
with -u, not found, or set to root.  Fall back to nobody.
Oct 10 16:02:48 mail spamd[11626]: checking message
<1065793730.2516.15.camel@localhost.localdomain> for (unknown):65534.
Oct 10 16:02:48 mail spamd[11626]: clean message (0.0/5.0) for (unknown):65534
in 0.0 seconds, 3158 bytes.

As you can see, $client->close() is called before the 'clean message' has been
sent. I just moved the $client->close() from there to the 'check' method just
after loggin to syslog. Maybe this helps.
Comment 3 Fred T 2003-10-10 07:30:51 UTC
Could this be a problem with the server running spamD?  I've re-created a 
program using the spamD protocol and I have never seen any issues like this.  
It might also be an issue with your program?

My Source code here:
http://sf.net/projects/winspamc

Can you reproduce this issue with a certian message or only while under a heavy 
load?  Do you set maxchildren of SpamD?  Which type of reporting are you using, 
ie:
CHECK
SYMBOLS
REPORT
REPORT_IFSPAM
PROCESS

Have you run spamD in debug mode?  Do you see anything strange while doing that?

I'm sure the real dev team will have more to ask, but this information would be 
helpful.
Comment 4 Jason Brunette 2003-10-10 07:39:38 UTC
I'm not sure if this is related or not, but ever since we moved our spamd server 
to a remote machine, spamc occasionally logs this to syslog:

spamc[10677]: failed sanity check, 5504 bytes claimed, -1 bytes seen

Reviewing the spamc code, this could only happen if spamc received the "SPAMD" 
header (including Content-Length, etc) but was unable to retrieve the scanned 
message (using "process" method, not "check").  I'm guessing that read() called 
on a socket that was closed by the remote end would return -1, correct?

I looked over the logs (a years worth) of running spamd on the same machine as 
spamc and I never saw the "failed sanity check" error.  The instant we moved 
spamd to a different server, we started getting these errors.

spamc machine is RH 7.2, spamd is RH 9.0.
Comment 5 Kurt Huwig 2003-10-10 08:21:42 UTC
I guess this is a timing issue, so it may be hardware dependent, as well as
client dependent. The problem appears on light load. There is about one message
every 5-10 minutes. The machine does other things, but

 17:16:30 up 4 days, 21 min,  1 user,  load average: 0.04, 0.04, 0.00

most of the time it is idle. I am using CHECK, as you can see from the tcpdump
output. I just activated debug mode and wait for another occurence.

> spamc[10677]: failed sanity check, 5504 bytes claimed, -1 bytes seen

-1 means (man 2 read):

"On error, -1 is returned"

This looks like the error I have:

java.net.SocketException: Connection reset
        at java.net.SocketInputStream.read(SocketInputStream.java:168)
        at java.net.SocketInputStream.read(SocketInputStream.java:182)

The tcpdump shows that there is a rst indeed.
Comment 6 Kurt Huwig 2003-10-10 08:45:04 UTC
OK, I just found a bug in my software: I did not read the output from spamd till
the end, therefore the RST of the connection. I changed this to something that
reads till the end. Moving the $client->close() has proven to be fatal for this
change, so I switched back to original 2.60.

If this was the reason for this, I would like to ask why the code fails if I do
not read all bytes sent to me. It seems like the RST of on connection affects
the other connection. If this is true, this might be used for some DoS.
Comment 7 Kurt Huwig 2003-10-10 09:39:52 UTC
I think I found it. My original code did not send enough bytes some time ago. So
I "fixed" it by adding a CRLF to each mail. After correcting this bug, the two
bytes were still there. This is what I think had happened:

I send 'Content-length' + 2 bytes to spamd. The 2 addition bytes came in a
separate tcp-package. Spamd receives 'Content-length' bytes and tries to send
the answer. The answer consists of two lines, each sent within a separate
tcp-package. That happened was, that sometimes, the 2 additional bytes were
received, after the  '$client->close()' happened, but before the second line has
been sent. That means that the socket was in the state 'I do not want any
additional data' but the second line was still in the send buffer. The two bytes
then causes a RST, taking down the connection and apparently discarding the
second line.

Lesson learned: do not send more than content-length bytes!

I'll let the new version run for some time. If everything works fine, I'll close
this bug.

Thanks for your help and the good product!
Comment 8 Kurt Huwig 2003-10-11 07:05:21 UTC
It seems to work now.