SA Bugzilla – Bug 3915
spamassassin sometimes skips to do bayes test
Last modified: 2005-05-24 15:14:10 UTC
I'm using spamassassin in .procmailrc:

    :0fw: .spamassassin.lock
    | /usr/bin/spamassassin

    :0:
    * ^X-Spam-Status: Yes
    mail/spamfolder

I'm using sa-learn, so the Bayes tests are quite important. It generally works, but sometimes junk breaks through to the inbox, and most of those false negatives have no BAYES_XX test. So sometimes spamassassin doesn't check an email for Bayes probability. Out of about 200 spams a day, about 10 reach the inbox, and half of them show the problem described above.
Additional information: it seems to happen only if the message is spam and isn't classified by spamassassin as spam (it scores less than 5 points). A normal message, e.g. from a friend, gets a tag such as BAYES_00. But a spam message sometimes has no BAYES_XX at all. Some of the messages classified as spam have no BAYES test either.
Not having a Bayes hit is not necessarily a bug. You need to run the message(s) through spamassassin in debug mode (-D) and see what comes up. It could just be that there aren't enough tokens in the message that are also in your DB, so Bayes is disabled for that message since there's no way to give a probability.
The problem is that I cannot reproduce the bug by hand. The same message that arrived without a BAYES test behaves properly when processed again. So the cause is unknown to me for now; it seems to be random. Because of that, I don't think it's related to not having enough tokens.
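To make the "not enough tokens" case concrete, here is a toy Python sketch. It is not SpamAssassin's implementation: the threshold `MIN_KNOWN_TOKENS`, the function `toy_bayes_score`, and the dictionary-based token database are all hypothetical names chosen for illustration, and the averaging step is deliberately naive. It only shows how a classifier can decline to return any probability when too few message tokens are present in the database.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical threshold for this sketch; not taken from SpamAssassin.
MIN_KNOWN_TOKENS = 3


def toy_bayes_score(
    message_tokens: Iterable[str],
    token_db: Dict[str, Tuple[int, int]],
) -> Optional[float]:
    """Return a toy spam probability, or None if too few tokens are known.

    token_db maps a token to (spam_count, ham_count); every entry is
    assumed to have at least one count.
    """
    known = [t for t in set(message_tokens) if t in token_db]
    if len(known) < MIN_KNOWN_TOKENS:
        # Too little evidence: report nothing rather than guess.
        return None
    per_token = []
    for token in known:
        spam_count, ham_count = token_db[token]
        per_token.append(spam_count / (spam_count + ham_count))
    # Naive average of per-token spam ratios (illustration only).
    return sum(per_token) / len(per_token)


db = {"viagra": (9, 1), "cheap": (8, 2), "hello": (1, 9)}
enough = toy_bayes_score(["viagra", "cheap", "hello"], db)
too_few = toy_bayes_score(["viagra", "unseen-token"], db)
```

In this sketch `too_few` is `None`, which corresponds to a message arriving with no BAYES_XX test at all, while `enough` is an ordinary probability.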
The message isn't being auto-learned the first time through, is it? This would explain bayes assigning a probability the second time through, since it would then have enough tokens.
Hmm, maybe. I have cleaned out my spam folder, so I can't say whether you're right or wrong. I'll wait for the next such incident and check whether the emails without BAYES have earlier equivalents that were classified properly.
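A check like this can be scripted. The sketch below uses only the Python standard library and assumes the mail is stored in mbox format and that spamassassin has written its test list into the X-Spam-Status header; the function names `bayes_tests` and `messages_without_bayes` are made up for this example.

```python
import mailbox
import re
from email import message_from_string
from email.message import Message
from typing import List

# Matches rule names such as BAYES_00 or BAYES_99 in the header value.
BAYES_RE = re.compile(r"\bBAYES_\d+\b")


def bayes_tests(msg: Message) -> List[str]:
    """Return the BAYES_* rule names listed in X-Spam-Status, if any."""
    status = str(msg.get("X-Spam-Status", ""))
    return BAYES_RE.findall(status)


def messages_without_bayes(mbox_path: str) -> List[str]:
    """Return the Message-ID of every message that has no BAYES_* test."""
    missing = []
    for msg in mailbox.mbox(mbox_path):
        if not bayes_tests(msg):
            missing.append(str(msg.get("Message-ID", "(no Message-ID)")))
    return missing


sample = message_from_string(
    "X-Spam-Status: No, hits=2.1 required=5.0 tests=BAYES_50,HTML_MESSAGE\n"
    "Subject: test\n\nbody\n"
)
found = bayes_tests(sample)
```

Running `messages_without_bayes` over the inbox and the spam folder would give a list of Message-IDs to compare against earlier arrivals.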
Created attachment 2482: first arrival, bayes tested
Created attachment 2483: second arrival, bayes not tested
Look at the attachments I made. They are almost identical, but the first was tested by Bayes and the second wasn't. So I don't think it's true that Bayes didn't have enough tokens.
The first message wasn't learned since it scored between the learning thresholds. The tokens matched in your database must have been from the headers which are vastly different from the second message. The examples don't show anything wrong.
Subject: Re: spamassassin sometimes skips to do bayes test

On Mon, Oct 25, 2004 at 05:06:29AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> The examples don't show anything wrong.

As usual, if there was debug output, we'd know what was going on.
OK, I figured out how to turn on debugging and use spamassassin at the same time. I'll report back if anything interesting shows up.

Re comment #9:
> The tokens matched in your database must have been from the headers which are
> vastly different from the second message.
> The examples don't show anything wrong.

I don't know why different headers are important here. I thought the Bayes test examines the whole body of the message, not only the headers (at least #1 at http://wiki.apache.org/spamassassin/BayesInSpamAssassin seems to say something like that). The bodies of both messages are mostly identical. The first message was classified by Bayes (so there were tokens in the DB); the second was not (it should have used the tokens matched in the first message).
> I don't know why different headers are important here. ...

Yes, Bayes tests the whole message, headers and body text, and it considers all of that. If in your first message the headers and body together indicated spam, but in the second message, with a similar body but very different headers, the body suggested spam and the headers suggested not-spam, then Bayes wouldn't make a determination.

Anyway, in October you were going to see if you could get full debug output, which would let us know whether Bayes was working and simply not able to make a determination on the "skipped" messages, or whether there was an actual problem. Were you able to do this and make a determination? Is there a problem that still needs to be pursued?
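The conflicting-evidence argument can be illustrated with a small Python sketch. This is a textbook naive-Bayes combination, not the combining method SpamAssassin itself uses, and the per-token probabilities are invented for the example. It shows how the same spammy body can end up near certainty with spammy headers and right at the undecided midpoint with hammy headers.

```python
from math import prod
from typing import Sequence


def naive_combine(token_probs: Sequence[float]) -> float:
    """Combine per-token spam probabilities with the naive Bayes formula.

    Each probability is assumed to lie strictly between 0 and 1.
    """
    spam = prod(token_probs)
    ham = prod(1.0 - p for p in token_probs)
    return spam / (spam + ham)


# Invented per-token probabilities, for illustration only.
body = [0.9, 0.9]             # body tokens that look spammy
spammy_headers = [0.9, 0.8]   # headers that also look spammy
hammy_headers = [0.1, 0.1]    # headers that look like normal mail

first = naive_combine(body + spammy_headers)   # very close to 1
second = naive_combine(body + hammy_headers)   # about 0.5: no clear verdict
```

With these numbers the body evidence and the header evidence cancel out exactly in the second case, so the combined value carries no usable signal either way.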
Triage: Closing as WORKSFORME only because there's been no debug output to work with, as offered in October and requested again last month. If someone can reproduce this and provide debug output, please reopen.