SA Bugzilla – Bug 4455
--reuse should only reuse when X-Spam-Status present
Last modified: 2005-07-05 13:03:12 UTC
Currently, --reuse disables all reusable rules by setting their scores to 0. Instead, it should attempt to run those tests when the message has no X-Spam-Status header. This will skew results for mass-checks, but hopefully not enough to invalidate them for 3.1.0.
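To make the proposed behaviour concrete, here is a rough sketch in Python (not SpamAssassin's actual Perl code). The function names and the example rule name are only illustrative assumptions; the point is just that the reuse decision is made per message, based on whether an X-Spam-Status header is present.

```python
from email import message_from_string


def has_spam_status(raw_message):
    """Return True if the message carries an X-Spam-Status header."""
    msg = message_from_string(raw_message)
    return msg.get("X-Spam-Status") is not None


def plan_rules(raw_message, reusable_rules):
    """Decide, per message, what to do with the reusable rules.

    Hypothetical helper: with the header present, the hits recorded at
    delivery time are reused; without it, the rules are run live instead
    of being zeroed out.
    """
    if has_spam_status(raw_message):
        return {"reuse": sorted(reusable_rules), "run": []}
    return {"reuse": [], "run": sorted(reusable_rules)}


if __name__ == "__main__":
    scanned = "X-Spam-Status: Yes, score=7.1\nSubject: hi\n\nbody\n"
    unscanned = "Subject: hi\n\nbody\n"
    rules = {"RCVD_IN_XBL"}  # illustrative rule name only
    print(plan_rules(scanned, rules))
    print(plan_rules(unscanned, rules))
```

The trade-off stays the same as described above: messages that fall into the "run" branch are checked against today's network data, not the data from when the mail arrived.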
If people running mass-checks have large sections of mail without X-Spam-Status headers, perhaps those sections should be run without --reuse. This is non-trivial. Of course, it's much easier to make sure your corpus has real-time X-Spam-Status headers.
My corpus is a mixture of both. There's the mail that comes into my "good" addresses, which gets run through SA and sorted based on the score. Then there's the mail to the "bad" addresses, which have never been legitimately used; it gets forwarded into my corpus without touching SA, thus saving my poor stuttering mail server some cycles. Of the 17,450 spams in my June '05 folder, 9,269 have X-Spam-Status. Which skew is "better"? Running against the current DNSBL databases, or using the "real time" values, but ignoring the hits for almost half my spam corpus?
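For anyone wanting to check their own corpus the same way, a quick sketch in Python follows. The helper names are made up for illustration, and it assumes the messages are already available as raw strings; it is not part of mass-check.

```python
from email import message_from_string


def header_coverage(raw_messages):
    """Count how many messages carry an X-Spam-Status header.

    Returns (with_header, total). Hypothetical helper for illustration.
    """
    total = 0
    with_header = 0
    for raw in raw_messages:
        total += 1
        if message_from_string(raw).get("X-Spam-Status") is not None:
            with_header += 1
    return with_header, total


def split_percentages(with_header, total):
    """Return (% with header, % without header), rounded to one decimal."""
    if total == 0:
        return 0.0, 0.0
    reused = 100.0 * with_header / total
    return round(reused, 1), round(100.0 - reused, 1)


if __name__ == "__main__":
    # Figures from the June '05 folder described above.
    print(split_percentages(9269, 17450))  # (53.1, 46.9)
```

With the numbers above, 8,181 of the 17,450 spams (about 46.9%) have no X-Spam-Status header.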
*** This bug has been marked as a duplicate of 4461 ***