SA Bugzilla – Bug 850
Improve SUBJECT_FREQ
Last modified: 2002-12-18 21:24:43 UTC
The SUBJECT_FREQ rule seems likely to run into a lot of false positives, as I have these matching subjects in my spam corpus:

Subject: Earn 36% monthly through fully secured accounts receivable acquisitions
Subject: This could be your message viewed by millions daily21296
Subject: Refinance and Reduce Monthly Payments
Subject: RE: Monthly Cash Deposited-->$16,468 1620CmYu9--9
Subject: (WSCH.OB) Weekly Hot Stock ATRS
Subject: Monthly income $. 2,500 or more!!!
Subject: Earn $54,420.00 Monthly For Practically Doing NOTHING
Subject: Earn $3 to $5 for each envelope you stuff! $2,000+ weekly.
Subject: Sick of the daily grind? Getaway now
Subject: Earn $54,420.00 Monthly For Practically Doing NOTHING
Subject: re: eBay, MAKE A SERIOUS MONTHLY INCOME 5780

It wouldn't be that hard to do if you could match multiple headers in one header rule (or the same header multiple times); otherwise it would require meta rules, and that just seems a bit messy.
It's just not a particularly good rule. Our methods for compensating for newsletters leave much to be desired.

  OVERALL%   SPAM%  NONSPAM%   S/O   SCORE  NAME
     11424    3726      7698  0.33    0.00  (all messages)
   100.000  32.616    67.384  0.33    0.00  (all messages as %)
     0.902   0.429     1.130  0.28   -1.92  SUBJECT_FREQ

It's hard to believe that we have a rule in there assigning -1.92 to 0.429% of my spam.
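For readers checking the figures quoted above, here is a minimal Python sketch of how the S/O column appears to be derived. It assumes S/O is the spam hit rate divided by the sum of the spam and nonspam hit rates; that assumption reproduces the quoted values but is not taken from the SpamAssassin source. The function name `so_ratio` is hypothetical.

```python
def so_ratio(spam_pct: float, nonspam_pct: float) -> float:
    """Share of a rule's hits that land on spam, from per-class hit rates.

    Assumption: S/O = SPAM% / (SPAM% + NONSPAM%).
    """
    total = spam_pct + nonspam_pct
    if total == 0:
        raise ValueError("rule never fired, S/O is undefined")
    return spam_pct / total


# SUBJECT_FREQ row: SPAM% = 0.429, NONSPAM% = 1.130
print(round(so_ratio(0.429, 1.130), 2))  # 0.28
```

Under this reading, an S/O near 0 means the rule fires almost only on nonspam, and an S/O near 1 means it fires almost only on spam.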
From bug #933 (a duplicate)
--------------------------------------------------------------------------------
Removed SUBJECT_FREQ from HEAD cvs.

hit frequencies:

  OVERALL%   SPAM%  NONSPAM%   S/O   RANK  SCORE  NAME
     0.849   0.585     0.901  0.39   0.33   0.00  SUBJECT_FREQ

test code from all files in rules dir:

  header SUBJECT_FREQ Subject =~ /\b(?:monday|daily|weekly|monthly)\b/i
  describe SUBJECT_FREQ Subject contains a frequency - probable newsletter
  tflags SUBJECT_FREQ nice

If you want to re-add this test to SpamAssassin, please follow up this bug entry, improving the code until the S/O ratio goes above 0.7 (or below 0.3 for nice tests).

(automated submission)
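To illustrate why the quoted pattern fires on spam, the following Python sketch applies an equivalent regular expression to a few of the spam subjects reported in this bug. It is only an approximation: SpamAssassin evaluates the rule with Perl's regex engine, and the helper name `hits_subject_freq` is hypothetical.

```python
import re

# Python equivalent of the pattern in the quoted header rule (Perl /i flag).
SUBJECT_FREQ = re.compile(r"\b(?:monday|daily|weekly|monthly)\b", re.IGNORECASE)


def hits_subject_freq(subject: str) -> bool:
    """Return True if the subject contains one of the frequency words."""
    return SUBJECT_FREQ.search(subject) is not None


# Spam subjects taken from the original report in this bug.
spam_subjects = [
    "Refinance and Reduce Monthly Payments",
    "(WSCH.OB) Weekly Hot Stock ATRS",
    "Sick of the daily grind? Getaway now",
    "re: eBay, MAKE A SERIOUS MONTHLY INCOME 5780",
]

for subject in spam_subjects:
    print(hits_subject_freq(subject), subject)
```

Each of these subjects matches because the pattern looks only for a standalone frequency word and takes no account of the rest of the subject line.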
*** Bug 933 has been marked as a duplicate of this bug. ***
not up to scratch -- dropping