SA Bugzilla – Bug 284
better invalid date rule
Last modified: 2002-05-17 16:29:13 UTC
This patch adds an improved invalid date rule that catches spam at a very high rate. I tested it on unfiltered mail and it caught a lot of bad messages (plus mail from a few REALLY crappy mailers, but very few). It seems to be much better than the existing rules. I also removed the rules that are no longer needed.
Created attachment 94 [details] the patch
Some tests: I ran the old rules (the ones replaced or removed) and the new rule against all of the non-mailing-list email (spam + non-spam) received at one of my addresses from Jan 2002 to Apr 2002 (1789 messages total). The new INVALID_DATE rule matched 19 messages. All 19 were spam. The original INVALID_DATE rule matched 0 messages. The removed INVALID_DATE_NO_TZ rule matched 9 messages, and the same 9 messages were also matched by the new INVALID_DATE rule. The removed INVALID_DATE_ODD_MONTH rule matched 0 messages, so it's fairly useless, and the new rule is stricter about months anyway. Seems good to me. :-)
Corpus says:

bash2.05 craig@balam ~/code/spamassassin % fgrep INVALID masses/freqs
13355 13292    63 INVALID_DATE_TZ_ABSURD
 3874  3715   159 INVALID_DATE_NO_TZ
 3793  3677   116 INVALID_MSGID
 1157  1084    73 INVALID_DATE_ODD_MONTH
  115   115     0 INVALID_DATE

I'll try the new rule you've attached and see how it goes.
Oops -- meant to take it, not accept on behalf of the list
I hope you don't mind if I grab this bug.
Changed proposed rule to be more picky about timezones so AM|PM would not be considered valid. Also allow comments and a bit of space at end of line, but not much else (picks up more spam).
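
To make the behaviour described above concrete, here is a rough sketch in Python of what a strict Date header check along these lines might look like. It is not the regex from the attached patch (which is not reproduced in this thread); the names STRICT_DATE and is_invalid_date, the exact set of accepted timezone forms, and the limit of three trailing whitespace characters are all assumptions made for illustration.

```python
import re

# Hypothetical strict pattern for a Date header value. NOT the pattern from
# the attached patch; it only illustrates the ideas in the comment above.
STRICT_DATE = re.compile(
    r"""
    ^\s*
    (?:(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun),\s+)?                # optional day of week
    \d{1,2}\s+                                              # day of month
    (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+  # strict month names
    (?:\d{2}|\d{4})\s+                                      # 2- or 4-digit year
    \d{1,2}:\d{2}(?::\d{2})?\s+                             # time, seconds optional
    (?:[+-]\d{4}|UT|GMT|[ECMP][SD]T|[A-IK-Z])               # timezone; AM/PM do not fit
    (?:\s+\([^()]*\))?                                      # optional trailing comment
    \s{0,3}$                                                # a bit of space, nothing else
    """,
    re.VERBOSE,
)


def is_invalid_date(value: str) -> bool:
    """Return True when the header value fails the strict pattern (assumed rule)."""
    return STRICT_DATE.match(value) is None


if __name__ == "__main__":
    samples = [
        "Fri, 17 May 2002 16:29:13 +0000",
        "Fri, 17 May 2002 16:29:13 -0400 (EDT)",
        "Fri, 17 May 2002 04:29:13 PM",
        "Fri, 17 May 2002 16:29:13",
        "17 Foo 2002 16:29:13 +0000",
    ]
    for sample in samples:
        print(is_invalid_date(sample), repr(sample))
```

In this sketch a value ending in AM or PM fails because neither fits any of the timezone alternatives, a missing timezone fails because the zone is mandatory, and anything after the optional parenthesised comment other than a few whitespace characters also fails.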