Bug 284 - better invalid date rule
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Rules
Version: SVN Trunk (Latest Devel Version)
Hardware: Other other
Importance: P2 normal
Target Milestone: ---
Assignee: Daniel Quinlan
 
Reported: 2002-05-07 17:45 UTC by Daniel Quinlan
Modified: 2002-05-17 16:29 UTC (History)

Attachment Type Modified Status Actions Submitter/CLA Status
the patch patch None Daniel Quinlan [HasCLA]

Description Daniel Quinlan 2002-05-07 17:45:16 UTC
This patch adds an improved invalid date rule that matches spam very
frequently.  I tested it on unfiltered mail and it caught a lot of bad
messages, plus a very small number of messages from REALLY crappy mailers.

It seems to be much better than the existing rules.  I removed the rules
that were no longer needed as well.
Comment 1 Daniel Quinlan 2002-05-07 17:46:37 UTC
Created attachment 94
the patch
Comment 2 Daniel Quinlan 2002-05-07 23:16:59 UTC
Some tests:

I ran the old rules (replaced or removed) and the new rule against all of
the non-mailing-list email (spam + non-spam) received at one of my
addresses from Jan 2002 to Apr 2002 (1789 messages total).

The new INVALID_DATE rule matched 19 messages.  All 19 were spam.

The original INVALID_DATE rule matched 0 messages.

The removed INVALID_DATE_NO_TZ rule matched 9 messages.  The same 9 messages
were also matched by the new INVALID_DATE rule.

The removed INVALID_DATE_ODD_MONTH rule matched 0 messages, so it's fairly
useless here.  The new rule is also stricter about months.

Seems good to me.  :-)
Comment 3 Craig Hughes 2002-05-09 03:21:33 UTC
Corpus says:

bash2.05 craig@balam ~/code/spamassassin % fgrep INVALID masses/freqs 
     13355       13292          63  INVALID_DATE_TZ_ABSURD
      3874        3715         159  INVALID_DATE_NO_TZ
      3793        3677         116  INVALID_MSGID
      1157        1084          73  INVALID_DATE_ODD_MONTH
       115         115           0  INVALID_DATE


I'll try the new rule you've attached and see how it goes.
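
For anyone reading the numbers above, here is a minimal Python sketch that parses those freqs lines and computes how much of each rule's hits were spam. The column meaning (total hits, spam hits, non-spam hits, rule name) is an assumption, inferred only from the first column being the sum of the next two. The names parse_freqs and spam_share are hypothetical and are not part of the masses tools.

```python
# Sketch only. Assumed column order: total, spam, non-spam, rule name.
# That order is inferred from column 1 == column 2 + column 3, not from
# any documentation of the masses/freqs file.
FREQS = """\
     13355       13292          63  INVALID_DATE_TZ_ABSURD
      3874        3715         159  INVALID_DATE_NO_TZ
      3793        3677         116  INVALID_MSGID
      1157        1084          73  INVALID_DATE_ODD_MONTH
       115         115           0  INVALID_DATE
"""


def parse_freqs(text):
    """Turn whitespace-separated freqs lines into a list of dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        total, spam, nonspam = (int(p) for p in parts[:3])
        rows.append({
            "rule": parts[3],
            "total": total,
            "spam": spam,
            "nonspam": nonspam,
            # fraction of this rule's hits that were spam
            "spam_share": spam / total if total else 0.0,
        })
    return rows


ROWS = parse_freqs(FREQS)

if __name__ == "__main__":
    for row in ROWS:
        print(f"{row['rule']:<24} {row['spam_share']:.4f}")
```

Under that column assumption, every rule listed hits mostly spam, and the old INVALID_DATE rule has no non-spam hits but also the fewest hits overall.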
Comment 4 Craig Hughes 2002-05-09 03:22:27 UTC
Oops -- meant to take it, not accept on behalf of the list
Comment 5 Daniel Quinlan 2002-05-17 16:06:12 UTC
I hope you don't mind if I grab this bug.
Comment 6 Daniel Quinlan 2002-05-18 00:29:13 UTC
Changed the proposed rule to be pickier about timezones so that AM|PM is
not considered valid.  It also allows comments and a bit of space at end of
line, but not much else (picks up more spam).