Bug 5800 - RFE: rule DATE_CONTAINS_TAB (rule included)
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Rules
Version: 3.2.4
Hardware: Other other
Importance: P5 normal
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
 
Reported: 2008-01-26 21:14 UTC by Karsten Bräckelmann
Modified: 2008-02-21 15:12 UTC

Description Karsten Bräckelmann 2008-01-26 21:14:03 UTC
Reminded and encouraged by the recent mailing list thread about
DOS_OUTLOOK_TO_MX, I want to propose a new rule.

I posted this rule to the mailing list a couple of weeks ago, and it has been
hitting like crazy on my particular spam ever since. Maybe I'm just lucky to
be tortured by a specific spammer... ;)

First, some interesting observations:

* DATE_CONTAINS_TAB almost exclusively comes with a faked The Bat! MUA header.

* The recent home.graffiti.net URI abuse flood *exclusively* uses both a
  faked The Bat! MUA header and a Date header containing a tab.


The basic rule is quite simple, and probably should apply to a lot of headers. I
have not seen it for headers other than Date, though. Granted, I have not had a
close look yet.

header   KB_DATE_CONTAINS_TAB   Date:raw =~ /^ \t/
describe KB_DATE_CONTAINS_TAB   Header: Date header starts with Tab


Within the last 2 weeks, this hit on 25% of my spam scoring 5-10 and 42% of my
spam scoring 10-15. It's about 15% for higher scoring spam. (Note: These results
include some special, custom crafted rules which apply to my env only.) This
definitely hits on the sneaky, low scorers for me.


I have additional rules to score extra points if the MUA is faked to be The
Bat!, and a user of that MUA told me that it has never generated such headers
for him. I have not contacted the authors to verify, though.

I also have an additional rule to flame spam with a graffiti.net URI and such
headers. But that probably is just a temporary issue. I still do hope they will
stop that abuse...
Comment 1 Karsten Bräckelmann 2008-01-27 20:50:23 UTC
Adjusting subject. Yes, the rule is included. ;)
Comment 2 Karsten Bräckelmann 2008-02-03 17:41:38 UTC
Adjusting Severity, since this actually seems to be commonly used for new rules.
Comment 3 Karsten Bräckelmann 2008-02-16 17:55:26 UTC
*ping*

Anyone interested?  It's a simple rule that hits on a lot of spam for me, and
actually never should trigger on any ham at all.
Comment 4 Theo Van Dinter 2008-02-16 21:36:08 UTC
fyi, after removing the space (just /^\t/) I get:

  3.957   4.2692   0.0000    1.000   0.33    1.00  KB_DATE_CONTAINS_TAB
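
For readers unfamiliar with the format: the line above looks like one row of a
hit-frequencies style report. A minimal Python sketch for reading it, assuming
the usual column order OVERALL%, SPAM%, HAM%, S/O, RANK, SCORE, NAME. The column
names are not shown in the comment itself, so they are an assumption here, and
the helper names are made up for illustration.

```python
from typing import NamedTuple


class FreqRow(NamedTuple):
    # Field names are assumed from the usual report header;
    # they do not appear in the comment above.
    overall_pct: float
    spam_pct: float
    ham_pct: float
    s_o: float
    rank: float
    score: float
    name: str


def parse_freq_row(line: str) -> FreqRow:
    """Split one whitespace-separated report row into typed fields."""
    *numbers, name = line.split()
    if len(numbers) != 6:
        raise ValueError(f"expected 6 numeric columns, got {len(numbers)}")
    return FreqRow(*(float(n) for n in numbers), name)


row = parse_freq_row(
    "  3.957   4.2692   0.0000    1.000   0.33    1.00  KB_DATE_CONTAINS_TAB"
)
print(row.name, row.spam_pct, row.ham_pct)
```

Read under that assumption, the rule hit roughly 4.27% of the spam and none of
the ham in the corpus that was checked.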
Comment 5 Karsten Bräckelmann 2008-02-17 04:01:32 UTC
Thanks, Theo.  So at least, it doesn't hit any ham. ;)

No, seriously, this is weird. Actually, that was my first approach too, as
I figured SA would remove the single delimiting space before matching. It does
for other headers. I just checked some recent spam again with a few variants.
Neither  Date:raw =~ /^\t/  nor  Date =~ /^\t/  matches for me. The Date:raw
rule with the space as mentioned in comment 0 does, though...

And yes, I just verified again by grepping through the headers. These headers
indeed are exactly /^Date: \t/.  Stumped.
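
One way to picture the discrepancy, as a minimal Python sketch and not
SpamAssassin's actual code: which pattern hits depends on whether the rule sees
the header value with or without the single space that follows the colon. The
sample header line and the helper header_value are hypothetical.

```python
import re

# Hypothetical sample line, shaped like the headers described above.
RAW_LINE = "Date: \tSat, 26 Jan 2008 21:14:03 +0000"

WITH_SPACE = re.compile(r"^ \t")  # pattern from comment 0
TAB_ONLY = re.compile(r"^\t")     # pattern from comment 4


def header_value(line: str, strip_delimiter: bool) -> str:
    """Return the text after 'Name:', optionally dropping one delimiting space."""
    value = line.split(":", 1)[1]
    if strip_delimiter and value.startswith(" "):
        value = value[1:]
    return value


for strip in (False, True):
    value = header_value(RAW_LINE, strip)
    print(strip, bool(WITH_SPACE.search(value)), bool(TAB_ONLY.search(value)))
```

With the delimiting space kept, only the /^ \t/ variant matches; with it
removed, only /^\t/ does. That would be consistent with the two of us seeing
opposite results on different SA versions.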


(Oh, and it still hits an order of magnitude higher on my incoming stream. Maybe
I'm just lucky, and some particular spammer loves me... *shrug*)
Comment 6 Loren Wilton 2008-02-17 11:27:13 UTC
I believe there was a recent bugfix to remove the leading space that was 
showing up in header rules.
Comment 7 Karsten Bräckelmann 2008-02-20 11:15:45 UTC
I just noticed, Justin already added KB_DATE_CONTAINS_TAB and KB_FAKED_THE_BAT
to jm/20_basic.cf based on my earlier post to the list.  Nice. :)

However, the ruleqa results are quite unsatisfying. I believe the reason is
the recent change WRT leading whitespace, as mentioned by Loren and Theo. Still,
it hits at least some spam -- due to an out-of-date mass-check env somewhere?

With the whitespace fix in place (assuming it only strips the first space,
rather than all leading whitespace), the adjusted rule Theo mentioned should hit
way better. Justin, can you please tweak this?
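
The assumption in the parenthesis matters. A small Python sketch, not
SpamAssassin's implementation, compares two hypothetical normalizations:
dropping only the first space keeps the tab for /^\t/ to match, while stripping
all leading whitespace would remove the tab and leave the rule unable to hit.
The function names and the sample value are made up for illustration.

```python
import re

TAB_AT_START = re.compile(r"^\t")  # adjusted pattern from comment 4


def strip_first_space(value: str) -> str:
    """Hypothetical: drop only the single space that follows the colon."""
    return value[1:] if value.startswith(" ") else value


def strip_all_leading(value: str) -> str:
    """Hypothetical: drop every leading space and tab."""
    return value.lstrip(" \t")


# Made-up header value, as it would follow "Date:" in the raw message.
sample = " \tSat, 26 Jan 2008 21:14:03 +0000"

hits_first_only = bool(TAB_AT_START.search(strip_first_space(sample)))
hits_strip_all = bool(TAB_AT_START.search(strip_all_leading(sample)))
print(hits_first_only, hits_strip_all)
```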

Also, I believe __THEBAT_MUA in 20_ratware.cf should be anchored at the
beginning.
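
To illustrate why anchoring helps, here is a Python sketch with hypothetical
patterns; the real __THEBAT_MUA definition is not quoted in this report. An
unanchored pattern also matches when the MUA name appears somewhere later in
the X-Mailer value, while an anchored one matches only when the value starts
with it. Both sample values are made up.

```python
import re

# Hypothetical patterns; the actual __THEBAT_MUA regex is not shown here.
UNANCHORED = re.compile(r"The Bat!")
ANCHORED = re.compile(r"^The Bat!")

# Made-up X-Mailer values for illustration.
genuine_looking = "The Bat! (v3.99.3) Professional"
name_in_middle = "SomeMailer 1.0 (compatible; The Bat! 3.99)"

for value in (genuine_looking, name_in_middle):
    print(bool(UNANCHORED.search(value)), bool(ANCHORED.search(value)), value)
```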


Oh, and is there any way to get ahold of the 2 *hams* with a tab in the Date
header, but at least not sent by The Bat? This strikes me as odd...
Comment 8 Justin Mason 2008-02-21 07:49:14 UTC
(In reply to comment #7)
> I just noticed, Justin already added KB_DATE_CONTAINS_TAB and KB_FAKED_THE_BAT
> to jm/20_basic.cf based on my earlier post to the list.  Nice. :)
> 
> However, the ruleqa results are quite unsatisfying. I believe the reason is
> the recent change WRT leading whitespace, as mentioned by Loren and Theo. Still,
> it hits at least some spam -- due to an out-of-date mass-check env somewhere?

Daryl, you may want to check this, it appears to be your corpus.

> With the whitespace fix in place (assuming it only strips the first space,
> rather than all leading whitespace), the adjusted rule Theo mentioned should hit
> way better. Justin, can you please tweak this?

done:

: jm 163...; svn commit -m "bug 5800: fix bug in KB_DATE_CONTAINS_TAB rule" rulesrc/sandbox/jm/20_basic.cf
Sending        rulesrc/sandbox/jm/20_basic.cf
Transmitting file data .
Committed revision 629833.

marking fixed, since this is now in.

> Also, I believe __THEBAT_MUA in 20_ratware.cf should be anchored at the
> beginning.

yep, well spotted. r629836.

> Oh, and is there any way to get ahold of the 2 *hams* with a tab in the Date
> header, but at least not sent by The Bat? This strikes me as odd...

Daryl -- these are yours again:

.  1
/home/dos/SA-corpus/ham/dos/Inbox-2007/1175734785.M236771P31456V0000000000000302I001C078D_78.cyan.dostech.net,S=18894:2,S
KB_DATE_CONTAINS_TAB,UPPERCASE_50_75,__CT,__CTYPE_HAS_BOUNDARY,__DOS_RCVD_MON,__DOS_RELAYED_EXT,__ENV_AND_HDR_FROM_MATCH,__HAS_MSGID,__HAS_RCVD,__HAS_SUBJECT,__HAS_X_MAILER,__INR_AND_NO_REF,__LAST_UNTRUSTED_RELAY_NO_AUTH,__MIME_BASE64,__MIME_VERSION,__MIME_VERSION_APPLEMAIL,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__MSGID_APPLEMAIL,__MSGID_OK_HOST,__MSOE_MID_WRONG_CASE,__NAKED_TO,__NONEMPTY_BODY,__NUMBERS_IN_SUBJ,__PART_STOCK_CD_F,__RELAY_MUA_HELO_IP_OR_NONE,__SANE_MSGID,__TOCC_EXISTS,__TVD_BODY,__TVD_MIME_ATT,__TVD_MIME_ATT_AP,__TVD_MIME_ATT_TP,__TVD_MIME_CT_MM,__UPPERCASE_50_75,__USER_AGENT_APPLEMAIL,__XM_APPLEMAIL,__X_MAILER_APPLEMAIL
time=1175545146,scantime=0,format=f,reuse=yes,set=1,host=injector.georgianbayplastics.com
.  1
/home/dos/SA-corpus/ham/dos/infra-list/1196034235.M76328P8771V0000000000000302I008D0ED5_3.cyan.dostech.net,S=2822:2,S
KB_DATE_CONTAINS_TAB,T_RP_MATCHES_RCVD,T_SIDNEY__GATED_THROUGH_RCVD_REMOVER,T_SIDNEY__LYRIS_EZLM_REMAILER,T_SIDNEY__UNUSABLE_MSGID,__CD,__CT,__CTE,__CT_TEXT_PLAIN,__DOS_HAS_ANY_URI,__DOS_HAS_LIST_ID,__DOS_HAS_LIST_UNSUB,__DOS_HAS_MAILING_LIST,__DOS_RCVD_SUN,__DOS_RELAYED_EXT,__DOS_SINGLE_EXT_RELAY,__FH_HAS_XPRIORITY,__GATED_THROUGH_RCVD_REMOVER,__HAS_ANY_EMAIL,__HAS_ANY_URI,__HAS_MSGID,__HAS_RCVD,__HAS_SUBJECT,__HAS_X_MAILER,__LAST_UNTRUSTED_RELAY_NO_AUTH,__LOCAL_PP_NONPPURL,__LYRIS_EZLM_REMAILER,__MIME_VERSION,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__MSOE_MID_WRONG_CASE,__NAKED_TO,__NONEMPTY_BODY,__SANE_MSGID,__TOCC_EXISTS,__TVD_BODY,__TVD_MIME_ATT_TP,__UNUSABLE_MSGID
time=1196026181,scantime=0,format=f,reuse=no,set=0,host=pe840-c2

Comment 9 Daryl C. W. O'Shea 2008-02-21 15:12:55 UTC
(In reply to comment #8)
> Daryl, you may want to check this, it appears to be your corpus.

> /home/dos/SA-corpus/ham/dos/Inbox-2007/1175734785.M236771P31456V0000000000000302I001C078D_78.cyan.dostech.net,S=18894:2,S

Ham (likely auto generated):

Mime-Version: 1.0 (Apple Message framework v752.2)
X-Mailer: Apple Mail (2.752.2)

> /home/dos/SA-corpus/ham/dos/infra-list/1196034235.M76328P8771V0000000000000302I008D0ED5_3.cyan.dostech.net,S=2822:2,S

Ham (onet.pl home grown webmail, I think):

X-Mailer: onet.poczta