SA Bugzilla – Bug 1106
rules to detect forged MUAs
Last modified: 2002-12-18 21:27:37 UTC
The rule identifying Outlook Express as a non-spam MUA has a positive score. This is because the header is frequently forged. OE (and Outlook) have a readily recognised Message-Id format, which many spams claiming to have been sent with OE don't use.

I suggest a new meta rule to try to identify these. __HAS_OUTLOOK_IN_MAILER already exists; I've added __MSGID_MS_FORMAT and FAKED_MS_MUA rules to my local.cf as shown below:

header __MSGID_MS_FORMAT    Message-Id =~ /^<[0-9a-f]{12,12}\$[0-9a-f]{8,8}\$[0-9a-f]{8,8}\@.{1,50}>$/
describe __MSGID_MS_FORMAT  Message-Id is in standard Microsoft format
meta FAKED_MS_MUA           (__HAS_OUTLOOK_IN_MAILER && !__MSGID_MS_FORMAT)
describe FAKED_MS_MUA       Mailer claims to be Outlook/OE, but Message-Id is in wrong format
score FAKED_MS_MUA          1.0

Obviously a new run of the scoring system would be required.
(assigning to me)

Thanks. Seems like a good test to try. I had to make a few changes so far (also please don't put extra newlines into submissions; if you have trouble with cut-and-paste due to your browser, attachments are a good idea).

Here's the revised version (it just exempts Outlook IMO, which has a different header format).

header __OUTLOOK_EXCEPT_IMO       X-Mailer =~ /Microsoft Outlook(?! IMO)/
header __OUTLOOK_MSGID            Message-Id =~ /^<[0-9a-f]{12,12}\$[0-9a-f]{8,8}\$[0-9a-f]{8,8}\@.{1,50}>$/
meta T_FORGED_OUTLOOK_MAILER      (__OUTLOOK_EXCEPT_IMO && !__OUTLOOK_MSGID)
describe T_FORGED_OUTLOOK_MAILER  Forged mail pretending to be from Outlook/OE
score T_FORGED_OUTLOOK_MAILER     1.0

It works well:

OVERALL%   SPAM%  NONSPAM%   S/O   RANK  SCORE  NAME
  12402     4708     7694   0.38   0.00   0.00  (all messages)
100.000   37.962   62.038   0.38   0.00   0.00  (all messages as %)
  3.822    9.919    0.091   0.99   0.62   1.00  T_FORGED_OUTLOOK_MAILER

except for the troublesome false positives:

X-Mailer: Microsoft Outlook Express 6.00.2600.0000
Message-ID: <OE55z9IPUS9O4Tsvthl00001e4d@hotmail.com>
Message-ID: <20285A942B45D5118B1400A0D2A4615502858D@NTSERVER>
X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0)
Message-ID: <20285A942B45D5118B1400A0D2A4615502859C@NTSERVER>
X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0)
Message-ID: <20285A942B45D5118B1400A0D2A461550285A3@NTSERVER>
X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0)
X-Mailer: Microsoft Outlook Express 6.00.2600.0000
Message-ID: <OE692nnNf0GjFbznl740001024a@hotmail.com>
X-Mailer: Microsoft Outlook Express 6.00.2600.0000
Message-ID: <OE46gLzhVOJwnzsLqIf00000495@hotmail.com>
X-Mailer: Microsoft Outlook Express 6.00.2600.0000
Message-ID: <OE2143oFLaqSOc7wo9500001163@hotmail.com>

If I exclude those X-Mailers, my SPAM% goes from 9.919% to 6.096%, so I'd rather not exclude them, but this test might work a lot worse for other people.
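For anyone who wants to try the revised rule outside SpamAssassin, here is a minimal Python sketch of the same logic. The two patterns are copied from the rules above; the function name forged_outlook_mailer, the "well-formed" Message-ID and the IMO X-Mailer string are made up for illustration and are not taken from any corpus. It assumes Python's re module treats these particular patterns the same way Perl does, which should hold for the constructs used here.

```python
import re

# Patterns copied from the revised rules in this comment.
OUTLOOK_EXCEPT_IMO = re.compile(r"Microsoft Outlook(?! IMO)")
OUTLOOK_MSGID = re.compile(
    r"^<[0-9a-f]{12,12}\$[0-9a-f]{8,8}\$[0-9a-f]{8,8}\@.{1,50}>$"
)


def forged_outlook_mailer(x_mailer: str, message_id: str) -> bool:
    """Hypothetical helper mirroring (__OUTLOOK_EXCEPT_IMO && !__OUTLOOK_MSGID)."""
    claims_outlook = OUTLOOK_EXCEPT_IMO.search(x_mailer) is not None
    has_ms_msgid = OUTLOOK_MSGID.search(message_id) is not None
    return claims_outlook and not has_ms_msgid


# Two of the false positives listed above: both are flagged by the rule.
fp_hotmail = forged_outlook_mailer(
    "Microsoft Outlook Express 6.00.2600.0000",
    "<OE55z9IPUS9O4Tsvthl00001e4d@hotmail.com>",
)
fp_cws = forged_outlook_mailer(
    "Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0)",
    "<20285A942B45D5118B1400A0D2A4615502858D@NTSERVER>",
)

# Invented Message-ID in the 12$8$8 hex layout: not flagged.
well_formed = forged_outlook_mailer(
    "Microsoft Outlook Express 6.00.2600.0000",
    "<000a01c2a6d3$1b2c3d40$0100a8c0@example>",
)

# Illustrative IMO X-Mailer string: exempted by the negative lookahead.
imo = forged_outlook_mailer(
    "Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)",
    "<not-in-ms-format@example>",
)

print(fp_hotmail, fp_cws, well_formed, imo)
```

The hotmail.com and NTSERVER IDs fail the Message-ID pattern because they lack the two "$" separators, which is why those messages show up as false positives.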
I forgot to note, it's now in CVS for testing. So my cut-and-paste was for information, not actual use. ;-)
Here are more rules to test! The FPs for T_FORGED_MUA_IMS are all from the same person and site and even though the MUA version is the same as other non-FPs, it has a very different Message-ID format, so I suspect that he or his site is munging outgoing IDs which would make it usable.

OVERALL%   SPAM%  NONSPAM%   S/O   RANK  SCORE  NAME
  12402     4708     7694   0.38   0.00   0.00  (all messages)
100.000   37.962   62.038   0.38   0.00   0.00  (all messages as %)
  1.193    3.144    0.000   1.00   0.86   1.00  T_FORGED_MUA_MOZILLA
  0.839    2.209    0.000   1.00   0.84   1.00  T_FORGED_MUA_OIMO
  0.782    2.060    0.000   1.00   0.83   1.00  T_FORGED_MUA_AOL
  0.718    1.890    0.000   1.00   0.83   1.00  T_FORGED_MUA_EUDORA
  3.895   10.068    0.117   0.99   0.60   1.00  T_FORGED_MUA_OUTLOOK
  0.750    1.784    0.117   0.94   0.49   1.00  T_FORGED_MUA_IMS

All in CVS now.
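As a reading aid for the table, the S/O column appears to be SPAM% divided by (SPAM% + NONSPAM%). That formula is an inference from the rows shown here, not something stated in this bug, and the helper name s_over_o is invented. A small Python sketch that checks it against three rows:

```python
def s_over_o(spam_pct: float, nonspam_pct: float) -> float:
    """Assumed S/O definition: share of the hit rate that comes from spam."""
    return spam_pct / (spam_pct + nonspam_pct)


# (SPAM%, NONSPAM%) pairs copied from the table in this comment.
rows = {
    "(all messages as %)": (37.962, 62.038),
    "T_FORGED_MUA_OUTLOOK": (10.068, 0.117),
    "T_FORGED_MUA_IMS": (1.784, 0.117),
}

results = {name: f"{s_over_o(spam, nonspam):.2f}" for name, (spam, nonspam) in rows.items()}
for name, value in results.items():
    print(f"{value}  {name}")
```

Under that assumption the three rows come out as 0.38, 0.99 and 0.94, matching the S/O values in the table.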
Your rule for Outlook IMO is too strict (or it would be if the . in [a-z.] were quoted :-) Would /^<[A-P]{28}\.[a-zA-Z_\.]+\@\S+>$/ be better (you can definitely have capitals and underscore here)? Also, I note that you're using \S+ as the right-hand side of the message ID. Someone else pointed out privately that my \@.{1,50} looked odd (for __OUTLOOK_MSGID) - should that be changed to \S+ too?
Created attachment 401
Message-ID format rules for Mutt and The Bat!
Above attachment contains a set of rules for The Bat! and Mutt. I've seen a fair number of spams claiming to be from The Bat! Mutt is for completeness. It possibly isn't worth doing other MUAs at the moment, since most of them don't seem to have any spam filed against them (at least, not according to STATISTICS.TXT).
Subject: Re: [SAdev] rules to detect forged MUAs

martin-sabz@zamenhof.demon.co.uk writes:

> Above attachment contains a set of rules for The Bat! and Mutt.
>
> I've seen a fair number of spams claiming to be from The Bat! Mutt is for
> completeness. It possibly isn't worth doing other MUAs at the moment, since
> most of them don't seem to have any spam filed against them (at least, not
> according to STATISTICS.TXT).

Mutt is never (or almost never) forged (like my mailer, Emacs/VM) so I highly doubt it will be worth running as a rule. I think the risk of Mutt changing their Message-ID format outweighs the potential benefit so I didn't add it to CVS (all I got was FPs anyway).

A lot of the spam claiming to come from The Bat! is actually sent from The Bat! It's frequently used as spamware, even though it is used by non-spammers. However, it does look like a rule might work out well. I found some older versions that use a slightly different Message-ID, though, so I simplified the rule a bit and added it to CVS.

Also, for the date strings starting with "200", I think it's fine to just use \d and a length (or length range) like the other rules. If a spammer manages to mimic that much, then they're going to notice it's a date and mimic that as well.

Dan
Subject: Re: [SAdev] rules to detect forged MUAs

bugzilla-daemon@hughes-family.org writes:

> Your rule for Outlook IMO is too strict (or it would be if the . in
> [a-z.] were quoted :-)

You don't need to quote . inside of a [] set.

> Would /^<[A-P]{28}\.[a-zA-Z_\.]+\@\S+>$/ be better (you can
> definitely have capitals and underscore here)?

I found underscore, but do capitals happen? Do you have non-spam examples you could attach? I'll add capitals for now.

> Also, I note that you're using \S+ as the right-hand side of the message ID.
> Someone else pointed out privately that my \@.{1,50} looked odd (for
> __OUTLOOK_MSGID) - should that be changed to \S+ too?

Changed in CVS. I used \S+ to keep things simple for now. We can change some of them to be more specific if it would significantly raise SPAM% without any FPs.

Dan
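The two regex points in this reply can be checked with a short Python sketch. It assumes Python's re agrees with Perl on character classes and on . versus \S, which should be the case for these constructs. The sample strings and the variable names are invented for illustration, and the left-hand side of the Message-ID pattern is the one from __OUTLOOK_MSGID earlier in this bug.

```python
import re

# 1. Inside [] an unescaped "." is already a literal dot, so quoting it changes nothing.
unescaped = re.compile(r"^[a-z.]+$")
escaped = re.compile(r"^[a-z\.]+$")
samples = ["foo.bar", "foobar", "foo-bar", "fooXbar"]
same_verdicts = all(
    bool(unescaped.search(s)) == bool(escaped.search(s)) for s in samples
)

# 2. Right-hand side of the Message-ID: ".{1,50}" versus "\S+".
LEFT = r"^<[0-9a-f]{12,12}\$[0-9a-f]{8,8}\$[0-9a-f]{8,8}\@"
old_rhs = re.compile(LEFT + r".{1,50}>$")
new_rhs = re.compile(LEFT + r"\S+>$")

# Invented IDs: one with a space in the host part, one with a 60-character host part.
with_space = "<000a01c2a6d3$1b2c3d40$0100a8c0@host name>"
long_host = "<000a01c2a6d3$1b2c3d40$0100a8c0@" + "a" * 60 + ">"

old_accepts_space = old_rhs.search(with_space) is not None
new_accepts_space = new_rhs.search(with_space) is not None
old_accepts_long = old_rhs.search(long_host) is not None
new_accepts_long = new_rhs.search(long_host) is not None

print(same_verdicts)
print(old_accepts_space, new_accepts_space)
print(old_accepts_long, new_accepts_long)
```

So the switch to \S+ is not purely cosmetic: it stops accepting whitespace after the @ and drops the 50-character limit on the host part.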
in CVS