Bug 656 - Testing FROM addresses for spam
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Rules
Version: SVN Trunk (Latest Devel Version)
Hardware: Other other
Importance: P2 enhancement
Target Milestone: ---
Assignee: SpamAssassin Developer Mailing List
Reported: 2002-08-04 15:29 UTC by Marc Perkel
Modified: 2002-12-18 03:33 UTC

Description Marc Perkel 2002-08-04 15:29:46 UTC
I think I stumbled onto a new frontier. I got the idea of checking the From
address because I noticed that the word OFFER(S) often appears in it, and what
a score! The rule got 1149 spam hits and only one false positive, and that one
might actually be a spam that slipped into my non-spam corpus.

Here's the rule - you try it.

header FROM_OFFER		From =~ /offer/i
describe FROM_OFFER		From address contains OFFER

   OVERALL        SPAM     NONSPAM  NAME
     14425        9610        4815  (all messages)
      1150        1149           1  FROM_OFFER
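
For reference, the counts in the table above can be turned into hit rates. This is only a sketch of the arithmetic, not part of SpamAssassin; the function name `hit_rates` is made up for illustration.

```python
def hit_rates(spam_hits, ham_hits, spam_total, ham_total):
    """Return (spam hit rate, non-spam hit rate, fraction of hits that are spam)."""
    spam_rate = spam_hits / spam_total
    ham_rate = ham_hits / ham_total
    spam_fraction = spam_hits / (spam_hits + ham_hits)
    return spam_rate, ham_rate, spam_fraction


# Counts copied from the FROM_OFFER table: 1149 of 9610 spam, 1 of 4815 non-spam.
spam_rate, ham_rate, spam_fraction = hit_rates(1149, 1, 9610, 4815)
print(f"spam hit rate:     {spam_rate:.2%}")
print(f"non-spam hit rate: {ham_rate:.3%}")
print(f"hits that are spam: {spam_fraction:.2%}")
```

The rule fires on roughly one spam message in eight in this corpus, and a single non-spam hit is too small a sample to say much about the false-positive rate elsewhere.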
Comment 1 Daniel Quinlan 2002-08-04 17:38:25 UTC
I'm not so sure about this.  I'm sure the next one to be suggested will be
"free".  Great, except when you're working with tons of people from
freestandards.org or something like that.  This type of rule seems like it
could cause a lot of false matches.

fs = spam froms
fg = nonspam froms

$ egrep -ic offer /tmp/{fs,fg}
/tmp/fs:29
/tmp/fg:0
$ egrep -ic free /tmp/{fs,fg}
/tmp/fs:58
/tmp/fg:269
$ egrep -ic freestandards /tmp/{fs,fg}
/tmp/fs:2
/tmp/fg:258

In other words, I think there's a reason we haven't done this yet.
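
The same kind of check as the `egrep -ic` runs above can be sketched in Python. The sample From lines below are invented for illustration and are not taken from the /tmp/fs or /tmp/fg files; `count_matching_lines` is a hypothetical helper name.

```python
import re


def count_matching_lines(pattern, lines):
    """Count lines containing a case-insensitive match, like `egrep -ic`."""
    regex = re.compile(pattern, re.IGNORECASE)
    return sum(1 for line in lines if regex.search(line))


# Made-up sample data (assumption), standing in for spam and non-spam Froms.
spam_froms = [
    "From: Free Offers <freeoffers@example.com>",
    "From: deals@example.net",
]
nonspam_froms = [
    "From: alice@freestandards.org",
    "From: bob@example.org",
]

for word in ("offer", "free", "freestandards"):
    print(word,
          count_matching_lines(word, spam_froms),
          count_matching_lines(word, nonspam_froms))
```

A plain substring test for "free" hits the freestandards.org address just as readily as the spam one, which is the false-match risk described above.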
Comment 2 Marc Perkel 2002-08-04 17:52:25 UTC
Subject: Re:  Testing FROM addresses for spam

I am working on one with "free", and I think it will work by using
capitalization tricks.

Comment 3 Theo Van Dinter 2002-08-04 18:09:39 UTC
Subject: Re: [SAdev]  Testing FROM addresses for spam

On Sun, Aug 04, 2002 at 05:52:25PM -0700, bugzilla-daemon@hughes-family.org wrote:
> I am working on one with "free", and I think it will work by using
> capitalization tricks.

I haven't checked, but don't most of the free ones start at the beginning
of the address, and the offer one is probably near the end?  So would:

/^free.{5,}\@/
/^.{5,}offer\@/

do the job?  Just curious.
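
A quick way to try the two anchored patterns is a small Python sketch like the one below. It assumes the patterns are applied to the bare address rather than the full From header, and the sample addresses and the names `PATTERNS` and `matching_patterns` are invented for illustration. The patterns are kept case-sensitive, as written above.

```python
import re

# The two patterns suggested above, translated to Python syntax.
PATTERNS = {
    "free_prefix": re.compile(r"^free.{5,}@"),
    "offer_suffix": re.compile(r"^.{5,}offer@"),
}


def matching_patterns(address):
    """Return the names of the patterns that match a bare email address."""
    return [name for name, regex in PATTERNS.items() if regex.search(address)]


# Hypothetical addresses, not drawn from any real corpus.
for address in (
    "freestuff4you@example.com",
    "bestdealsoffer@example.com",
    "alice@freestandards.org",
    "offer@example.com",
):
    print(address, matching_patterns(address))
```

With anchoring, an address at freestandards.org no longer matches merely because the domain contains "free", though whether real spam addresses follow these shapes would still need checking against a corpus.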

Comment 4 Michael Moncur 2002-08-22 04:27:35 UTC
See also bug #663 and bug #662.
Comment 5 Duncan Findlay 2002-12-18 12:33:31 UTC
Added.