SA Bugzilla – Bug 656
Testing FROM addresses for spam
Last modified: 2002-12-18 03:33:31 UTC
I think I stumbled onto a new frontier. I got the idea of checking the FROM address because I noticed that the word OFFER(S) was often used in the address and what a score! Got 1149 hits and only one FP and that one might be a spam in my non-spam corpus. Here's the rule - you try it.

    header   FROM_OFFER  From =~ /offer/i
    describe FROM_OFFER  From address contains OFFER

    OVERALL   SPAM  NONSPAM  NAME
      14425   9610     4815  (all messages)
       1150   1149        1  FROM_OFFER
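To put the reported counts in proportion, here is a minimal sketch that turns them into hit rates. The function name `hit_summary` is hypothetical, and the numbers are copied from the table above rather than recomputed from any corpus.

```python
def hit_summary(spam_hits, spam_total, ham_hits, ham_total):
    """Return (spam hit rate, nonspam hit rate, share of all hits that are spam)."""
    spam_rate = spam_hits / spam_total
    ham_rate = ham_hits / ham_total
    spam_share = spam_hits / (spam_hits + ham_hits)
    return spam_rate, ham_rate, spam_share


# Counts taken from the FROM_OFFER and "(all messages)" rows reported above.
spam_rate, ham_rate, spam_share = hit_summary(1149, 9610, 1, 4815)

print(f"spam messages hit:      {spam_rate:.2%}")
print(f"nonspam messages hit:   {ham_rate:.3%}")
print(f"hits that were spam:    {spam_share:.2%}")
```

These figures describe one corpus only; a different mail mix could shift them considerably.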
I'm not so sure about this. I'm sure the next one to be suggested will be "free". Great, except when you're working with tons of people from freestandards.org or something like that. This type of rule seems like it could cause a lot of false matches.

    fs = spam froms
    fg = nonspam froms

    $ egrep -ic offer /tmp/{fs,fg}
    /tmp/fs:29
    /tmp/fg:0
    $ egrep -ic free /tmp/{fs,fg}
    /tmp/fs:58
    /tmp/fg:269
    $ egrep -ic freestandards /tmp/{fs,fg}
    /tmp/fs:2
    /tmp/fg:258

In other words, I think there's a reason we haven't done this yet.
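The same kind of check can be sketched in Python for anyone who wants to try other words against their own address lists. This is only an illustration of what `egrep -ic` counts (lines with a case-insensitive match); the helper `count_matching` and the sample addresses are made up, not taken from the corpora above.

```python
import re


def count_matching(pattern, lines):
    """Count lines containing a case-insensitive match, like `egrep -ic`."""
    rx = re.compile(pattern, re.IGNORECASE)
    return sum(1 for line in lines if rx.search(line))


# Hypothetical stand-ins for /tmp/fs (spam froms) and /tmp/fg (nonspam froms).
spam_froms = [
    "freeoffers@bulk.example",
    "deals@offer-central.example",
    "win@prizes.example",
]
nonspam_froms = [
    "alice@freestandards.org",
    "bob@freestandards.org",
    "carol@example.org",
]

for word in ("offer", "free", "freestandards"):
    fs = count_matching(word, spam_froms)
    fg = count_matching(word, nonspam_froms)
    print(f"{word}: fs={fs} fg={fg}")
```

Even on this toy data, a bare substring such as "free" hits legitimate senders whose domain happens to contain it.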
Subject: Re: Testing FROM addresses for spam

I am working on one with "free", but by using capitalization tricks I think it will work.

bugzilla-daemon@hughes-family.org wrote:
> http://www.hughes-family.org/bugzilla/show_bug.cgi?id=656
>
> ------- Additional Comments From quinlan@pathname.com 2002-08-04 17:38 -------
Subject: Re: [SAdev] Testing FROM addresses for spam

On Sun, Aug 04, 2002 at 05:52:25PM -0700, bugzilla-daemon@hughes-family.org wrote:
> I am working on one with "free" but by using capitalization tricks I think
> it will work.

I haven't checked, but don't most of the free ones start at the beginning of the address, and the offer one is probably near the end? So would:

    /^free.{5,}\@/
    /^.{5,}offer\@/

do the job? Just curious.
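A quick way to see how the two anchored patterns suggested above behave is to run them over a few addresses. The sketch below assumes each input is a bare address (a real From header may also carry a display name, which would change what `^` anchors to), keeps the patterns case-sensitive as written, and uses invented sample addresses; `classify` is a hypothetical helper.

```python
import re

# The two proposed patterns; Perl's "\@" is just a literal "@" here.
FREE_AT_START = re.compile(r"^free.{5,}@")
OFFER_AT_END = re.compile(r"^.{5,}offer@")


def classify(address):
    """Return the names of the proposed patterns that match a bare address."""
    hits = []
    if FREE_AT_START.search(address):
        hits.append("free-at-start")
    if OFFER_AT_END.search(address):
        hits.append("offer-at-end")
    return hits


# Hypothetical addresses chosen to exercise both patterns.
samples = [
    "freestuff123@bulk.example",
    "specialoffer@bulk.example",
    "alice@freestandards.org",
    "free@freestandards.org",
    "offer@shop.example",
]

for address in samples:
    print(address, classify(address))
```

Anchoring to the local part avoids the freestandards.org domain matches, though it also misses short local parts such as a plain "offer@".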
See also bug #663 and bug #662.
Added.