Bug 1934

Summary: Catch repeated obfuscating comments
Product: Spamassassin
Reporter: chris <chris>
Component: Rules
Assignee: SpamAssassin Developer Mailing List <dev>
Status: NEW ---    
Severity: enhancement    
Priority: P5    
Version: 2.54   
Target Milestone: Future   
Hardware: Other   
OS: other   
Whiteboard: needs code

Description chris 2003-05-20 12:42:44 UTC
# Notes: 6 is an arbitrary number.  Fewer may work better.
# Also, if there are false positives, we might want to pair this with
# OBFUSCATING_COMMENT in a meta test, but I don't think it's needed.

rawbody		RepeatedComment		/<\!([^>]*)>.+<\!\1>.+<\!\1>.+<\!\1>.+<\!\1>.+<\!\1>/is
score		RepeatedComment 	2
describe	RepeatedComment 	Same HTML comment was repeated 6+ times
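
To try the pattern outside SpamAssassin, here is a minimal Python sketch of the same regex. It is only an illustration: the function name and the sample strings are hypothetical, and Perl's /is modifiers are mapped to re.IGNORECASE | re.DOTALL.

```python
import re

# Python translation of the proposed rule: one captured <!...> comment,
# followed five more times by ".+" and the same comment (six in total).
REPEATED_COMMENT = re.compile(
    r"<!([^>]*)>" + r".+<!\1>" * 5,
    re.IGNORECASE | re.DOTALL,
)

def has_repeated_comment(raw_body: str) -> bool:
    """Return True if the same <!...> comment is matched six times,
    with at least one character between the matched occurrences."""
    return REPEATED_COMMENT.search(raw_body) is not None

# Hypothetical samples, not taken from a real corpus.
spam = "V<!xyz>i<!xyz>a<!xyz>g<!xyz>r<!xyz>a<!xyz>!"
ham = "<!-- header --><p>Hello</p><!-- footer -->"

if __name__ == "__main__":
    print(has_repeated_comment(spam))
    print(has_repeated_comment(ham))
```

Lowering the repeat count from 5 makes the test stricter on fewer repetitions, which is the "fewer may work better" knob from the notes.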

Comment 1 Daniel Quinlan 2003-05-20 16:38:06 UTC
Subject: Re: [SAdev]  New: Catch repeated obfuscating comments

> Notes: 6 is an arbitrary number.  Less may work better.  Also, if false
> positives, we might want to pair this with OBFUSCATING_COMMENT in a
> meta test, but I don't think it's needed

Very interesting idea!  We should also try adding some Perl code to
HTML.pm to catch repeated comments.  Try both repeated in sequence and
repeated throughout the entire message.  It will probably be cheaper and
perhaps more accurate than using a backtracking rawbody test.

Daniel
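
A rough sketch of the counting approach, written in Python for illustration rather than as actual HTML.pm code. It assumes a comment is anything of the form <!...>, and it reads "repeated in sequence" as the same comment appearing several times in a row with no different comment in between. The function name and return shape are hypothetical.

```python
import re
from collections import Counter
from itertools import groupby
from typing import Tuple

# Assumption: a "comment" is anything of the form <!...>, as in the rawbody rule.
COMMENT = re.compile(r"<!([^>]*)>")

def comment_repeat_stats(html: str) -> Tuple[int, int]:
    """Return (max_total, max_run).

    max_total: highest count of any single comment body in the whole message.
    max_run:   longest streak of the same body among consecutive comments,
               ignoring whatever text sits between them.
    """
    # Lowercase the bodies so the comparison is case-insensitive like /i.
    bodies = [m.group(1).lower() for m in COMMENT.finditer(html)]
    if not bodies:
        return 0, 0
    max_total = max(Counter(bodies).values())
    max_run = max(len(list(run)) for _, run in groupby(bodies))
    return max_total, max_run
```

This makes a single linear pass over the message and involves no backtracking, which is where the expected cost saving over the rawbody regex would come from. A rule could then compare either number against a threshold such as 6.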

Comment 2 chris 2003-05-20 17:00:27 UTC
I suppose we should remove the !s from my test to match the new-style
spam I have seen, which formats comments <like this>.

Also, I like the idea of coding this test in the Perl code.

Other ideas are to count comments that consist of one long word, or comments
with random non-letter characters in them.
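
As a starting point for those two ideas, here is a hedged Python sketch. The thresholds (MIN_WORD_LEN, MIN_NONLETTER_RATIO) and all function names are hypothetical and would need tuning against a real corpus. "One long word" is taken to mean an unbroken run of letters or digits.

```python
import re

# Assumption: a "comment" is anything of the form <!...>.
COMMENT = re.compile(r"<!([^>]*)>")

# Hypothetical thresholds, not taken from the bug report.
MIN_WORD_LEN = 12
MIN_NONLETTER_RATIO = 0.5

def _core(body: str) -> str:
    """Drop the usual "--" delimiters and surrounding whitespace."""
    return body.strip("- \t\r\n")

def is_one_long_word(body: str) -> bool:
    """True if the comment is a single unbroken run of letters/digits."""
    core = _core(body)
    return len(core) >= MIN_WORD_LEN and core.isalnum()

def is_mostly_nonletters(body: str) -> bool:
    """True if at least half of the comment's characters are non-letters."""
    core = _core(body)
    if not core:
        return False
    nonletters = sum(1 for ch in core if not ch.isalpha() and not ch.isspace())
    return nonletters / len(core) >= MIN_NONLETTER_RATIO

def count_suspicious_comments(html: str) -> int:
    """Count comments that trip either heuristic."""
    return sum(
        1
        for m in COMMENT.finditer(html)
        if is_one_long_word(m.group(1)) or is_mostly_nonletters(m.group(1))
    )
```

For example, a message containing <!--zxcvbnmasdfghjkl--> and <!--#$%^&*--> alongside an ordinary <!-- footer --> would count two suspicious comments under these thresholds.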

Comment 3 Daniel Quinlan 2005-03-30 01:08:26 UTC
move bug to Future milestone (previously set to Future -- I hope)