Bug 6637 - FS_REPLICA and FS_REPLICAWATCH too broad
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Rules
Version: 3.3.1
Hardware: HP Linux
Importance: P2 normal
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
Reported: 2011-07-22 19:53 UTC by Tony Grobe
Modified: 2011-07-23 17:47 UTC



Description Tony Grobe 2011-07-22 19:53:01 UTC
Michael Scheidell suggested on the mailing list that FS_REPLICA and FS_REPLICAWATCH are too broad and/or overlap. From 72_active.cf:

header   FS_REPLICA             Subject =~ /replica/i
describe FS_REPLICA             Subject says "replica"

header   FS_REPLICAWATCH        Subject =~ /replica watch/i
describe FS_REPLICAWATCH        Subject says Replica watch

David Skoll suggested Subject =~ /\breplica\b/i as a potential improvement for the first, but I'm unsure what can be done about the overlap. Perhaps a combination of metas would be an improvement? Something like:

header  __FS_REPLICA            Subject =~ /\breplica\b/i
header  __FS_REPLICAWATCH       Subject =~ /replica watch/i
meta    FS_REPLICA              __FS_REPLICA || __FS_REPLICAWATCH
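
To illustrate why the \b anchors matter, here is a minimal sketch in Python rather than in SpamAssassin itself. It assumes Python's re module treats \b and case-insensitive matching the same way Perl does for plain ASCII subjects. The sample subjects are hypothetical and were invented for this illustration; they do not come from any corpus.

```python
import re

# Patterns copied from the rules discussed above; Perl's /i maps to re.IGNORECASE.
UNBOUNDED = re.compile(r"replica", re.IGNORECASE)
BOUNDED = re.compile(r"\breplica\b", re.IGNORECASE)

# Hypothetical subjects, invented for illustration only.
SUBJECTS = [
    "Replica Rolex at low prices",
    "Database replication lag on node 3",
    "Restoring the read replicas",
]


def hits(pattern, subject):
    """Return True if the pattern matches anywhere in the subject."""
    return pattern.search(subject) is not None


for s in SUBJECTS:
    print(f"{s!r}: unbounded={hits(UNBOUNDED, s)} bounded={hits(BOUNDED, s)}")
```

The unbounded pattern matches all three subjects, including "replication" and "replicas", while the bounded pattern matches only the standalone word.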

I'm not sure what legitimate subject would hit both rules, but Mike is right about the effect of both rules firing on one message:

50_scores.cf:score FS_REPLICA 1.630 3.599 2.028 3.599 # n=2
50_scores.cf:score FS_REPLICAWATCH 3.237 1.715 1.733 3.015 # n=2
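
For reference, the combined effect can be worked out per score set. This is a rough sketch, assuming the four columns follow the usual score-set order (0: no Bayes, no network tests; 1: network tests only; 2: Bayes only; 3: both) and that the stock required_score of 5.0 is in use. The helper name combined is made up for this example.

```python
# Scores copied from the 50_scores.cf lines above, one value per score set.
FS_REPLICA = [1.630, 3.599, 2.028, 3.599]
FS_REPLICAWATCH = [3.237, 1.715, 1.733, 3.015]

# Assumption: the default required_score of 5.0 has not been overridden.
REQUIRED_SCORE = 5.0


def combined(a, b):
    """Add two per-score-set score lists element by element."""
    return [round(x + y, 3) for x, y in zip(a, b)]


totals = combined(FS_REPLICA, FS_REPLICAWATCH)
for score_set, total in enumerate(totals):
    reaches = total >= REQUIRED_SCORE
    print(f"score set {score_set}: {total:.3f} (reaches {REQUIRED_SCORE}: {reaches})")
```

Under those assumptions the two rules together contribute 4.867, 5.314, 3.761 and 6.614 points, so in score sets 1 and 3 a subject containing "replica watch" would reach the threshold on these two rules alone.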
Comment 1 John Hardin 2011-07-23 17:26:07 UTC
Odd, there are a lot of single-word-in-subject tests in that file that _are_ properly \b delimited.

I found a couple of other dangerous undelimited tests in addition to FS_REPLICA.

Proposed patch, running local masscheck now:

Index: 00_FVGT_File001.cf
===================================================================
--- 00_FVGT_File001.cf	(revision 1150163)
+++ 00_FVGT_File001.cf	(working copy)
@@ -2400,7 +2400,7 @@
 #counts   FS_CHEAP_CAP             8s/0h of 47019 corpus (37183s/9836h FVGT) 12/23/06
 
 
-header   FS_CAILIS              Subject =~ /cailis/i
+header   FS_CAILIS              Subject =~ /\bcailis\b/i
 describe FS_CAILIS              Subject says cailis
 ##score    FS_CAILIS            10.357
 #counts   FS_CAILIS                13s/0h of 206859 corpus (199363s/7496h FT) 12/13/05
@@ -2694,7 +2694,8 @@
 #counts   FS_REFI                  8s/0h of 47019 corpus (37183s/9836h FVGT) 12/23/06
 
 
-header   FS_REPLICA             Subject =~ /replica/i
+header   __FS_REPLICA           Subject =~ /\breplica\b/i
+meta     FS_REPLICA             __FS_REPLICA && !FS_REPLICAWATCH
 describe FS_REPLICA             Subject says "replica"
 ##score    FS_REPLICA           0.994
 #counts   FS_REPLICA               335s/0h of 70341 corpus (31030s/39311h DOC) 12/13/05
@@ -2704,7 +2705,7 @@
 #counts   FS_REPLICA               92s/0h of 47019 corpus (37183s/9836h FVGT) 12/23/06
 
 
-header   FS_REPLICAWATCH        Subject =~ /replica watch/i
+header   FS_REPLICAWATCH        Subject =~ /replica watch\b/i
 describe FS_REPLICAWATCH        Subject says Replica watch
 ##score    FS_REPLICAWATCH      10.357
 #counts   FS_REPLICAWATCH          110s/0h of 206859 corpus (199363s/7496h FT) 12/13/05
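
The intended effect of the patch, that FS_REPLICA and FS_REPLICAWATCH no longer fire together, can be sketched outside SpamAssassin. This is only an approximation in Python: it assumes ASCII subjects and ignores header decoding and everything else SpamAssassin does before matching. The evaluate function and the sample subjects are hypothetical.

```python
import re

# Patterns as they appear in the patched rules above.
REPLICA = re.compile(r"\breplica\b", re.IGNORECASE)           # __FS_REPLICA
REPLICAWATCH = re.compile(r"replica watch\b", re.IGNORECASE)  # FS_REPLICAWATCH


def evaluate(subject):
    """Mimic the patched rules: FS_REPLICA = __FS_REPLICA && !FS_REPLICAWATCH."""
    watch = REPLICAWATCH.search(subject) is not None
    sub = REPLICA.search(subject) is not None
    return {"FS_REPLICA": sub and not watch, "FS_REPLICAWATCH": watch}


# Hypothetical subjects, invented for illustration only.
for s in [
    "Best replica watch deals",
    "Replica handbags",
    "Replica watches on sale",
    "Replication lag report",
]:
    print(f"{s!r}: {evaluate(s)}")
```

One thing the sketch makes visible: because of the trailing \b, the plural "replica watches" does not match FS_REPLICAWATCH and falls through to FS_REPLICA instead.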
Comment 2 John Hardin 2011-07-23 17:46:12 UTC
Tests pass.

jhardin@dendarii ~/develop/spamassassin/svn/trunk/rulesrc/sandbox/emailed $ svn commit -m 'address bug 6637'
Sending        emailed/00_FVGT_File001.cf
Transmitting file data .
Committed revision 1150178.
Comment 3 John Hardin 2011-07-23 17:47:04 UTC
Fixed for now.