Bug 3250 - MARKETING_PARTNERS false positive
Status: RESOLVED WORKSFORME
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Rules
Version: 2.63
Hardware: Other other
Importance: P3 minor
Target Milestone: 3.1.0
Assignee: Daniel Quinlan
 
Reported: 2004-04-07 13:50 UTC by Ben Hutchings
Modified: 2005-04-09 17:38 UTC

Attachment Type Modified Status Actions Submitter/CLA Status
Confirmation message from Easyjet that matches this rule text/plain None Ben Hutchings [NoCLA]

Description Ben Hutchings 2004-04-07 13:50:01 UTC
The regex for the MARKETING_PARTNERS rule is:
/\b(?:marketing|network) partner|\bpartner (?:web)?site/i

This matched a confirmation message for some plane tickets I just bought and
together with the HTML tests pushed it over the limit. It's fairly common for
airlines to refer to partner web sites (for hotels, car rental, etc) and this is
liable to match a lot of such confirmations. This could be disastrous for
someone who failed to note down the reference number for a ticketless booking.
Either the regex should be changed to require a longer phrase or the default
weights for this rule should be reduced from the current values (up to 3.5).
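
The match can be reproduced outside SpamAssassin. The following is a minimal sketch in Python, assuming Python's `re` module handles `\b`, `(?:...)` and case-insensitive matching the same way Perl does for this pattern. The sample sentences are invented for illustration and are not taken from the Easyjet message.

```python
import re

# The MARKETING_PARTNERS pattern quoted above; re.IGNORECASE stands in for Perl's /i.
MARKETING_PARTNERS = re.compile(
    r"\b(?:marketing|network) partner|\bpartner (?:web)?site",
    re.IGNORECASE,
)

# Hypothetical sample lines, not taken from the attached message.
samples = [
    "Book your hotel through our partner website.",
    "Visit our partner site for car rental.",
    "We share data with our marketing partners.",
    "Your booking reference is ABC123.",
]

for line in samples:
    hit = MARKETING_PARTNERS.search(line)
    print(f"{'HIT ' if hit else 'miss'}  {line}")
```

The first three lines hit and the last does not, which shows how little text is needed to trigger the rule.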
Comment 1 Daniel Quinlan 2004-07-27 00:23:48 UTC
The weight for this rule is going to only be about 0.75 to 1.9.  Seems
reasonable.

If you attach an example message, maybe we could attempt to fix the false
match.  Otherwise, I suggest we close this as WONTFIX.
Comment 2 Daniel Quinlan 2004-07-27 00:24:09 UTC
not even close to being a major problem
Comment 3 Ben Hutchings 2004-07-27 02:23:43 UTC
Created attachment 2180
Confirmation message from Easyjet that matches this rule

I have replaced some of the personal information in the mail with the text
"[deleted]" and hope that this doesn't affect the result.
Comment 4 Daniel Quinlan 2004-08-27 16:59:19 UTC
moving accuracy and some bugs to 3.1.0 milestone
Comment 5 Justin Mason 2005-01-26 00:19:13 UTC
wow, I was considering whitelisting Easyjet -- but they don't even have reverse DNS
in that message!
Comment 6 Justin Mason 2005-03-11 15:26:13 UTC
ok, here's a potential replacement with a negative lookahead to block the
Easyjet FP:

body T_MARKETING_PARTNERS       /\b(?:marketing|network) partner|\bpartner (?:web)?site\b(?! for more information)/i
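
The effect of the negative lookahead can be checked with a short Python sketch. It assumes the pattern is meant to be read as a single line, with a space where it wraps between `\bpartner` and `(?:web)?site`, and that Python's `re` behaves like Perl for these constructs. The test phrases and the helper name `matches` are hypothetical.

```python
import re

# Comment 6's proposed pattern joined onto one line; re.IGNORECASE stands in for /i.
T_MARKETING_PARTNERS = re.compile(
    r"\b(?:marketing|network) partner"
    r"|\bpartner (?:web)?site\b(?! for more information)",
    re.IGNORECASE,
)

def matches(text: str) -> bool:
    """Return True if the proposed rule would fire on this text."""
    return T_MARKETING_PARTNERS.search(text) is not None

# Hypothetical phrases, not taken from the attached message.
print(matches("See our partner site for more information."))   # False: blocked by the lookahead
print(matches("See our partner site for great deals."))        # True
print(matches("Become a network partner today."))              # True
print(matches("See our partner sites for more information."))  # False: \b rejects the plural
```

The added `\b` after `site` has a side effect: plural forms such as "partner sites" matched the original pattern but no longer match this one, whether or not the lookahead phrase follows.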


NEEDSMC
Comment 7 Auto-Mass-Checker 2005-03-13 15:50:08 UTC
# [automatically generated by automc: start]
# DONEMC 6: completed request from comment 6

  0.145   0.1810   0.0029    0.984   0.59    0.01  T_MC_MARKETING_PARTNERS_b3250_c6

above freqs using data from "/home/automc/corpus/html/DETAILS.new" as of Sun Mar 13 15:50:05 2005:

T_MC_MARKETING_PARTNERS_b3250_c6 = T_MARKETING_PARTNERS from bug 3250 comment 6
full freqs: http://bugzilla.spamassassin.org/ruleqa?rule=T_MC_MARKETING_PARTNERS_b3250_c6&date=20050313
# ham results used: ham-cthielen.log ham-daf.log ham-quinlan.log ham-rODbegbie.log ham-theo.log
# spam results used: spam-cthielen.log spam-daf.log spam-quinlan.log spam-rODbegbie.log spam-theo.log
 346646   276788    69858    0.798   0.00    0.00  (all messages)
100.000  79.8475  20.1525    0.798   0.00    0.00  (all messages as %)
# [automatically generated by automc: end]
Comment 8 Justin Mason 2005-03-13 23:18:28 UTC
hmm, that's not too helpful:

0.235   0.2926   0.0086    0.971   0.63    2.02  MARKETING_PARTNERS
0.145   0.1810   0.0029    0.984   0.59    0.01  T_MC_MARKETING_PARTNERS_b3250_c6
Comment 9 Daniel Quinlan 2005-04-08 02:12:34 UTC
I checked in some test rules.
Comment 10 Daniel Quinlan 2005-04-10 01:38:21 UTC
splitting the rule has no effect on efficacy, closing as WORKSFORME

OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
 279798   212073    67725    0.758   0.00    0.00  (all messages)
100.000  75.7950  24.2050    0.758   0.00    0.00  (all messages as %)
  0.201   0.2626   0.0059    0.978   0.58    2.02  MARKETING_PARTNERS
  0.128   0.1674   0.0030    0.983   0.54    0.01  T_MARKETING_PARTNERS
  0.073   0.0953   0.0030    0.970   0.50    0.01  T_PARTNER_SITE
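
The S/O column can be cross-checked against the SPAM% and HAM% columns. This sketch assumes S/O is computed as SPAM% / (SPAM% + HAM%), which agrees with the reported figures to within rounding. The function name `so_ratio` is hypothetical.

```python
def so_ratio(spam_pct: float, ham_pct: float) -> float:
    """Share of a rule's hits that are spam, from the SPAM% and HAM% columns."""
    return spam_pct / (spam_pct + ham_pct)

# (name, SPAM%, HAM%, reported S/O) copied from the table above.
rows = [
    ("MARKETING_PARTNERS", 0.2626, 0.0059, 0.978),
    ("T_MARKETING_PARTNERS", 0.1674, 0.0030, 0.983),
    ("T_PARTNER_SITE", 0.0953, 0.0030, 0.970),
]

for name, spam_pct, ham_pct, reported in rows:
    computed = so_ratio(spam_pct, ham_pct)
    print(f"{name:22s} computed={computed:.3f} reported={reported:.3f}")
```

The computed values differ from the reported ones by at most 0.001, presumably because the published percentages are themselves rounded.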