Bug 5278 - __UNUSABLE_MSGID is hiding FPs; let's remove it
Summary: __UNUSABLE_MSGID is hiding FPs; let's remove it
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Rules (Eval Tests)
Version: SVN Trunk (Latest Devel Version)
Hardware: Other other
Importance: P5 major
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-01-05 10:50 UTC by Justin Mason
Modified: 2007-01-11 05:51 UTC

Description Justin Mason 2007-01-05 10:50:45 UTC
quoting from bug 4960:

'[a mail] no longer FPs (in 3.1.7 with sa-update) , but
only because it triggers __UNUSABLE_MSGID which prevents MSGID_DOLLARS
triggering.'

This is a very good point -- __UNUSABLE_MSGID calls
HeaderEval::check_messageid_not_usable(), which contains this code:

  # too old; older versions of clients used different formats
  return 1 if ($self->received_within_months($pms, '6','undef'));


In other words, no message over 6 months old will ever show a MSGID_* rule hit,
because this rule blocks it.  I think it made sense at the time we put it in,
but since then, Message-IDs in ham have been pretty sane.  Meanwhile, the check
is now hiding false positives in the ham corpora, which are generally older than
the spam corpora (e.g. my spam goes back 5 months or so, but my ham collection
is up to 2 years old).
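
To make the effect concrete, here is a minimal Python sketch of how an age gate like this can hide hits on older mail. It is only a toy model under stated assumptions: the function names, the 180-day approximation of "6 months", and the example dates are all hypothetical, and none of it is SpamAssassin's actual implementation.

```python
from datetime import datetime, timedelta

# Hypothetical stand-in for the age check; NOT SpamAssassin's code.
# "6 months" is approximated as 180 days purely for illustration.
SIX_MONTHS = timedelta(days=6 * 30)


def msgid_unusable(received: datetime, now: datetime) -> bool:
    """Return True when the message is older than roughly six months."""
    return now - received > SIX_MONTHS


def msgid_rule_hits(pattern_matches: bool, received: datetime,
                    now: datetime, age_gate: bool = True) -> bool:
    """A MSGID-style rule fires only if its pattern matches and, while the
    age gate is enabled, the message is not flagged as 'unusable'."""
    if age_gate and msgid_unusable(received, now):
        return False
    return pattern_matches


# Made-up dates: an old ham message and a fairly recent spam message.
now = datetime(2007, 1, 5)
old_ham = datetime(2005, 3, 1)
recent_spam = datetime(2006, 10, 1)

print(msgid_rule_hits(True, old_ham, now))                  # gate on
print(msgid_rule_hits(True, old_ham, now, age_gate=False))  # gate off
print(msgid_rule_hits(True, recent_spam, now))              # recent mail
```

With the gate on, the old ham message never registers a hit even though the pattern matches, so a mass-check over an old ham corpus would under-report FPs; recent spam is unaffected either way.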

I propose removing those 2 lines ASAP so we can get more accurate ideas of FP
rates on the rules that use it:

FORGED_MUA_MOZILLA
FORGED_MUA_IMS
__FORGED_OE
__FORGED_OUTLOOK_DOLLARS
FORGED_MUA_OIMO
FORGED_MUA_EUDORA
TVD_FW_GRAPHIC_ID3
TVD_FW_GRAPHIC_ID3_2
Comment 1 Justin Mason 2007-01-11 05:51:54 UTC
OK, here are the results for the rules that use it:

http://ruleqa.spamassassin.org/?daterev=20070110-r494768-n&rule=%2F%28FORGED%7CTVD_FW_GRAPHIC%7CUNUSABLE%29&srcpath=&g=Change

  MSECS    SPAM%     HAM%     S/O    RANK   SCORE  NAME
0.00000   5.2828   0.0281   0.995    0.92    0.00  FORGED_MUA_OUTLOOK5278
0.00000   5.2386   0.0179   0.997    0.93    3.36  FORGED_MUA_OUTLOOK

definitely some hidden FPs there worth knowing about, spam% goes up a little.
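
For anyone reading these rows, a small Python sketch of the arithmetic may help. It assumes the second and third columns are spam% and ham% and the fourth is S/O, computed as spam% / (spam% + ham%); that formula is an assumption here, although it does reproduce the values shown. The helper name `s_o` is hypothetical.

```python
def s_o(spam_pct: float, ham_pct: float) -> float:
    """Assumed S/O definition: share of a rule's hits that land on spam."""
    total = spam_pct + ham_pct
    return spam_pct / total if total else 0.0


# (spam%, ham%) pairs copied from the two FORGED_MUA_OUTLOOK rows above.
rows = {
    "FORGED_MUA_OUTLOOK5278": (5.2828, 0.0281),  # without the 6-month gate
    "FORGED_MUA_OUTLOOK": (5.2386, 0.0179),      # with the 6-month gate
}

for name, (spam, ham) in rows.items():
    print(f"{name}: S/O = {s_o(spam, ham):.3f}")

# Ham hits that only become visible once the gate is removed,
# in percentage points of the ham corpus.
ham_delta = rows["FORGED_MUA_OUTLOOK5278"][1] - rows["FORGED_MUA_OUTLOOK"][1]
print(f"newly visible ham%: {ham_delta:.4f}")
```

The difference in ham% between the paired rows is the quantity of interest: it is the share of ham that was matching the rule but being masked by the age check.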

0.00000   4.7496   0.0281   0.994    0.91   (n/a)  __FORGED_OE5278   
0.00000   4.7079   0.0179   0.996    0.92   (n/a)  __FORGED_OE

ditto (probably the same msgs)

0.00000   0.5332   0.0000   1.000    0.81   (n/a)  __FORGED_OUTLOOK_DOLLARS5278   
0.00000   0.5307   0.0000   1.000    0.80   (n/a)  __FORGED_OUTLOOK_DOLLARS

0.00000   0.3351   0.0000   1.000    0.75    0.00  FORGED_MUA_EUDORA5278   
0.00000   0.3318   0.0000   1.000    0.75    2.44  FORGED_MUA_EUDORA

0.00000   0.3032   0.0000   1.000    0.73    0.00  FORGED_MUA_OIMO5278   
0.00000   0.3010   0.0000   1.000    0.73    1.21  FORGED_MUA_OIMO

all just good news.

0.00000   0.2597   0.0013   0.995    0.71    0.00  FORGED_MUA_IMS5278   
0.00000   0.2568   0.0000   1.000    0.71    2.48  FORGED_MUA_IMS

A hidden FP; worth tracking.

0.00000   0.1856   0.0281   0.868    0.64    0.00  T_TVD_FW_GRAPHIC_ID3_2_5278   
0.00000   0.1833   0.0013   0.993    0.67    1.00  TVD_FW_GRAPHIC_ID3_2

0.00000   0.1850   0.0281   0.868    0.64    0.00  T_TVD_FW_GRAPHIC_ID3_5278   
0.00000   0.1827   0.0013   0.993    0.67    1.00  TVD_FW_GRAPHIC_ID3

plenty of hidden FPs (again, probably the same messages), worth knowing about!

0.00000   0.0594  19.0556   0.003    0.00   (n/a)  __UNUSABLE_MSGID5278   
0.00000   1.9085  30.7667   0.058    0.33   (n/a)  __UNUSABLE_MSGID

looks about right... there's still a lot of ham we ignore though - probably
ezmlm lists or "gated_through_received_hdr_remover" hits.  At some point we 
should investigate those too...

Anyway, in the meantime, I've applied this.

: jm 1061...; svn commit -m "bug 5278: remove 6-month limit imposed via the
__UNUSABLE_MSGID rule on FORGED_MUA_* rules which use Message-ID header"
rulesrc/sandbox/jm/20_basic.cf  lib/Mail/SpamAssassin/Plugin
Sending        rulesrc/sandbox/jm/20_basic.cf
Sending        lib/Mail/SpamAssassin/Plugin/HeaderEval.pm
Transmitting file data ..
Committed revision 495220.