Bug 5822 - try URIBL_XBL
Summary: try URIBL_XBL
Status: RESOLVED WORKSFORME
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Rules
Version: 3.2.4
Hardware: Other other
Importance: P5 enhancement
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-02-11 07:43 UTC by Justin Mason
Modified: 2008-04-15 15:49 UTC

Description Justin Mason 2008-02-11 07:43:34 UTC
Basically, test URIs against the Spamhaus XBL, similar to URIBL_SBL.
I'm sure we did this before, but I can't seem to find any
measured results! So let's try it again.
Comment 1 Justin Mason 2008-02-11 07:44:56 UTC
: jm 10...; svn commit -m "bug 5822: try out URIBL_XBL (again)"
rulesrc/sandbox/jm/20_basic.cf
Sending        rulesrc/sandbox/jm/20_basic.cf
Transmitting file data .
Committed revision 620506.
Comment 2 Justin Mason 2008-03-16 05:59:14 UTC
results from last week's network run:

http://ruleqa.spamassassin.org/20080308-r634908-n/URIBL_XBL/detail

  MSECS    SPAM%     HAM%     S/O    RANK   SCORE  NAME      WHO/AGE
0.00000   0.0738   0.0056   0.930    0.70    0.00  URIBL_XBL
0.00000   2.2179   0.0261   0.988    0.85    0.00  URIBL_XBL net-bb-jm
0.00000   0.0000   0.0000   0.500    0.40    0.00  URIBL_XBL net-bb-zmi
0.00000   0.0000   0.0296   0.000    0.51    0.00  URIBL_XBL net-cthielen
0.00000   0.0000   0.0000   0.500    0.49    0.00  URIBL_XBL net-dos
0.00000   0.0000   0.0000   0.500    0.47    0.00  URIBL_XBL net-theo
0.00000   6.2315   0.0162   0.997    0.63    0.00  URIBL_XBL net-zmi

So about 93% accurate overall (S/O 0.930), hitting 0.0738% of spam. I think something must be up there, though: four of the six submitters record no spam hits at all. Ignore the submitters apart from bb-jm and zmi, and you get a more realistic 98.8% / 99.7% accuracy, hitting a low 2.2% / 6.2% of spam.
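
For readers unfamiliar with the result rows: the "accuracy" quoted here appears to be the S/O ratio in the fourth numeric column, which is consistent with spam% / (spam% + ham%) computed from the second and third columns. The sketch below assumes that column meaning and that an all-zero row is reported as 0.500; the function name `s_o_ratio` is made up for illustration and is not part of the ruleqa tooling. Because the published percentages are already rounded, the recomputed values can differ from the table in the last digit.

```python
def s_o_ratio(spam_pct: float, ham_pct: float) -> float:
    """Share of a rule's hits that fall on spam (assumed definition of S/O)."""
    total = spam_pct + ham_pct
    if total == 0:
        # Assumption: a rule with no hits at all is shown as 0.500.
        return 0.5
    return spam_pct / total


# (spam%, ham%) pairs copied from the result rows above.
rows = {
    "overall": (0.0738, 0.0056),
    "net-bb-jm": (2.2179, 0.0261),
    "net-zmi": (6.2315, 0.0162),
}

for who, (spam_pct, ham_pct) in rows.items():
    print(f"{who:10s} S/O = {s_o_ratio(spam_pct, ham_pct):.3f}")
```

Under this reading, a submitter with ham hits but no spam hits (net-cthielen) gets 0.000, which is why a single corpus can drag the overall figure down.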

I'll try removing #reuse, to see if that improves figures:

: jm 574...; svn commit -m "remove #reuse from URIBL_XBL and see if it improves measurement of its accuracy" rulesrc/sandbox/jm/20_basic.cf
Sending        rulesrc/sandbox/jm/20_basic.cf
Transmitting file data .
Committed revision 637582.
Comment 3 Justin Mason 2008-04-15 15:49:25 UTC
some results:

  MSECS    SPAM%     HAM%     S/O    RANK   SCORE  NAME      WHO/AGE
0.00000   0.7372   0.0433   0.945    0.83    0.00  URIBL_XBL
0.00000   0.3272   0.0000   1.000    0.81    0.00  URIBL_XBL net-dos
0.00000   0.0000   0.0160   0.000    0.48    0.00  URIBL_XBL net-jm
0.00000   1.8209   0.1489   0.924    0.81    0.00  URIBL_XBL net-theo
0.00000   0.0895   0.0000   1.000    0.52    0.00  URIBL_XBL net-zmi

Pretty wildly variable! 94.5% accurate overall (S/O 0.945), with a pretty low hit-rate of only 0.737% of spam.
I don't think it works out all that well.
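
To put a number on "wildly variable", the per-submitter rows from this comment can be parsed and compared. This is only a sketch: the field positions (spam% second, ham% third, S/O fourth, submitter last) are an assumption about the row layout, and `Row` and `parse` are hypothetical helpers, not part of any SpamAssassin tool.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    spam_pct: float
    ham_pct: float
    s_o: float
    who: str


# Per-submitter rows copied from the results above (overall row omitted).
RESULTS = """\
0.00000   0.3272   0.0000   1.000    0.81    0.00  URIBL_XBL net-dos
0.00000   0.0000   0.0160   0.000    0.48    0.00  URIBL_XBL net-jm
0.00000   1.8209   0.1489   0.924    0.81    0.00  URIBL_XBL net-theo
0.00000   0.0895   0.0000   1.000    0.52    0.00  URIBL_XBL net-zmi
"""


def parse(text: str) -> List[Row]:
    """Parse whitespace-separated result rows; skip lines without a submitter."""
    parsed = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue
        parsed.append(Row(float(fields[1]), float(fields[2]),
                          float(fields[3]), fields[7]))
    return parsed


rows = parse(RESULTS)
spam_hits = [r.spam_pct for r in rows]
s_o_values = [r.s_o for r in rows]

print(f"spam hit-rate range: {min(spam_hits):.4f}% to {max(spam_hits):.4f}%")
print(f"S/O range:           {min(s_o_values):.3f} to {max(s_o_values):.3f}")
```

The S/O values span the whole possible range across just four corpora, which supports reading the overall figure with caution.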