Bug 5285 - remove NJABL DUL rule in favour of PBL
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Rules
Version: SVN Trunk (Latest Devel Version)
Hardware: Other other
Importance: P4 minor
Target Milestone: 3.2.0
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard: pre-perceptron
Keywords:
Depends on:
Blocks: 5270
Reported: 2007-01-09 02:46 UTC by Justin Mason
Modified: 2007-02-08 06:28 UTC



Description Justin Mason 2007-01-09 02:46:44 UTC
Apparently the PBL duplicates the NJABL DUL zone, and both maintainers
intend it to take over (after a short handover period).

Accordingly, we'll need to remove RCVD_IN_NJABL_DUL before the perceptron run.
Comment 1 Michael Parker 2007-01-09 05:51:37 UTC
With --reuse we can rename a rule once. Should we use the rename trick to move
the RCVD_IN_NJABL_DUL hits over to the PBL rule?  Or are we too worried about
additional FPs?
Comment 2 Justin Mason 2007-01-09 06:06:15 UTC
I think I'd prefer not to, since the Spamhaus guys expect that many ISPs will be
creating their PBL accounts in the next few weeks and modifying the data for
their IP ranges.  Given that, I'd guess that after a few months the data in PBL
will look pretty different from the original data in NJABL-DUL.

I think we'd be better off ignoring NJABL-DUL and doing the PBL mass-checks
using "fresh" results, with no reuse at all.
Comment 3 Steve Halligan 2007-01-09 09:36:28 UTC
Is NJABL going to stop serving the DUL entirely?  If so, bummer.  If not, it 
should be kept as a rule.  NJABL is free for all, PBL is non-free for larger 
commercial entities.
Comment 4 Justin Mason 2007-01-09 09:40:39 UTC
yeah, I think they plan to stop serving it.
Comment 5 Justin Mason 2007-01-13 12:55:11 UTC
freqs:

http://ruleqa.spamassassin.org/20070113-r495852-n/RCVD_IN_NJABL_DUL/detail

0.00000  13.5293   1.2629   0.915    0.66    0.00  RCVD_IN_NJABL_DUL   

http://ruleqa.spamassassin.org/20070113-r495852-n/RCVD_IN_PBL/detail

0.00000  10.9582   1.2212   0.900    0.65    0.00  RCVD_IN_PBL   
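
As a cross-check on the two freqs lines above, here is a minimal sketch in Python. It assumes the columns are MSECS, SPAM%, HAM%, S/O, RANK, SCORE, NAME (the lines as pasted carry no header row), and that S/O is SPAM% divided by the sum of SPAM% and HAM%. The helper name s_o_ratio is made up for illustration and is not part of the ruleqa tooling.

```python
def s_o_ratio(spam_pct: float, ham_pct: float) -> float:
    """Share of a rule's hits that land on spam, from the two percentage columns.

    Assumption: S/O = SPAM% / (SPAM% + HAM%). The formula is not stated in
    this bug; it simply reproduces the figures quoted above.
    """
    return spam_pct / (spam_pct + ham_pct)


# (SPAM%, HAM%) pairs copied from the two freqs lines in this comment.
rows = {
    "RCVD_IN_NJABL_DUL": (13.5293, 1.2629),
    "RCVD_IN_PBL": (10.9582, 1.2212),
}

for name, (spam_pct, ham_pct) in rows.items():
    print(f"{name}: S/O = {s_o_ratio(spam_pct, ham_pct):.3f}")
```

Under that assumption the computed values round to the 0.915 and 0.900 shown in the S/O column, so the two rules look about equally accurate on this run, with NJABL-DUL hitting somewhat more spam.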
Comment 6 Justin Mason 2007-01-17 07:06:27 UTC
: jm 1483...; svn commit -m "bug 5187: move RCVD_IN_PBL to main ruleset now that
it's been released; bug 5285: retire RCVD_IN_NJABL_DUL in favour of RCVD_IN_PBL"
rules/20_dnsbl_tests.cf  rulesrc/sandbox/jm/20_zen.cf
Deleting       rulesrc/sandbox/jm/20_zen.cf
Sending        rules/20_dnsbl_tests.cf
Transmitting file data .
Committed revision 497038.
Comment 7 Justin Mason 2007-01-19 09:05:57 UTC
for the record --


'Subject: NJABL announcement: dynablock & Spamhaus PBL
From: help mail.njabl.org
Date: Fri, 19 Jan 2007 11:37:29 -0500 (EST) (16:37 GMT)
To: list njabl.org

With the advent of Spamhaus's PBL (http://spamhaus.org/pbl/), 
dynablock.njabl.org has become obsolete.  Rather than maintain separate similar 
DNSBL zones, NJABL will be working with Spamhaus on the PBL. Effective 
immediately, dynablock.njabl.org exists as a copy of the Spamhaus PBL.  After 
dynablock users have had ample time to update their configurations, the 
dynablock.njabl.org zone will be emptied.

Other NJABL zones (i.e. dnsbl, combined, bhnc, and the qw versions) will 
continue, business as usual, except that combined will eventually lose its 
dynablock component.

If you currently use dynablock.njabl.org we recommend you switch immediately to 
pbl.spamhaus.org.

If you currently use combined.njabl.org, we recommend you add pbl.spamhaus.org 
to the list of DNSBLs you use.

You may also want to consider using zen.spamhaus.org, which is a combination 
zone consisting of Spamhaus's SBL, XBL, and PBL zones.'
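
For anyone switching zones as the announcement recommends, a rough sketch of how a DNSBL query name is formed: the IPv4 octets are reversed and the zone is appended, so moving from dynablock.njabl.org to pbl.spamhaus.org only changes the suffix. The function name dnsbl_query_name and the sample address (from the 192.0.2.0/24 documentation range) are illustrative assumptions; no DNS lookup is performed here.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name that would be queried to check `ip` against `zone`.

    Hypothetical helper for illustration only; it validates the address as
    IPv4, reverses its octets, and appends the zone. It does not resolve it.
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# Same relay address, old zone versus the recommended replacement.
sample_ip = "192.0.2.99"  # documentation-range address, not a real relay
old_name = dnsbl_query_name(sample_ip, "dynablock.njabl.org")
new_name = dnsbl_query_name(sample_ip, "pbl.spamhaus.org")
print(old_name)
print(new_name)
```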
Comment 8 Justin Mason 2007-02-05 09:11:52 UTC
Reopening: from preliminary results, I think we might be better off reusing the
NJABL-DUL hits for PBL after all.  Let's see once all the results are in.
Comment 9 Justin Mason 2007-02-08 06:28:32 UTC
 20.152  30.9970   0.1499    0.995   0.67    0.00  T_RCVD_IN_PBL_WITH_NJABL_DUL
  1.312   2.0028   0.0372    0.982   0.73    0.00  RCVD_IN_PBL

: jm 40...; grep RCVD_IN_PBL ham.log  | grep -v RCVD_IN_NJABL_DUL | wc -l
     648

: jm 41...; grep RCVD_IN_PBL ham.log  | wc -l
     877

: jm 42...; grep RCVD_IN_PBL spam.log  | grep -v RCVD_IN_NJABL_DUL | wc -l
   14913

: jm 43...; grep RCVD_IN_PBL spam.log  | wc -l
  264044

The hit rate is a lot higher on T_RCVD_IN_PBL_WITH_NJABL_DUL, and most of
the RCVD_IN_PBL spam hits (94.4%, i.e. 249131 of 264044) also hit
RCVD_IN_NJABL_DUL.  Looks good enough to me.
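
The overlap figure can be rechecked from the four grep counts above. This is a small Python sketch of that arithmetic only; the function name overlap_share is made up for illustration, and the inputs are the numbers pasted in this comment.

```python
def overlap_share(total_hits: int, hits_without_other: int) -> float:
    """Percentage of a rule's hits that also hit the other rule.

    total_hits:          lines matching RCVD_IN_PBL
    hits_without_other:  of those, lines NOT matching RCVD_IN_NJABL_DUL
    """
    return 100.0 * (total_hits - hits_without_other) / total_hits


# Counts copied from the grep | wc -l output above.
spam_overlap = overlap_share(total_hits=264044, hits_without_other=14913)
ham_overlap = overlap_share(total_hits=877, hits_without_other=648)

print(f"spam: {spam_overlap:.1f}% of RCVD_IN_PBL hits also hit RCVD_IN_NJABL_DUL")
print(f"ham:  {ham_overlap:.1f}% of RCVD_IN_PBL hits also hit RCVD_IN_NJABL_DUL")
```

The 94.4% quoted above is the spam-side number; on the ham side the same calculation gives about 26.1% (229 of 877).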

: jm 458...; svn commit -m "bug 5285: reuse NJABL_DUL Dynablock hits as input
for RCVD_IN_PBL during the perceptron run" rules/20_dnsbl_tests.cf 
rulesrc/sandbox/jm/20_basic.cf
Sending        rulesrc/sandbox/jm/20_basic.cf
Sending        rules/20_dnsbl_tests.cf
Transmitting file data ..
Committed revision 504908.