Bug 1017 - Patch to enable domain-based blacklists, from domain checking
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Libraries
Version: SVN Trunk (Latest Devel Version)
Hardware: All All
Importance: P4 enhancement
Target Milestone: 2.60
Assignee: Daniel Quinlan
URL: http://www.rfc-ignorant.org/how_to_do...
Whiteboard:
Keywords: dns
Depends on:
Blocks:
 
Reported: 2002-09-22 22:33 UTC by Allen Smith
Modified: 2003-04-30 05:34 UTC



Attachments:
  Description                                                 Type   Status  Submitter/CLA Status
  Patches to lib/Mail/SpamAssassin for enhanced RBL checking  patch  None    Allen Smith [NoCLA]
  patch to be applied really soon                             patch  None    Daniel Quinlan [HasCLA]

Description Allen Smith 2002-09-22 22:33:33 UTC
Hi. I will be attaching a patch that:
	A. Adds a new type of RBL, namely domain-based (look up
		example.tld.rbl_domain instead of 2.0.0.127.rbl_domain);
	B. Adds a way to check said type of RBL using the 'From' addresses,
		which is useful for dsn.rfc-ignorant.org (see
		http://www.rfc-ignorant.org), sender-domain.sjesl.monkeys.com,
		and I suspect eventually others - see below for a (tested)
		example rule;
	C. Makes sure that rbls are checked from the one contributing the
		highest score on down, which makes sure that the check for
		something already being listed in a given set doesn't result in
		skipping blacklists that would contribute more toward the score
		than the other blacklist already has.
The example, which consults dsn.rfc-ignorant.org:

header   T_FROM_IN_RFCI_DSN  rbleval:check_from_rbl('from','dsn.rfc-ignorant.org')
describe T_FROM_IN_RFCI_DSN  A from address has a domain in dsn.rfc-ignorant.org
tflags   T_FROM_IN_RFCI_DSN  net
score    T_FROM_IN_RFCI_DSN  2.0
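
To make the difference between the two lookup styles in item A concrete, here is a minimal sketch of how the query names could be built. It is written in Python for illustration only and is not the code in the attached patch; the helper names ip_query_name, from_domain and domain_query_name are hypothetical.

```python
import ipaddress
from email.utils import parseaddr


def ip_query_name(ip: str, zone: str) -> str:
    """Classic IP-based lookup: reverse the IPv4 octets and append the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def from_domain(from_header: str) -> str:
    """Return the lower-cased domain of a From header, or '' if there is none."""
    _name, addr = parseaddr(from_header)
    _local, sep, domain = addr.rpartition("@")
    return domain.lower().rstrip(".") if sep else ""


def domain_query_name(domain: str, zone: str) -> str:
    """Domain-based lookup: prepend the sender domain to the zone."""
    return domain + "." + zone


if __name__ == "__main__":
    print(ip_query_name("127.0.0.2", "rbl_domain"))
    print(domain_query_name(from_domain("Someone <user@Example.TLD>"),
                            "dsn.rfc-ignorant.org"))
```

The first call produces the 2.0.0.127.rbl_domain form and the second the example.tld.rbl_domain form, with dsn.rfc-ignorant.org as the zone.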

	-Allen
Comment 1 Allen Smith 2002-09-22 22:34:33 UTC
Created attachment 350
Patches to lib/Mail/SpamAssassin for enhanced RBL checking
Comment 2 Daniel Quinlan 2002-09-22 23:40:41 UTC
I'll take a look.
Comment 3 Daniel Quinlan 2002-09-22 23:41:26 UTC
I'll take a look at the patch, sounds like a good idea to me.
Comment 4 Allen Smith 2002-10-19 16:13:19 UTC
Slight revision to quiet perl -w (EvalTests.pm):

  my $already_matched_in_other_zones = ' ';
  if (exists($self->{$set}) && defined($self->{$set}) &&
      exists($self->{$set}->{rbl_matches_found}) &&
      defined($self->{$set}->{rbl_matches_found})) {
    $already_matched_in_other_zones .= 
      $self->{$set}->{rbl_matches_found}.' ';
  }

Other than this, it's working great here from testing on spamtraps -
dsn.rfc-ignorant.org is used here as an absolute blocker, so I can't tell how
well it works on normal accounts.
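
For readers following item C of the description, the interaction between the "already matched in this set" check and the score ordering can be sketched roughly as below. This is a Python illustration under assumptions, not the EvalTests.pm code: the RblRule structure, the run_rbl_rules function and the is_listed callback are all hypothetical, and a real check would issue DNS queries instead of calling a stub.

```python
from typing import Callable, Iterable, List, NamedTuple


class RblRule(NamedTuple):
    name: str      # rule name (hypothetical)
    rbl_set: str   # set of related zones the rule belongs to
    zone: str      # DNS zone to query
    score: float   # score contributed when the rule hits


def run_rbl_rules(rules: Iterable[RblRule],
                  is_listed: Callable[[str], bool]) -> List[str]:
    """Check zones from highest score down; skip a set once one zone matched."""
    matched_sets = set()
    hits = []
    for rule in sorted(rules, key=lambda r: r.score, reverse=True):
        if rule.rbl_set in matched_sets:
            continue  # a higher-scoring zone in this set already matched
        if is_listed(rule.zone):
            matched_sets.add(rule.rbl_set)
            hits.append(rule.name)
    return hits


if __name__ == "__main__":
    rules = [
        RblRule("LOW_RULE", "set1", "low.example", 1.0),
        RblRule("HIGH_RULE", "set1", "high.example", 3.0),
    ]
    # Both zones list the sender; only the higher-scoring rule is credited.
    print(run_rbl_rules(rules, lambda zone: True))
```

Without the sort, LOW_RULE would be checked first in this example and HIGH_RULE would be skipped, which is the situation the patch avoids.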

	-Allen
Comment 5 Daniel Quinlan 2003-04-30 04:04:47 UTC
Sorry it's taken me so long to get to this ticket.  Now that the DNS code
has been redesigned, it's probably time to finish this bug.  The only thing
really remaining is the DSN test.

> A. Adds a new type of RBL, namely domain-based (look up
> example.tld.rbl_domain instead of 2.0.0.127.rbl_domain);

This now works out of the box; no changes are required to Dns.pm, just
eval tests to extract the domains and start RBL tests.

> B. Adds a way to check said type of RBL using the 'From' addresses,
> which is useful for dsn.rfc-ignorant.org (see
> http://www.rfc-ignorant.org), sender-domain.sjesl.monkeys.com,
> and I suspect eventually others - see below for a (tested)
> example rule;

The DSN rule looks quite good in my testing.  I think there's a good chance
that we will use it.  Somewhere around 10% of my spam hits the rule and
about 85% of the spam has scores under 5!  FPs are fairly low too.

> C. Makes sure that rbls are checked from the one contributing the
> highest score on down, which makes sure that the check for
> something already being listed in a given set doesn't result in
> skipping blacklists that would contribute more toward the score
> than the other blacklist already has.

This isn't an issue in the redesigned DNS code.  If an old query is duplicated
by a new query, the new set is chained onto the previous query.  That's it.
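
As a rough illustration of what chaining could look like (a Python sketch of the idea only, not the actual Dns.pm implementation; the PendingLookups class and its methods are hypothetical), each distinct query name is sent once and later requests for the same name just attach their set to it:

```python
from collections import defaultdict
from typing import Dict, List


class PendingLookups:
    """Track outstanding DNS queries and the sets waiting on each one."""

    def __init__(self) -> None:
        self._sets_by_query: Dict[str, List[str]] = defaultdict(list)
        self.queries_sent: List[str] = []

    def request(self, query_name: str, rbl_set: str) -> bool:
        """Register interest in a query; return True if a new query was sent."""
        is_new = query_name not in self._sets_by_query
        self._sets_by_query[query_name].append(rbl_set)
        if is_new:
            # Stand-in for actually sending the DNS query.
            self.queries_sent.append(query_name)
        return is_new

    def sets_for(self, query_name: str) -> List[str]:
        """Sets that should be credited when the answer arrives."""
        return list(self._sets_by_query.get(query_name, []))


if __name__ == "__main__":
    pending = PendingLookups()
    name = "example.tld.dsn.rfc-ignorant.org"
    pending.request(name, "set_a")
    pending.request(name, "set_b")   # duplicate: chained, not re-sent
    print(pending.queries_sent, pending.sets_for(name))
```

Because every set is kept on the shared query, no lookup is skipped and the order in which rules are evaluated no longer matters.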

Results for my most recent 3000 spam and 3000 ham:

OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
   6000     3000     3000    0.500   0.00    0.00  (all messages)
100.000  50.0000  50.0000    0.500   0.00    0.00  (all messages as %)
  6.300  12.4000   0.2000    0.984   0.88    0.01  T_FROM_IN_RFCI_DSN
  0.350   0.6667   0.0333    0.952   0.79    0.01  T_FROM_IN_DEADBEEF
  0.667   1.1667   0.1667    0.875   0.61    0.01  T_FROM_IN_PIGS
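
The S/O and OVERALL% columns above are consistent with S/O = SPAM% / (SPAM% + HAM%) and with OVERALL% being the hit rate over both corpora combined. The short Python check below recomputes them from the SPAM% and HAM% columns; the function names are hypothetical and this is not the script that produced the table.

```python
def spam_over_overall(spam_pct: float, ham_pct: float) -> float:
    """Fraction of a rule's hits that are spam, from per-corpus hit rates."""
    total = spam_pct + ham_pct
    return spam_pct / total if total else 0.0


def overall_pct(spam_pct: float, ham_pct: float,
                n_spam: int, n_ham: int) -> float:
    """Hit rate over the combined corpus, in percent."""
    hits = spam_pct / 100 * n_spam + ham_pct / 100 * n_ham
    return 100 * hits / (n_spam + n_ham)


# SPAM% and HAM% copied from the table above (3000 spam, 3000 ham).
ROWS = {
    "T_FROM_IN_RFCI_DSN": (12.4000, 0.2000),
    "T_FROM_IN_DEADBEEF": (0.6667, 0.0333),
    "T_FROM_IN_PIGS": (1.1667, 0.1667),
}

if __name__ == "__main__":
    for name, (spam, ham) in ROWS.items():
        print(f"{overall_pct(spam, ham, 3000, 3000):7.3f}  "
              f"{spam_over_overall(spam, ham):.3f}  {name}")
```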
Comment 6 Daniel Quinlan 2003-04-30 04:06:30 UTC
Created attachment 925
patch to be applied really soon
Comment 7 Daniel Quinlan 2003-04-30 13:34:43 UTC
Oops, missing a tflags in that last patch.  I fixed that, retested
the code with the new hostname_to_domain2() function vs. the
existing hostname_to_domain() function (no difference for some reason),
and checked the code into 2.60-cvs.

Closing bug.