Bug 5820 - URIBL_SBL from zen?
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Rules
Version: 3.2.4
Hardware: All All
Importance: P4 enhancement
Target Milestone: 3.3.0
Assignee: SpamAssassin Developer Mailing List
 
Reported: 2008-02-11 00:20 UTC by Henrik Krohns
Modified: 2008-04-10 07:04 UTC

Attachments:
  Add uridnssub function to 3.2.4 (type: patch; status: None; submitter: Henrik Krohns [HasCLA])
  Make URIBL_SBL use uridnssub and zen (type: patch; status: None; submitter: Henrik Krohns [HasCLA])

Description Henrik Krohns 2008-02-11 00:20:31 UTC
Is there a reason sbl.spamhaus.org is used over zen for URIBL_SBL? Zen does 
include sbl.

While a marginal case, I think there is a slightly better chance of caching if 
zen is used for everything.
Comment 1 Jeff Chan 2008-02-11 00:34:17 UTC
zen probably should not be used in this application, and arguably nor should
XBL.  SBL is specifically a list of spam spoort spaces, (DNS, www, MX, etc.
servers).  XBL is a list of compromised hosts, mostly botnets, mostly senders,
though some fast flux web hosting also.  PBL is a list of spaces that Spamhaus
believes and ISPs specify should not be emitting mail, but could host
*legitimate nameservers, web servers, etc.*  PBL is anything but mailservers. 
(zen is a combination of SBL, XBL and PBL into a single list.)  SBL is the most
correct list to use in this test.  PBL, through zen, could FP on legitimate
nameservers, legitimate web servers, etc.

http://www.spamhaus.org/pbl/index.lasso

"The Policy Block List

The Spamhaus PBL is a DNSBL database of end-user IP address ranges which should
not be delivering unauthenticated SMTP email to any Internet mail server except
those provided for specifically by an ISP for that customer's use. [...]

The PBL lists both dynamic and static IPs, any IP which by policy (whether the
block owner's or -interim in its absence- Spamhaus' policy) should not be
sending email directly to the MX servers of third parties. [...]

Do not use PBL in filters that do any ‘deep parsing’ of Received headers, or for
other than checking IP addresses that hand off to your mailservers."
Comment 2 Jeff Chan 2008-02-11 00:34:52 UTC
spoort --> support
Comment 3 Henrik Krohns 2008-02-11 00:40:18 UTC
Ahem, maybe you didn't get the point.

Zen includes SBL, so you can check for SBL _only_ by matching the result
127.0.0.2, just as RCVD_IN_SBL already does against zen.
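
The point above can be sketched in a few lines of Python. This is only an illustration, not the SpamAssassin implementation: the function names are hypothetical, and the only value taken from the discussion is that zen answers 127.0.0.2 for an SBL listing. The sketch builds the reversed-octet query name for a combined zone and then accepts only the SBL return code, ignoring any other answers the zone may return for the same IP.

```python
import ipaddress
import socket

# Return code that the thread says zen uses for SBL listings.
SBL_CODE = "127.0.0.2"


def dnsbl_query_name(ip, zone="zen.spamhaus.org"):
    """Build the DNSBL query name: reversed IPv4 octets plus the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_sbl_listed(answers):
    """True only if the SBL code is among the A records returned."""
    return SBL_CODE in answers


def lookup(ip, zone="zen.spamhaus.org"):
    """Resolve the query name; an unlisted IP yields no A records."""
    try:
        _, _, addrs = socket.gethostbyname_ex(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return []
    return addrs
```

With this shape, a hit that carries only non-SBL codes from the combined zone does not fire the SBL test, which is the behaviour being proposed for URIBL_SBL.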

Comment 4 Jeff Chan 2008-02-11 01:11:45 UTC
SBL is the only list that should be used to check URIs.  Using zen may be an
advantage where it's already used locally by the MTA and therefore cached in
DNS* or hosted locally in rbldnsd, but generally zen is not appropriate.

* The mail sender IPs checked by the MTA or RCVD_IN_SBL may not overlap much
with the URI nameserver IPs checked by uridnsbl, so any caching advantage is
probably minimal.  The main advantage would be where zen is locally mirrored
using rbldnsd, but SBL isn't.
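
A toy model of the caching argument, assuming a resolver cache keyed by the full query name (a simplification) and using documentation IP addresses as placeholders. The helper names are hypothetical. It counts how many URI lookups could be answered from names the MTA has already queried: with different zones there is never a shared name, and with the same zone the benefit is limited to IPs that appear in both sets.

```python
def query_name(ip, zone):
    """Reversed IPv4 octets followed by the DNSBL zone."""
    return ".".join(reversed(ip.split("."))) + "." + zone


def cache_hits(mta_ips, uri_ips, mta_zone, uri_zone):
    """Count URI lookups whose query name the MTA has already cached."""
    cached = {query_name(ip, mta_zone) for ip in mta_ips}
    return sum(1 for ip in uri_ips if query_name(ip, uri_zone) in cached)


# Placeholder data: one IP is both a sender and a URI nameserver.
mta_ips = ["192.0.2.1", "192.0.2.2"]
uri_ips = ["192.0.2.2", "198.51.100.7"]

different_zones = cache_hits(mta_ips, uri_ips, "zen.spamhaus.org", "sbl.spamhaus.org")
same_zone = cache_hits(mta_ips, uri_ips, "zen.spamhaus.org", "zen.spamhaus.org")
```

Here `different_zones` is 0 and `same_zone` is 1, which matches the view that the gain from switching to zen is real but depends entirely on how much the two IP sets overlap.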
Comment 5 Henrik Krohns 2008-02-11 01:16:12 UTC
At least the other way around there might be a benefit; using sbl there is none.
But I'll leave it up to you; I will change it locally anyway. :)

Comment 6 Justin Mason 2008-02-11 01:49:05 UTC
I'll ask the Spamhaus guys.
Comment 7 Jeff Chan 2008-02-11 02:01:58 UTC
The Spamhaus guys will tell you that PBL is inappropriate for URI checking for
the reason I gave earlier.  The only advantage would be caching of zen.
Comment 8 Justin Mason 2008-02-11 07:42:44 UTC
(In reply to comment #7)
> The Spamhaus guys will tell you that PBL is inappropriate for URI checking for
> the reason I gave earlier.  

yes, we (still) know that ;)

> The only advantage would be caching of zen.

Larry says --

>> >  >in other words would it be more efficient to query zen instead of SBL, if
>> >  >we only want SBL data?
>> >
>> >  Well, that's a hard call.  The "more efficient" could be variable,
>> >  and probably small.  As you well know, URIBL_SBL is not the same as
>> >  RCVD_IN_SBL.  The connecting IP will almost never be the URI's
>> >  IP.  Early Storm may have done this, but now it sends from one bot
>> >  with an IP or URI of another.  So, the DNS caching does not really
>> >  buy anything.
>> >
>> >  But in general, I'd guess it's a bit more efficient (for both the
>> >  user and for us) as we figure (and hope) most places will query
>> >  zen.spamhaus.org for up-front blocking, or if not that, at least in
>> >  SA to then do the spam-tests.  This then caches all the answers from
>> >  all 3 zones (or more) which any later queries have local access to.
>> >
>> >  There would be some "funny math" as to the efficiency if we ever
>> >  change the TTL's on individual zones, but I don't think we're
>> >  planning on doing that.
>> >
>> >  Hope that answers?
>>
>>so you're saying it might be marginally more effective for us to query
>>Zen for those URIBL_SBL lookups?
>
>Yep.  It won't be less effective, parity is seldom reached in
>designs, so it would be more effective.  But not effective on the
>huge % of spam that is botnet spam, but can be effective on the %
>that is from static blocks where spam is sent and pages are
>hosted.  So, whatever % that is, will be the increase.

(it's just occurred to me -- it'll improve matters for the *non* spam case
too.)
Comment 9 Henrik Krohns 2008-04-10 00:03:05 UTC
Created attachment 4294
Add uridnssub function to 3.2.4
Comment 10 Henrik Krohns 2008-04-10 00:03:49 UTC
Created attachment 4295
Make URIBL_SBL use uridnssub and zen
Comment 11 Justin Mason 2008-04-10 07:04:43 UTC
checked in to 3.3.0, thanks Henrik:

: jm 607...; svn commit -m "bug 5820: add 'uridnssub' keyword for URIDNSBL plugin; fix URIBL_SBL to use this keyword, and thereby produce a marginal gain in efficiency for lookups on zen.spamhaus.org" rules lib/Mail/SpamAssassin/Plugin
Sending        lib/Mail/SpamAssassin/Plugin/URIDNSBL.pm
Sending        rules/25_uribl.cf
Transmitting file data ..
Committed revision 646805.

also updated CREDITS to note your contribution ;)  we should probably get a CLA from you if there's much more on the way...