Bug 4013 - Please add IP address host checking to urirhsbl and urirhssub
Summary: Please add IP address host checking to urirhsbl and urirhssub
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Plugins (show other bugs)
Version: SVN Trunk (Latest Devel Version)
Hardware: All All
: P5 normal
Target Milestone: 3.1.0
Assignee: Daniel Quinlan
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-12-03 00:01 UTC by Jeff Chan
Modified: 2005-04-05 19:22 UTC (History)
1 user (show)



Attachment Type Modified Status Actions Submitter/CLA Status

Note You need to log in before you can comment on or make changes to this bug.
Description Jeff Chan 2004-12-03 00:01:39 UTC
It appears that urirhssub and probably urirhsbl don't currently check URIs with
IP address host portions like http://1.2.3.4/ .  Justin thought that may have
been an unintentional oevrsight.  If so, then please add support for IPs back
into the URI checking.  SURBLs are mostly domains but do have a few IP
addresses, particularly from phishing.
Comment 1 Theo Van Dinter 2004-12-03 07:17:46 UTC
Cut/pasting my response to the mail thread about this since it explains what is currently happening:

The code does so explicitly, but since I didn't write the plugin so I
can't say why it does what it does:
  
  if ($dom =~ /^\d+\.\d+\.\d+\.\d+$/) {
    $self->lookup_dnsbl_for_ip ($scanstate, $obj, $dom);
  }
  else {
    # look up the domain in the RHSBL subset
[...]

uridnsbl flags rules as active_rules_revipbl, which lookup_dnsbl_for_ip
uses to know which rules to send IPs for.  revipbl rules also get domains
queried.

urirhs* are active_rules_rhsbl, which just get domains.

    if ($rulecf->{is_rhsbl}) {
      $scanstate->{active_rules_rhsbl}->{$rulename} = 1;
    } else {
      $scanstate->{active_rules_revipbl}->{$rulename} = 1;
    }

is the code in question.  I don't know why there's a difference.  Perhaps
instead of rule types specifying the difference it should be a tflag?  Assume
they all do both IP and domain unless "tflags nouriip" is set or something?
Comment 2 Justin Mason 2004-12-03 10:08:51 UTC
Subject: Re:  Please add IP address host checking to urirhsbl and urirhssub 

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


ah, I remember this now.  it's entirely to do with looking stuff up in
SBL-type "lookup domain NS in blocklist" zones.  In the domain-name case
we do:

    - resolve domain name's NS
    - resolve those into A records
    - lookup addresses in SBL

but if it's an IP address we can skip straight to the very last step,
hence the optimisation.  However, that assumption fails when you're
looking it up in SURBL or any similar "lookup domain name in rhsbl" zone;
I guess I never checked that!  (I should have added a test.)

The fix is to have that "optimization" conditional on !is_rhsbl.

BTW, that flowchart above is one of the reasons where there's two separate
rule types, and why there's a difference between them. "lookup domain NS
in blocklist" is entirely different behaviour from "lookup domain name in
rhsbl", due to the extra steps and different lookups required.

I'm -1 on the idea of changing them to a single rule type, differentiated
using a tflag -- throwing in random tflags is messy as we just wind up
cluttering the tflags space with a tflag that only makes sense for URIBL
rules. It makes more sense to keep "tflags" as a space for flags that work
for all, or at least most, rule types.

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFBsKujMJF5cimLx9ARAqKTAJ0ej7vWN/wg5iRCqK6hh5XlSm/VhACffdYn
glauborxGPxufSAD/PMrbEI=
=qz7E
-----END PGP SIGNATURE-----

Comment 3 Jeff Chan 2004-12-03 19:00:52 UTC
For that matter, please make a test that does not do NS and A lookups for SURBL
checks (if it currentlyy is).  NS resolution is broken in some proxies, and
SURBL lookups would be much more efficient without any NS or A checks.

Here are Dallas Engelken's comments about the broken resolver:

"Well, the platform is not really the issue here I don't believe... The
surrounding platform is.  Their SA is running on redhat 7.3, perl 5.6.1.
They have this Symantec Raptor software firewall running on an NT box,
and for some reason, it doesn't resolve NS resource record queries.
Been on the phone with them (symantec) about it, they say they have
never supported it in the DNS proxy, but in their new version, support
for it is experimental.  I was like WTF, its just like any other DNS
lookup.

So anyways, if you run a  'host -tNS domain.com', you get no answers.
Not even a NXDOMAIN.   This causes SA to think DNS is not available and
not run any network tests unless you hard code it.   Even then SURBL
does not work because it does a NS lookup before pulling the A records.
Dunno why, that's why I'm asking for feedback here.

I guess it is possible that even when NS records are properly working on
a good setup, that NS's are still being pulled for the URI domains that
are being compared against SURBL.  Whether this is creating false
positives or not is another thing.  I'm still trying to figure out
how/if the NS lookups are being used in SA as it pertains to SURBL.  I
realize the NS's are resolved and compared against SBL, and maybe this
is why NS's are pulled for all URIs, but its just a DNS lookup that is
not needed IMHO... And with lots of URIs to lookup, this could cause it
to take twice as long on the surbl lookups."

Perhaps this should be a separate ticket, but the issues may be related.
Adding Dallas to ticket notification.
Comment 4 Jeff Chan 2004-12-14 16:10:11 UTC
FWIW I seem to be getting slightly more spam using IP addresses in URIs.  We may
want to consider bumping up the need for this update.

Also the issue of unnecessary NS and A lookups on urirhssub/bl is probably still
quite relevant.  If they are being done unnecessarily, SA will be speeded up by
not doing them.
Comment 5 Justin Mason 2005-02-17 15:12:30 UTC
we do need to do this. seeing a lot of this in the wild...
Comment 6 Daniel Quinlan 2005-04-06 02:49:28 UTC
decided to get this one too

It's a bit confusing (and took me a while to figure out) that the IP addresses in
SURBL are reversed quads, but I suppose that makes sense if you want to do any
sort of wildcarding.
Comment 7 Daniel Quinlan 2005-04-06 02:53:36 UTC
FIXED in SVN HEAD
Comment 8 Jeff Chan 2005-04-06 03:22:10 UTC
Thanks for the fix DQ.  This will make a big difference for detecting phishes
which often use numeric URIs.

Reversing the dotted quads is traditional for RBLs AFIAK:

# dig 2.0.0.127.sbl.spamhaus.org
[...]
2.0.0.127.sbl.spamhaus.org.  2H IN A  127.0.0.2
Comment 9 Daniel Quinlan 2005-04-06 11:51:08 UTC
Subject: Re:  Please add IP address host checking to urirhsbl and urirhssub

bugzilla-daemon@bugzilla.spamassassin.org writes:

> Reversing the dotted quads is traditional for RBLs AFIAK:

It is, but I'm not 100.0% sure all *RHS* blacklists reverse the quads
for IP listings.  It seems like the right thing to do, though, so I
documented it in the URIDNSBL.pm module as how we do it.

Daniel