Bug 5054

Summary: received-header parsing should not perform DNS resolution
Product: Spamassassin Reporter: Justin Mason <jm>
Component: LibrariesAssignee: SpamAssassin Developer Mailing List <dev>
Status: RESOLVED FIXED    
Severity: normal    
Priority: P5    
Version: SVN Trunk (Latest Devel Version)   
Target Milestone: Undefined   
Hardware: Other   
OS: other   
Whiteboard:

Description Justin Mason 2006-08-17 12:51:22 UTC
currently, we have a few situations in Message/Metadata/Received.pm where DNS
name resolution is used -- I suggest that we should get rid of this.  Reasons:

- (a) it's slow
- (b) it's the wrong place to do that; header-parsing should not incur network
  accesses
- (c) it results in differing results, depending on if local_tests_only is on
  or not.

The places are:

  5410     jmason     if ($in_trusted && !$did_user_specify_trust) {
  5410     jmason       my $inferred_as_trusted = 0;
  5410     jmason
  5410     jmason       # do we know what the IP addresses of the "by" host in
the first
  5410     jmason       # header is?  If not, set them from this header, since
it's the
  5410     jmason       # first one.  NOTE: this is a ref to an array, NOT a string.
 56642   felicity       if (!defined $first_by && $self->{is_dns_available}) {
  5410     jmason       $first_by = [ $self->lookup_all_ips ($relay->{by}) ];
  5410     jmason       }
                        ....
  5705     jmason       # can we use DNS?  If not, we cannot use this algorithm,
as we
  5705     jmason       # cannot lookup hostnames. :(
  5705     jmason       # Consider the first relay trusted, and all others
untrusted.
 56642   felicity       if (!$self->{is_dns_available}) {
 51813    quinlan       dbg("received-header: cannot use DNS, do not trust any
hosts from here on");
  5705     jmason       }
                        ....
  5565     jmason       # if *all* of the IP addrs for the 'by' host are in a
reserved net range,
  5565     jmason       # it's not on the public internet.  Note that we should
still stop if
  5565     jmason       # only *some* of the IPs are reserved; this can happen
for multi-homed
  5565     jmason       # gateway hosts.  For example
  5565     jmason       #
  5565     jmason       #   PRIVATE NET    A          B    INTERNET
  5565     jmason       #     scanner <---> gateway_MX <---> internet
  5565     jmason       #
  5565     jmason       # Interface A would be on a reserved net, but B would
have a "public" IP
  5565     jmason       # address.  Same can happen if the scanner runs on the
gateway-MX, since
  5565     jmason       # lookup_all_ips() will return [ public_IP_addr,
127.0.0.1 ] as the list
  5565     jmason       # of addresses, and 127.0.0.1 is a "reserved" address.
(bug 2113)
  5565     jmason
  5410     jmason       else {
  5410     jmason       my @ips = $self->lookup_all_ips ($relay->{by});

in other words, removal of most of the trust-path inference algorithm, except
for 'if the 'from' IP addr is in a reserved net range, it's not on the public
internet' and 'if we find authentication tokens in the received header we can
extend the trust boundary to that host'.

I think we should do this, since right now, the algorithm gives different
results based on the presence of -L or not, and frequently doesn't infer the
boundaries correctly according to bug reports/ FAQ use.  That's confusing;
let's just simplify things, and if people have complex nets, let them
set trusted_networks (which they do anyway!)


Next:

  4953     jmason   # perform rDNS check if MTA has not done it for us.
  5452     jmason   #
  4953     jmason   # TODO: do this for untrusted headers anyway; if it
mismatches it
  4953     jmason   # could be a spamsign.  Probably better done later after we've
  4953     jmason   # moved the "trusted" ones out of the way.  In fact, this op
  4953     jmason   # here may be movable too; no need to lookup trusted IPs all
the time.
  5452     jmason   #
  4953     jmason   if ($rdns eq '') {
 56642   felicity     if (!$self->{is_dns_available}) {
  5452     jmason       if ($mta_looked_up_dns) {
  5452     jmason       # we know the MTA always does lookups, so this means the
host
  5452     jmason       # really has no rDNS (rather than that the MTA didn't bother
  5452     jmason       # looking it up for us).
  5452     jmason       $relay->{no_reverse_dns} = 1;
  9337         jm       $rdns = '';
  5452     jmason       } else {
  5452     jmason       $relay->{rdns_not_in_headers} = 1;
  5452     jmason       }
  5452     jmason
  4978     jmason     } else {
  6771         jm       $rdns = $self->{dns_pms}->lookup_ptr ($ip);
  5416     jmason
  5416     jmason       if (!$rdns) {
  5416     jmason       $relay->{no_reverse_dns} = 1;
  9337         jm       $rdns = '';
  5416     jmason       }
  4978     jmason     }
  4953     jmason   }

A simpler case; we can move that lookup out to the eval test that uses it, pretty
easily.

Comments?
Comment 1 Justin Mason 2006-08-21 15:14:40 UTC
no comment sounds like general agreement, then ;)
Comment 2 Theo Van Dinter 2006-08-21 15:18:27 UTC
(In reply to comment #1)
> no comment sounds like general agreement, then ;)

Sure, have at it. :)
Comment 3 Justin Mason 2006-08-21 23:00:25 UTC
 Received.pm |  124 ++++--------------------------------------------------------
 1 file changed, 9 insertions(+), 115 deletions(-)

oh yeah ;)   checked in as r433413.

I don't think we should backport to 3.1.x btw; it'd be a change of behaviour
midway through the release cycle...