Bug 6519 - RBL lookups for IPv6 addresses
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Plugins
Version: SVN Trunk (Latest Devel Version)
Hardware: All All
Importance: P2 enhancement
Target Milestone: 3.4.0
Assignee: SpamAssassin Developer Mailing List
 
Reported: 2010-11-30 20:29 UTC by Mark Martinec
Modified: 2011-05-10 16:42 UTC



Attachment Type Modified Status Actions Submitter/CLA Status
proposed patch patch None Mark Martinec [HasCLA]

Description Mark Martinec 2010-11-30 20:29:24 UTC
We have a bit of a "chicken or the egg" problem with DNS black/white-lists
and IPv6 addresses: SpamAssassin does not query them because most RBLs
do not yet list IPv6 addresses, and RBL operators do not start listing
them because nobody queries for them (and the volume of spam originating
from IPv6 hosts is still low).

Then there is an apparent problem of choosing a format for queries.
Luckily that is no longer the case since February 2010 when RFC 5782
was published: "DNS Blacklists and Whitelists".

Section 2.4 of RFC 5782 states:

2.4.  IPv6 DNSxLs

   The structure of DNSxLs based on IPv6 addresses is adapted from that
   of the IP6.ARPA domain defined in [RFC3596].  Each entry's name MUST
   be a 32-component hex nibble-reversed IPv6 address suffixed by the
   DNSxL domain.  The entries contain A and TXT records, interpreted the
   same way as they are in IPv4 DNSxLs.

   For example, to represent the address:

     2001:db8:1:2:3:4:567:89ab

   in the DNSxL ugly.example.com, the entry might be:

     b.a.9.8.7.6.5.0.4.0.0.0.3.0.0.0.2.0.0.0.1.0.0.0.8.b.d.0.1.0.0.2.
                  ugly.example.com. A 127.0.0.2
                                    TXT "Spam received."

   Combined IPv6 sublist DNSxLs are represented the same way as IPv4
   DNSxLs, replacing the four octets of IPv4 address with the 32 nibbles
   of IPv6 address.

   A single DNSxL could in principle contain both IPv4 and IPv6
   addresses, since the different lengths prevent any ambiguity.  If a
   DNSxL is represented using traditional zone files and wildcards,
   there is no way to specify the length of the name that a wildcard
   matches, so wildcard names would indeed be ambiguous for DNSxLs
   served in that fashion.


So, we have a format, and we know there is no conflict between the two
forms, so there is no reason not to start querying DNS RBLs for IPv6
addresses of incoming mail.
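
For illustration only, the RFC 5782 name construction quoted above can be sketched in a few lines of Python. This is not the attached Perl patch; the function name dnsxl_query_name is made up for this example, and only the standard ipaddress module is assumed.

```python
import ipaddress

def dnsxl_query_name(address: str, zone: str) -> str:
    """Build a DNSxL query name for an IPv4 or IPv6 address (RFC 5782 layout)."""
    ip = ipaddress.ip_address(address)
    if ip.version == 4:
        # IPv4: the four decimal octets in reverse order.
        labels = reversed(str(ip).split("."))
    else:
        # IPv6: expand to all 32 hex nibbles (lowercase), then reverse them.
        nibbles = ip.exploded.replace(":", "")
        labels = reversed(nibbles)
    return ".".join(labels) + "." + zone

# The address and zone from the RFC 5782 example.
print(dnsxl_query_name("2001:db8:1:2:3:4:567:89ab", "ugly.example.com"))
# An IPv4 address yields 4 labels instead of 32, so the two forms cannot collide.
print(dnsxl_query_name("192.0.2.99", "ugly.example.com"))
```

The first call reproduces the 32-label name shown in the RFC example; the second shows the much shorter IPv4 form for comparison.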

The attached patch is quite straightforward, and I see no reason not to
start using it right away. If need be, we can adapt to whatever
common best practices emerge or whatever RBL idiosyncrasies pop up.
Comment 1 Mark Martinec 2010-11-30 20:31:29 UTC
Created attachment 4829
proposed patch

trunk:
  Bug 6519: RBL lookups for IPv6 addresses
Sending lib/Mail/SpamAssassin/Plugin/DNSEval.pm
Committed revision 1040847.
Comment 2 Mark Martinec 2010-11-30 21:09:37 UTC
$ svn ci -m 'Bug 6519, missing detail'
Sending lib/Mail/SpamAssassin/Plugin/DNSEval.pm
Committed revision 1040858.
Comment 3 D. Stussy 2010-11-30 22:58:54 UTC
In the patch:  

  $revip = lc $ip_obj->network->full6;


The lc operation is not necessary, but it won't cause incorrect results leaving it in.  It should be removed for speed.  DNS labels are case insensitive, so we need not force lower case.
Comment 4 Mark Martinec 2010-12-01 06:04:53 UTC
> In the patch:  
>   $revip = lc $ip_obj->network->full6;
> The lc operation is not necessary, but it won't cause incorrect results
> leaving it in.  It should be removed for speed.  DNS labels are case
> insensitive, so we need not force lower case.

Well, it is true that DNS labels are case insensitive, and RFC 4291 does
not prefer one case over the other either. Still, the reversed address
shows up in the logs and in the DNS query.
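
A small Python sketch of the distinction being made here (an illustration, not the SpamAssassin code; the helper names reversed_nibbles and same_dns_name are invented for this example): the two spellings are the same name as far as DNS is concerned, but they are different strings when they land in a log file.

```python
import ipaddress

def reversed_nibbles(address: str) -> str:
    """Return the 32 hex nibbles of an IPv6 address, reversed and dot-separated."""
    ip = ipaddress.IPv6Address(address)
    return ".".join(reversed(ip.exploded.replace(":", "")))

def same_dns_name(a: str, b: str) -> bool:
    """DNS compares ASCII labels without regard to case."""
    return a.lower() == b.lower()

lower = reversed_nibbles("2001:db8::89ab")
upper = lower.upper()  # what a library emitting uppercase hex would produce

print(same_dns_name(lower, upper))  # True: one and the same DNS name
print(lower == upper)               # False: two different strings in a log
```

Forcing one case therefore changes nothing for the resolver, but it keeps log lines and any string comparisons on them consistent.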

I chose the lowercase form because RFC 5782 uses lowercase in its
example, and because RFC 5952 (A Recommendation for IPv6 Address Text
Representation) recommends the lowercase form (SHOULD):

4.  A Recommendation for IPv6 Text Representation

   A recommendation for a canonical text representation format of IPv6
   addresses is presented in this section.  The recommendation in this
   document is one that complies fully with [RFC4291], is implemented by
   various operating systems, and is human friendly.  The recommendation
   in this section SHOULD be followed by systems when generating an
   address to be represented as text, but all implementations MUST
   accept and be able to handle any legitimate [RFC4291] format.  It is
   advised that humans also follow these recommendations when spelling
   an address.
[...]

4.3.  Lowercase

   The characters "a", "b", "c", "d", "e", and "f" in an IPv6 address
   MUST be represented in lowercase.

[...]
3.4.3.  Legibility

   Capital case D and 0 can be quite often misread.  Capital B and 8 can
   also be misread.



> It should be removed for speed.

Oh, c'mon! It's a single perl opcode - in contrast with thousands
of opcodes in the called subroutines on the same line.
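
To get a rough feel for the proportions, here is a Python analogue (not a measurement of the Perl code; the absolute numbers depend on the machine and interpreter, and reversed_nibbles is a stand-in invented for this example). It times the case conversion alone against the whole construction of the reversed name.

```python
import ipaddress
import timeit

def reversed_nibbles(address: str) -> str:
    """Stand-in for the address parsing and nibble reversal done per lookup."""
    ip = ipaddress.IPv6Address(address)
    return ".".join(reversed(ip.exploded.replace(":", "")))

addr = "2001:db8:1:2:3:4:567:89ab"
name = reversed_nibbles(addr)

# Time 10,000 repetitions of each step; results vary from run to run.
build_time = timeit.timeit(lambda: reversed_nibbles(addr), number=10_000)
lower_time = timeit.timeit(lambda: name.lower(), number=10_000)

print(f"building the name: {build_time:.4f} s")
print(f"lowercasing it:    {lower_time:.4f} s")
```

No expected figures are given here on purpose; the point is only to compare the two numbers on one's own system.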
Comment 5 Kevin A. McGrail 2010-12-01 09:11:18 UTC
> > It should be removed for speed.
> 
> Oh, c'mon! It's a single perl opcode - in contrast with thousands
> of opcodes in the called subroutines on the same line.

Having spent the last day or more dealing with a non-RFC-compliant mailing issue, I am +1.000000000000 to leave the lc in there for safety's sake.
Comment 6 D. Stussy 2010-12-02 01:52:23 UTC
I don't consider RFCs declaring the case of hexadecimal alpha-digits as controlling or authoritative at all.  In the history of Computer Science, especially the pre-1980 history, hexadecimal digits included only UPPER CASE letters, with their lower case equivalents being invalid.  Examples of such include the assembly languages for the IBM 360 class of machines (and its successors; 1960's) and even for microprocessors such as the Z80 or 6502 (popular in the late 1970's and early 80's).

The RFC is contrary to the historical representation and should not be followed.
Comment 7 Mark Martinec 2010-12-13 10:28:19 UTC
Bug 6519, use NetAddr::IP::full6 only when available,
without bumping up the minimal required version of NetAddr::IP
  Sending lib/Mail/SpamAssassin/AutoWhitelist.pm
  Sending lib/Mail/SpamAssassin/Plugin/DNSEval.pm
Committed revision 1045170.
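
The "use it only when available" approach is ordinary feature detection. As a hedged Python analogue (it says nothing about how NetAddr::IP or the committed Perl code does it; both function names below are invented), one can prefer a newer library attribute and fall back to a hand-rolled equivalent when it is missing:

```python
import ipaddress

def nibbles_by_hand(ip: ipaddress.IPv6Address) -> str:
    """Fallback: derive the reversed nibbles from the expanded text form."""
    return ".".join(reversed(ip.exploded.replace(":", "")))

def reversed_nibbles(ip: ipaddress.IPv6Address) -> str:
    """Use reverse_pointer when the library provides it, else fall back."""
    pointer = getattr(ip, "reverse_pointer", None)
    if pointer is not None:
        # reverse_pointer ends in ".ip6.arpa"; strip that suffix.
        return pointer[: -len(".ip6.arpa")]
    return nibbles_by_hand(ip)

ip = ipaddress.IPv6Address("2001:db8:1:2:3:4:567:89ab")
print(reversed_nibbles(ip) == nibbles_by_hand(ip))  # both paths agree
```

Either path produces the same string, so callers do not need to know which library version is installed.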
Comment 8 Darxus 2011-04-10 18:42:32 UTC
Shouldn't this be closed as fixed?  When is a release with this capability expected?
(I just noticed it wasn't working with my reputation thing.)
Comment 9 D. Stussy 2011-04-10 19:12:44 UTC
Probably, since something has already been committed.  However, I still oppose the LC transformation (and any RFC that demands it) on historical grounds.  Since DNS is case insensitive, it just doesn't matter in the end result, but it does matter as to optimization and timing.  Doing many of these per second does eventually add up to a significant waste of time, especially when the DNS server or the resolver may do exactly the same thing (or worse, UC everything).  Therefore, as an UNNECESSARY operation, the lower-case transform should be deleted (at Plugin/DNSEval.pm, line 285).

I certainly hope that this isn't one of the issues holding up the release of SA 3.3.2, which needs to be issued to get rid of the perl 5.12 incompatibilities.

I don't have authority to change the status to closed.  Someone else has to do that.
Comment 10 Henrik Krohns 2011-04-11 01:38:25 UTC
+1 for using LC thus having Consistency. Previous point of wasting performance is just ridiculous.
Comment 11 Kevin A. McGrail 2011-04-11 12:50:42 UTC
(In reply to comment #10)
> +1 for using LC thus having Consistency. Previous point of wasting performance
> is just ridiculous.

I have to agree. +1 to keep the lc just for safety's sake.  The performance increase would be infinitesimal, and I've seen far too many non-RFC bugs introduced; safety valves like this can prevent real-world issues.

Closing this as resolved.
Comment 12 Darxus 2011-05-04 17:11:16 UTC
Can this go in 3.3.2?  Works for me in trunk.
Comment 13 Mark Martinec 2011-05-04 17:51:43 UTC
> Can this go in 3.3.2?  Works for me in trunk.

I think it's too much of a change: three commits here, plus
a fix for Bug 6573, and maybe something else IPv6-related
that has already been resolved in trunk, but I have lost
track of all dependencies. Besides, users and the DNS WBL
providers may be surprised by a surge of new queries
after a minor change of SA.
Comment 14 Warren Togami 2011-05-04 20:30:09 UTC
(In reply to comment #13)
> > Can this go in 3.3.2?  Works for me in trunk.
> 
> I think it's too much of a change: three commits here, plus
> a fix for Bug 6573, and maybe something else IPv6-related
> that has already been resolved in trunk, but I have lost
> track of all dependencies. Besides, users and the DNS WBL
> providers may be surprised by a surge of new queries
> after a minor change of SA.


Furthermore, the very concept of IPv6 DNSBL's is broken, so having this capability in 3.3.2 is not very useful.
Comment 15 Darxus 2011-05-10 16:42:27 UTC
(In reply to comment #14)
> Furthermore, the very concept of IPv6 DNSBL's is broken, so having this
> capability in 3.3.2 is not very useful.

I've failed to leave this alone.  

If you aggregate your DNSBL to something like /48s, the only remaining problem is caching DNS servers not knowing they can cache at the /48.  It's not perfect, but I think it's usable.
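
A rough Python sketch of that point, under the assumption that a list operator keys its data by /48 (the function names are invented; nothing here describes an existing DNSBL or the SpamAssassin code): two hosts in the same /48 share one aggregated key, yet the RFC 5782 query names for them still differ, so a caching resolver sees two unrelated names.

```python
import ipaddress

def aggregate_key(address: str, prefix_len: int = 48) -> str:
    """Hypothetical list-side key: the enclosing network of the given length."""
    net = ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)
    return str(net)

def reversed_nibbles(address: str) -> str:
    """Full 32-nibble reversed form used in the actual DNS query name."""
    ip = ipaddress.IPv6Address(address)
    return ".".join(reversed(ip.exploded.replace(":", "")))

a = "2001:db8:1:2:3:4:567:89ab"
b = "2001:db8:1:ffff::1"

print(aggregate_key(a))                           # 2001:db8:1::/48
print(aggregate_key(a) == aggregate_key(b))       # True: same /48 on the list side
print(reversed_nibbles(a) == reversed_nibbles(b)) # False: distinct names to a cache
```

The list can answer both queries from one /48 record, but each distinct host address is still a separate cache entry on the resolver.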