Bug 6087 - DKIM plugin support for domain signing practices (ADSP), with overrides
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Plugins
Version: SVN Trunk (Latest Devel Version)
Hardware: All All
Importance: P5 enhancement
Target Milestone: 3.3.0
Assignee: SpamAssassin Developer Mailing List
Reported: 2009-03-19 11:43 UTC by Mark Martinec
Modified: 2009-08-04 18:03 UTC (History)



Description Mark Martinec 2009-03-19 11:43:24 UTC
It all started as a small enhancement to a DKIM plugin to be able to take
advantage of already checked signatures elsewhere (subject to another problem
report), but I got carried away with Spring cleaning of code (like renaming
variable $scan to $pms, along with renaming some other internal objects to
better reflect the current shape of terminology and style).

...But I ended up reshaping the terminology from the DomainKeys term POLICY
to what used to be SSP (Sender Signing Practices) and is now known as
ADSP, Author Domain Signing Practices (draft-ietf-dkim-ssp-09). One thing
led to another, and the major new feature of these changes turned out to be
a way to manually override the ADSP (which is typically still unpublished
nowadays). This allows for more comfortably penalizing forged mail claiming
to be from domains like ebay.com, paypal.com, but is also (to a lesser degree)
useful for domains like yahoo.com, gmail.com, etc. A new eval rule for
fetching ADSP also replaces the former ones.

The DKIM plugin is fully compatible with existing 3.3 code and rules, and I
also made it compatible with 3.2.5, in case someone wants to use it there.

When examining the code, please do not bother to check the diff, as it
is large, partly due to indentation changes, shifting of code sections,
variable renames and comment updates. Just examine the plugin itself.

I'll let my POD docs from the plugin take it from here:


full   DKIM_SIGNED           eval:check_dkim_signed()
full   DKIM_VALID            eval:check_dkim_valid()
full   DKIM_VALID_AU         eval:check_dkim_valid_author_sig()

header DKIM_ADSP_NXDOMAIN    eval:check_dkim_adsp('N')
header DKIM_ADSP_ALL         eval:check_dkim_adsp('A')
header DKIM_ADSP_DISCARD     eval:check_dkim_adsp('D')
header DKIM_ADSP_CUSTOM_LOW  eval:check_dkim_adsp('1')
header DKIM_ADSP_CUSTOM_MED  eval:check_dkim_adsp('2')
header DKIM_ADSP_CUSTOM_HIGH eval:check_dkim_adsp('3')

describe DKIM_SIGNED    Message has a DKIM or DK signature, not necessarily valid
describe DKIM_VALID     Message has at least one valid DKIM or DK signature
describe DKIM_VALID_AU  Message has a valid DKIM or DK signature from author's domain

describe DKIM_ADSP_NXDOMAIN    No valid author signature and domain not in DNS
describe DKIM_ADSP_ALL         No valid author signature, domain signs all mail
describe DKIM_ADSP_DISCARD     No valid author signature, domain signs all mail
                                 and suggests unsigned mail be discarded
describe DKIM_ADSP_CUSTOM_LOW  No valid author signature, adsp_override is CUSTOM_LOW
describe DKIM_ADSP_CUSTOM_MED  No valid author signature, adsp_override is CUSTOM_MED
describe DKIM_ADSP_CUSTOM_HIGH No valid author signature, adsp_override is CUSTOM_HIGH

For compatibility, the following are synonyms:
 OLD: eval:check_dkim_verified = NEW: eval:check_dkim_valid
 OLD: eval:check_dkim_signall  = NEW: eval:check_dkim_adsp('A')
 OLD: eval:check_dkim_signsome = NEW: redundant, semantically always true

[...]

=item adsp_override domain [signing_practices]

Currently few domains publish their signing practices (draft-ietf-dkim-ssp,
ADSP), partly because the ADSP draft/rfc is rather new, partly because they
think hardly any recipient bothers to check it, and partly for fear that
some recipients might lose mail due to problems in their signature validation
procedures or mail mangling by mailers beyond their control.

Nevertheless, recipients could benefit by knowing signing practices of a
sending (author's) domain, for example to recognize forged mail claiming
to be from certain domains which are popular targets for phishing, like
financial institutions. Unfortunately, as signing practices are seldom
published or are weak, it is hardly justifiable to look them up in DNS.

To overcome this chicken-and-egg problem, the C<adsp_override> mechanism
allows recipients using SpamAssassin to override published or defaulted
ADSP for certain domains. This makes it possible to manually specify
stronger (or weaker) signing practices than a signing domain is willing
to publish (explicitly or by default), and also saves a DNS lookup.

Note that ADSP (published or overridden) is only consulted for messages
which do not contain a valid DKIM signature from the author's domain.

According to the ADSP draft, signing practices can be one of the following:
C<unknown>, C<all> and C<discardable>.

C<unknown>: Messages from this domain might or might not have an author
signature. This is a default if a domain exists in DNS but no ADSP record
is found.

C<all>: All messages from this domain are signed with an Author Signature.

C<discardable>: All messages from this domain are signed with an Author
Signature. If a message arrives without a valid Author Signature, the domain
encourages the recipient(s) to discard it.

ADSP lookup can also determine that a domain is "out of scope", i.e., the
domain does not exist (NXDOMAIN) in the DNS.

To override a domain's signing practices in a SpamAssassin configuration file,
specify an C<adsp_override> directive for each sending domain to be overridden.

Its first argument is a domain name. The author's domain is matched against it;
matching is case-insensitive. This is not a regular expression or a file-glob
style wildcard, but limited wildcarding is still available: if this argument
starts with "*." (or is a sole "*"), the author's domain matches if it is a
subdomain (to one or more levels) of the argument. Otherwise (with no leading
asterisk) the match must be exact (not a subdomain).
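
The matching rule just described can be illustrated with a short Python
sketch. This is not the plugin's code (the plugin is written in Perl); the
function name adsp_domain_matches is hypothetical, and treating a sole "*"
as matching every domain is an assumption based on the "adsp_override *"
line in the example further down.

```python
def adsp_domain_matches(pattern: str, author_domain: str) -> bool:
    """Return True if author_domain matches an adsp_override domain argument.

    Hypothetical helper illustrating the documented rule; not plugin code.
    """
    pattern = pattern.lower()          # matching is case-insensitive
    domain = author_domain.lower()
    if pattern == "*":
        # Assumption: a sole "*" matches any author domain.
        return True
    if pattern.startswith("*."):
        base = pattern[2:]
        # Subdomain to one or more levels, but not the base domain itself.
        return domain.endswith("." + base)
    # No leading asterisk: the match must be exact.
    return domain == pattern
```

Under this reading, "*.ebay.com" matches reply.ebay.com but not ebay.com
itself, which would explain why the example lists both forms.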

An optional second parameter is one of the following keywords
(case-insensitive): C<nxdomain>, C<unknown>, C<all>, C<discardable>,
C<custom_low>, C<custom_med>, C<custom_high>.

Absence of this second parameter implies C<discardable>. If a domain is not
listed by an C<adsp_override> directive nor does it explicitly publish any
ADSP record, then C<unknown> is implied for valid domains, and C<nxdomain>
for domains not existing in DNS. (Note: domain validity may not be checked
with current versions of Mail::DKIM, so C<nxdomain> may never turn up.)
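
As a rough illustration of these defaults, the following Python sketch
parses one directive and then picks the effective signing practices. The
function names and the idea of passing the published ADSP and DNS existence
as plain arguments are assumptions made for the example; the real plugin is
Perl and obtains these values differently.

```python
from typing import Optional, Tuple

VALID_PRACTICES = {
    "nxdomain", "unknown", "all", "discardable",
    "custom_low", "custom_med", "custom_high",
}

def parse_adsp_override(line: str) -> Tuple[str, str]:
    """Parse 'adsp_override domain [signing_practices]' (hypothetical helper)."""
    fields = line.split()
    if len(fields) not in (2, 3) or fields[0] != "adsp_override":
        raise ValueError("not an adsp_override directive: %r" % line)
    domain = fields[1].lower()
    # An absent second parameter implies 'discardable'; keywords are
    # case-insensitive.
    practice = fields[2].lower() if len(fields) == 3 else "discardable"
    if practice not in VALID_PRACTICES:
        raise ValueError("unknown signing practices: %r" % practice)
    return domain, practice

def effective_practice(override: Optional[str],
                       published: Optional[str],
                       domain_exists: bool) -> str:
    """Pick the practices in effect: override, else published, else default."""
    if override is not None:
        return override            # no DNS lookup needed in this case
    if published is not None:
        return published
    return "unknown" if domain_exists else "nxdomain"
```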

The strong setting C<discardable> is useful for domains which are known
to always sign their mail and to always send it directly to recipients
(not to mailing lists), and are frequent targets of phishing attempts,
such as financial institutions. The C<discardable> setting is also
appropriate for domains which are known never to send any mail.

When a message does not contain a valid signature by the author's domain
(the domain in a From header field), the signing practices pertaining
to the author's domain determine which of the following rules fires and
contributes its score: DKIM_ADSP_NXDOMAIN, DKIM_ADSP_ALL, DKIM_ADSP_DISCARD,
DKIM_ADSP_CUSTOM_LOW, DKIM_ADSP_CUSTOM_MED, DKIM_ADSP_CUSTOM_HIGH. Not more
than one of these rules can fire. The last three can only result from a
'signing_practices' as given in a C<adsp_override> directive (not from a
DNS lookup), and can serve as a convenient means of providing a different
score if scores assigned to DKIM_ADSP_ALL or DKIM_ADSP_DISCARD are not
considered suitable for some domains.
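
The relation between signing practices and rules can be pictured as a simple
lookup. The Python sketch below only illustrates the description above; the
function name adsp_rule is hypothetical and this is not how the Perl plugin
is structured internally.

```python
from typing import Optional

# Signing practices -> rule name, following the rule list above.
# 'unknown' has no entry, so it never makes a rule fire.
RULE_FOR_PRACTICE = {
    "nxdomain":    "DKIM_ADSP_NXDOMAIN",
    "all":         "DKIM_ADSP_ALL",
    "discardable": "DKIM_ADSP_DISCARD",
    "custom_low":  "DKIM_ADSP_CUSTOM_LOW",
    "custom_med":  "DKIM_ADSP_CUSTOM_MED",
    "custom_high": "DKIM_ADSP_CUSTOM_HIGH",
}

def adsp_rule(practice: str, has_valid_author_sig: bool) -> Optional[str]:
    """Return the single DKIM_ADSP_* rule that would fire, or None."""
    if has_valid_author_sig:
        # ADSP is only consulted when there is no valid author signature.
        return None
    return RULE_FOR_PRACTICE.get(practice)
```

Because the lookup returns at most one name, the property that not more than
one of these rules can fire falls out directly.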

Example:

  adsp_override *.mydomain.example.com   discardable
  adsp_override *.neversends.example.com discardable

  adsp_override ebay.com       discardable
  adsp_override *.ebay.com     discardable
  adsp_override ebay.co.uk     discardable
  adsp_override *.ebay.co.uk   discardable
  adsp_override paypal.com     discardable
  adsp_override *.paypal.com   discardable
  adsp_override amazon.com     discardable
  adsp_override alert.bankofamerica.com discardable

  adsp_override google.com     all
  adsp_override gmail.com      all
  adsp_override googlemail.com all
  adsp_override yahoo.com      all
  adsp_override yahoo.com.au   custom_low
  adsp_override yahoo.se       custom_low
  adsp_override youtube.com    custom_high
  adsp_override skype.net      custom_high

  adsp_override junkmailerkbw0rr.com nxdomain
  adsp_override junkmailerd2hlsg.com nxdomain

  # effectively disables ADSP network DNS lookups for all other domains:
  adsp_override *              unknown

  score DKIM_ADSP_ALL          1.5
  score DKIM_ADSP_DISCARD     25
  score DKIM_ADSP_NXDOMAIN     3

  score DKIM_ADSP_CUSTOM_LOW   1
  score DKIM_ADSP_CUSTOM_MED   3.5
  score DKIM_ADSP_CUSTOM_HIGH  8
Comment 1 Mark Martinec 2009-03-19 11:54:07 UTC
> DKIM plugin support for domain signing practices (ADSP),
> with overrides. Implements an 'adsp_override' config file directive,
> adds eval:check_dkim_adsp() used for rules DKIM_ADSP_*.  Also
> allows this plugin to re-use Mail::DKIM verification results
> if made available by a caller or elsewhere.
Sending        lib/Mail/SpamAssassin/Plugin/DKIM.pm
Transmitting file data .
Committed revision 756131.
Comment 2 Mark Martinec 2009-03-19 12:00:00 UTC
Btw, I now have the following in my local.cf. Scores subject to change.
The __VIA_ML rule below ('mail came through a mailing list') is from
my rulesrc/sandbox/mmartinec/25_yg.cf . The rest of the rules in 25_yg.cf
can now go away, replaced by a cleaner 'adsp_override'.



header DKIM_ADSP_NXDOMAIN    eval:check_dkim_adsp('N')
header DKIM_ADSP_ALL         eval:check_dkim_adsp('A')
header DKIM_ADSP_DISCARD     eval:check_dkim_adsp('D')
header DKIM_ADSP_CUSTOM_LOW  eval:check_dkim_adsp('1')
header DKIM_ADSP_CUSTOM_MED  eval:check_dkim_adsp('2')
header DKIM_ADSP_CUSTOM_HIGH eval:check_dkim_adsp('3')

adsp_override ebay.com       discardable
adsp_override *.ebay.com     discardable
adsp_override ebay.at        discardable
adsp_override ebay.be        discardable
adsp_override ebay.ca        discardable
adsp_override ebay.ch        discardable
adsp_override ebay.de        discardable
adsp_override ebay.ee        discardable
adsp_override ebay.es        discardable
adsp_override ebay.fr        discardable
adsp_override ebay.hu        discardable
adsp_override ebay.ie        discardable
adsp_override ebay.in        discardable
adsp_override ebay.it        discardable
adsp_override ebay.nl        discardable
adsp_override ebay.ph        discardable
adsp_override ebay.pl        discardable
adsp_override ebay.pt        discardable
adsp_override ebay.se        discardable
adsp_override ebay.co.kr     discardable
adsp_override ebay.co.uk     discardable
adsp_override ebay.com.au    discardable
adsp_override ebay.com.cn    discardable
adsp_override ebay.com.hk    discardable
adsp_override ebay.com.mx    discardable
adsp_override ebay.com.my    discardable
adsp_override ebay.com.sg    discardable

adsp_override paypal.com     discardable
adsp_override *.paypal.com   discardable
adsp_override paypal.co.uk   discardable
adsp_override amazon.com     discardable
adsp_override alert.bankofamerica.com discardable

adsp_override google.com     all
adsp_override gmail.com      all
adsp_override googlemail.com all

adsp_override yahoo.com      custom_low
adsp_override yahoo.com.ar   custom_low
adsp_override yahoo.com.au   custom_low
adsp_override yahoo.com.br   custom_low
adsp_override yahoo.com.cn   custom_low
adsp_override yahoo.com.hk   custom_low
adsp_override yahoo.com.mx   custom_low
adsp_override yahoo.com.my   custom_low
adsp_override yahoo.com.ph   custom_low
adsp_override yahoo.com.sg   custom_low
adsp_override yahoo.com.tw   custom_low
adsp_override yahoo.co.id    custom_low
adsp_override yahoo.co.in    custom_low
adsp_override yahoo.co.jp    custom_low
adsp_override yahoo.co.nz    custom_low
adsp_override yahoo.co.th    custom_low
adsp_override yahoo.co.uk    custom_low
adsp_override yahoo.ca       custom_low
adsp_override yahoo.cn       custom_low
adsp_override yahoo.de       custom_low
adsp_override yahoo.dk       custom_low
adsp_override yahoo.es       custom_low
adsp_override yahoo.fr       custom_low
adsp_override yahoo.gr       custom_low
adsp_override yahoo.ie       custom_low
adsp_override yahoo.it       custom_low
adsp_override yahoo.no       custom_low
adsp_override yahoo.pl       custom_low
adsp_override yahoo.se       custom_low

adsp_override youtube.com    custom_high
adsp_override skype.net      custom_high
adsp_override ag.com         custom_low
adsp_override americangreetings.com discardable

adsp_override junkmailerkbw0rr.com nxdomain
adsp_override junkmailerd2hlsg.com nxdomain

# effectively disables ADSP network DNS lookups for all other domains
# adsp_override *            unknown

score DKIM_ADSP_ALL          1.5
score DKIM_ADSP_CUSTOM_LOW   0.01
score DKIM_ADSP_CUSTOM_MED   0.01
score DKIM_ADSP_CUSTOM_HIGH  0.01
score DKIM_ADSP_DISCARD      8

meta  L_ADSP_CUSTOM_LOW   DKIM_ADSP_CUSTOM_LOW  && !__VIA_ML
score L_ADSP_CUSTOM_LOW   1.5
meta  L_ADSP_CUSTOM_MED   DKIM_ADSP_CUSTOM_MED  && !__VIA_ML
score L_ADSP_CUSTOM_MED   3.5
meta  L_ADSP_CUSTOM_HIGH  DKIM_ADSP_CUSTOM_HIGH && !__VIA_ML
score L_ADSP_CUSTOM_HIGH  6
Comment 3 Justin Mason 2009-03-19 12:54:31 UTC
have you got any mass-check results from these?  

if this allows us to safely verify ebay.* and paypal.*, that'd be excellent ;)
Comment 4 Mark Martinec 2009-03-19 13:37:45 UTC
> have you got any mass-check results from these?

No mass-check, but I do have large logs, listing all rule hits of all
our processed mail, and my 25_yg.cf was in use, so I do have all the
necessary data for all domains mentioned there, i.e. gmail.com, yahoo,
ebay and paypal, along with their national variants. The new mechanism
is functionally equivalent to rules in 25_yg.cf for these domains.
I'll prepare some stats.

> if this allows us to safely verify ebay.* and paypal.*, that'd be excellent ;)

Yes, that's what I had in mind. I know this has been discussed before
(can someone find a reference?). It is probably equivalent to the
arrangement PayPal supposedly did with Yahoo. The initial list of
adsp_override directives could be included in the distribution,
just like def_whitelist_* .
 

Comment 5 Mark Martinec 2009-03-20 12:03:00 UTC
> No mass-check, but I do have large logs, listing all rule hits of all
> our processed mail, and my 25_yg.cf was in use, so I do have all the
> necessary data for domains mentioned there [...] I'll prepare some stats

I converted logs of the past 12 weeks to a mass-check format. As classification
was automatic, I ditched everything with scores between 2 and 6.2, to reduce
the likelihood of false classification, then I declared low-score mail as ham
and high-score mail as spam. The hit-frequencies output has the following to
say on rules dealing with the domains in question:

OVERALL%   SPAM%     HAM%     S/O    RANK  SCORE  NAME
4058896  3465768   593128    0.854   0.00   0.00  (all messages)

just rules checking on a DKIM/DK signature:
 111049   110361      688    0.965   0.68   0.00  NOTVALID_YAHOO
    363      363        0    1.000   0.61   0.00  NOTVALID_PAYPAL
    274      274        0    1.000   0.60   0.00  NOTVALID_EBAY
  41673    39778     1895    0.782   0.55   0.00  NOTVALID_GMAIL

remaining related rules:
  18519    18519        0    1.000   0.92   0.00  MSGID_YAHOO_CAPS
  15476    15476        0    1.000   0.91   0.00  FORGED_MSGID_YAHOO
  39942    39795      147    0.979   0.79   0.00  REPTO_QUOTE_YAHOO
  61157    60927      230    0.978   0.76   0.00  FORGED_YAHOO_RCVD
    881      879        2    0.987   0.69   0.00  SARE_EBAY_SPOOF_NAME
    304      304        0    1.000   0.60   0.00  SARE_FORGED_PAYPAL_C
    234      234        0    1.000   0.58   0.00  SARE_FORGED_PAYPAL
     32       32        0    1.000   0.51   0.00  ZMIde_EBAYJOBSURI
     21       21        0    1.000   0.50   0.00  SARE_FORGED_EBAY

The NOTVALID_PAYPAL and NOTVALID_EBAY rules check all mail claiming to be
from these domains, while NOTVALID_YAHOO and NOTVALID_GMAIL ignore mail
which appears to be coming through a mailing list (see 25_yg.cf for details).

So the NOTVALID_PAYPAL+NOTVALID_EBAY ditched 637 scams and did not hit on
any ham. I double-checked the figure on the entire corpus (including mail
with scores between 2 and 6.2), so I know I didn't miss any case due to
auto-classification. Of course many of these scams would score high enough
on other rules too, so checking on DKIM signature in case of eBay and PayPal
is just an additional safety fuse.
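
For readers checking the numbers: the S/O values of the rule rows in the
table above are consistent with dividing the spam hit rate by the sum of the
spam and ham hit rates. The Python sketch below reproduces that arithmetic
from the counts shown; it is a recalculation under that assumption, not the
hit-frequencies script itself, and the function name so_ratio is made up.

```python
def so_ratio(spam_hits: int, ham_hits: int,
             spam_total: int, ham_total: int) -> float:
    """S/O as spam hit rate over (spam hit rate + ham hit rate).

    Undefined (division by zero) for a rule with no hits at all.
    """
    spam_rate = spam_hits / spam_total
    ham_rate = ham_hits / ham_total
    return spam_rate / (spam_rate + ham_rate)

# Corpus totals from the "(all messages)" row.
SPAM_TOTAL = 3465768
HAM_TOTAL = 593128

yahoo = so_ratio(110361, 688, SPAM_TOTAL, HAM_TOTAL)    # NOTVALID_YAHOO
gmail = so_ratio(39778, 1895, SPAM_TOTAL, HAM_TOTAL)    # NOTVALID_GMAIL
paypal = so_ratio(363, 0, SPAM_TOTAL, HAM_TOTAL)        # NOTVALID_PAYPAL

# Combined PayPal and eBay spam hits quoted in the text.
paypal_plus_ebay = 363 + 274
```

Rounded to three places these come out as 0.965, 0.782 and 1.000, matching
the table, and the PayPal and eBay spam hits add up to the 637 quoted above.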

The NOTVALID_YAHOO and NOTVALID_GMAIL are another story, they hit on many ham
messages too. Nevertheless, I find it valuable to assign 2.5 score points
to each.

According to my stats in Bug 5891, the average score of signed vs. unsigned
mail from yahoo and gmail is very different. In other words, spammers claiming
to be from yahoo or gmail rarely post their junk through the domain's server,
while many of the regular users do:

  yahoo.com  not signed,  avg.score= 14.8
  yahoo.com  valid sign., avg.score= -0.7

  gmail.com  not signed,  avg.score=  2.9
  gmail.com  valid sign., avg.score= -3.3
Comment 6 Justin Mason 2009-03-21 14:56:38 UTC
(In reply to comment #5)
> just rules checking on a DKIM/DK signature:
>  111049   110361      688    0.965   0.68   0.00  NOTVALID_YAHOO
>     363      363        0    1.000   0.61   0.00  NOTVALID_PAYPAL
>     274      274        0    1.000   0.60   0.00  NOTVALID_EBAY
>   41673    39778     1895    0.782   0.55   0.00  NOTVALID_GMAIL

 NOTVALID_PAYPAL and  NOTVALID_EBAY look fantastic!  the other two I guess would make good meta-rule fodder at least.

+1 for checking those rules in (if you like).
Comment 7 Mark Martinec 2009-03-23 08:52:33 UTC
Some hardening - preventing ADSP false positives in case of DNS failures:

| DKIM plugin: prevent ADSP rules from firing if DKIM signatures could
| not be verified due to DNS resolver not being available, or Mail::DKIM
| modules not available, or temporary DNS failures when retrieving a
| public key from the author's domain (this one still needs a better
| support from Mail::DKIM, I'll contact the author). Plus some rather
| cosmetic changes.
Sending lib/Mail/SpamAssassin/Plugin/DKIM.pm
Committed revision 757417.
Comment 8 Mark Martinec 2009-04-03 08:45:51 UTC
  DKIM plugin: do not trigger ADSP rules when there is a known
  likely reason of author's domain signature failure, such as a
  DNS problem or a truncated message being passed to SpamAssassin.
Sending        lib/Mail/SpamAssassin/Plugin/DKIM.pm
Committed revision 761708.

I added the following to the POD:

As a precaution against firing DKIM_ADSP_* rules when there is a known local
reason for a signature verification failure, the domain's ADSP is considered
'unknown' when DNS lookups are disabled or a DNS lookup encountered a temporary
problem on fetching a public key from the author's domain. Similarly, ADSP
is considered 'unknown' when this plugin did its own signature verification
(signatures were not passed to SA by a caller) and a metarule __TRUNCATED was
triggered, indicating the caller intentionally passed a truncated message to
SpamAssassin, which was a likely reason for a signature verification failure.
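
The precaution described in this POD paragraph boils down to a few
conditions. The Python sketch below restates them; the function and
parameter names are hypothetical and the plugin itself is Perl, so this is
only a reading aid.

```python
def adsp_for_scoring(practice: str,
                     dns_available: bool,
                     key_fetch_tempfail: bool,
                     sigs_from_caller: bool,
                     truncated: bool) -> str:
    """Downgrade ADSP to 'unknown' when a failure has a known local cause."""
    if not dns_available or key_fetch_tempfail:
        # DNS lookups disabled, or a temporary failure fetching the
        # author domain's public key.
        return "unknown"
    if not sigs_from_caller and truncated:
        # The plugin verified signatures itself on a truncated message
        # (__TRUNCATED fired), so a failure is not trustworthy.
        return "unknown"
    return practice
```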


And hereby I declare the name of a rule '__TRUNCATED' as 'taken'.

If a caller of spamc or spamassassin or any other software encounters a long
message (e.g. beyond -s max_size) but wishes to pass at least some part of
it to SpamAssassin (spam messages are getting larger!), it should ensure
that a __TRUNCATED rule gets a hit, so that the DKIM plugin takes a signature
failure and a subsequent ADSP enforcement lightly. One possibility is to
prepend some dedicated message header and add a rule like:

header __TRUNCATED X-Amavis-MessageSize =~ m{\A[^\n]*TRUNCATED}m

Another possibility is when $spamassassin_obj->parse is called directly
(such as by spamd), it can pass a rule hit of a __TRUNCATED rule
through the new %suppl_attrib argument (see Bug 6088).
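
To make the first possibility concrete, here is a Python sketch of a caller
that truncates an oversized message and prepends a marker header for the
__TRUNCATED rule shown above to match. The header name is taken from that
rule; the exact header value and the function name are invented for
illustration and are not what any particular caller actually emits.

```python
def truncate_for_scan(raw: bytes, max_size: int,
                      header_name: str = "X-Amavis-MessageSize") -> bytes:
    """Return the part of a message to hand to SpamAssassin.

    Messages within max_size pass through unchanged; larger ones are cut
    and get a marker header prepended so that a __TRUNCATED rule can hit.
    """
    if len(raw) <= max_size:
        return raw
    # Hypothetical value format; the rule only needs the word TRUNCATED
    # to appear on the first line of the header field body.
    marker = "%s: %d, TRUNCATED to %d\r\n" % (header_name, len(raw), max_size)
    return marker.encode("ascii") + raw[:max_size]
```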

Comment 9 Mark Martinec 2009-04-09 10:07:03 UTC
(In reply to comment #6)
> NOTVALID_PAYPAL and  NOTVALID_EBAY look fantastic!
> the other two I guess would make good meta-rule fodder at least.
> 
> +1 for checking those rules in (if you like).

One step closer to the above goal:

  DKIM plugin: add eval rule __DKIM_DEPENDABLE, which can be
  consulted to prevent false positives on large but truncated
  messages with poor man's implementation of ADSP by hand-crafted
  rules.
Sending lib/Mail/SpamAssassin/Plugin/DKIM.pm
Committed revision 763733.


From the POD:

 full     __DKIM_DEPENDABLE  eval:check_dkim_dependable()
 describe __DKIM_DEPENDABLE  A validation failure not attributable to truncation

The __DKIM_DEPENDABLE eval rule deserves an explanation. The rule yields true
when signatures are supplied by a caller, OR ELSE when signatures are obtained
by this plugin AND either there are no signatures OR a rule __TRUNCATED was
false. In other words: __DKIM_DEPENDABLE is true when failed signatures cannot
be attributed to message truncation when feeding a message to SpamAssassin.
It can be consulted to prevent false positives on large but truncated messages
with a poor man's implementation of ADSP using hand-crafted rules.
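
Written out as a boolean expression, the condition reads as in the Python
sketch below. The names are hypothetical and the real eval rule lives in the
Perl plugin; this only restates the sentence from the POD.

```python
def dkim_dependable(sigs_from_caller: bool,
                    signature_count: int,
                    truncated: bool) -> bool:
    """True when a failed signature cannot be blamed on truncation."""
    if sigs_from_caller:
        # The caller verified the signatures, presumably on the full message.
        return True
    # The plugin verified by itself: dependable if there was nothing to
    # verify, or if the message was not truncated (__TRUNCATED false).
    return signature_count == 0 or not truncated
```

A hand-crafted rule of the NOTVALID_* kind could then be combined with
__DKIM_DEPENDABLE in a meta rule so that it only counts when this is true.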
Comment 10 Mark Martinec 2009-08-04 18:03:24 UTC
Closing, works fine.