Bug 5684 - patch to SPF.pm to add detection of "+all" in spf records
patch to SPF.pm to add detection of "+all" in spf records
Status: NEW
Product: Spamassassin
Classification: Unclassified
Component: Plugins
3.2.3
PC Linux
: P5 enhancement
: Undefined
Assigned To: SpamAssassin Developer Mailing List
:
Depends on:
Blocks:
  Show dependency tree
 
Reported: 2007-10-13 15:00 UTC by Andrew Kinnard
Modified: 2010-11-17 11:42 UTC (History)
4 users (show)



Attachment Type Modified Status Actions Submitter/CLA Status
The patch (beta qual) on SA 3.2.3 ver of SPF.pm patch None Andrew Kinnard [NoCLA]
The ruleset for the patch text/plain None Andrew Kinnard [NoCLA]
Patch update for minor bug fixes patch None Andrew Kinnard [NoCLA]
Alternative to proposed patch for SA - a MIMEDefang solution. text/plain None D. Stussy [NoCLA]

Note You need to log in before you can comment on or make changes to this bug.
Description Andrew Kinnard 2007-10-13 15:00:11 UTC
A patch that adds two new tests (one each for 'helo' and 'mail from:' lookups)
to detect "+all" in the unparsed spf v1 record.  Patch tested only on Linux and
Perl 5.8.8.
Comment 1 Andrew Kinnard 2007-10-13 15:03:02 UTC
Created attachment 4157 [details]
The patch (beta qual) on SA 3.2.3 ver of SPF.pm

Patch introduces no changes to processing unless the associated rules are
active; all existing optimizations (like header checks for trusted relay's SPF
results) are maintained.  Patch preserves existing logic to choose between
Mail::SPF and Mail::SPF::Query (with a preference for the former).
Comment 2 Andrew Kinnard 2007-10-13 15:04:20 UTC
Created attachment 4158 [details]
The ruleset for the patch
Comment 3 Andrew Kinnard 2007-10-15 11:52:25 UTC
(In reply to comment #0)
> A patch that adds two new tests (one each for 'helo' and 'mail from:' lookups)
> to detect "+all" in the unparsed spf record.  Patch tested only on Linux and
> Perl 5.8.8.

Comment 4 Andrew Kinnard 2007-10-17 14:05:10 UTC
(In reply to comment #1)
An update to the patch for minor bug fixes.
Comment 5 Andrew Kinnard 2007-10-17 14:07:56 UTC
Created attachment 4165 [details]
Patch update for minor bug fixes

Fixes error r/t undefined var in debug statement.  Fixes debug error with
$spf_record when $spf_record_object is undefined (and Mail::SPF is present).
Comment 6 Andrew Kinnard 2007-10-20 13:50:41 UTC
I wish to thank Gareth Blades of the Linguaphone Group for his assistance in
coding the Perl for this patch.
Comment 7 Matt Kettler 2007-10-20 16:38:49 UTC
Trivial workaround to continue to have a "+all" record without matching your rule:

ip4:0.0.0.0/1 ip4:128.0.0.0/1

More to the point, is it really a problem if someone does have a +all record? It
doesn't really help them at all from a SpamAssassin perspective..

Also, have you seen any +all records being used in the wild? Have they been
causing you trouble somehow? Or is this just a pre-emptive patch?
Comment 8 Theo Van Dinter 2007-10-20 17:04:42 UTC
I also have to agree that I'm not really sure what the point is.  For something
like this though, I'd be tempted to have the plugin make a psuedo header w/ the
SPF text.  Then normal rules could be run against it w/out having to modify the
plugin for every little thing.

:)
Comment 9 Matt Kettler 2007-10-20 19:21:36 UTC
Agreed, adding a pseudo-header would allow rule writers to do nearly anything
they wanted.

Of course, you'd probably need plugin code to detect all possible variants of
the ip4 ranges bypass. A plugin could bit-mask all the ip-4 blocks together and
see if it added up to 0/0. But that seems like a lot of work for something that
isn't likely to cause a problem. (unless someone does something foolish like
assign -5.0 to SPF_PASS.. but if it hurts when you shoot yourself in the foot,
don't get steel toed boots, stop shooting.)

Comment 10 Andrew Kinnard 2007-10-23 13:38:19 UTC
Matt, you and Theo are correct in that this is easily worked around.  I was
asked to work up something to catch this subversion of spf's intent, and this is
what I came up with.  At the time, about 3 weeks ago, Gareth was getting a lot
of spam from domains with "+all" records.  The intent was not to compensate for
reckless negative scoring for of spf pass but to use the "+all" record as a
clear indication that the domain owner intended to spam from that domain (using
any machine).  In the first few days of testing, I was astounded at the number
of domains it found.  Since then, the flow of 

I'd LOVE to see SPF.pm create a pseudo header containing the spf record (and
sender-id record), but that was way beyond my nascent abilities in Perl; so, I
didn't even attempt it.
Comment 11 John Hardin 2010-07-05 15:34:55 UTC
Actual example of abuse in the wild:

$ dig txt nationwide.co.uk

 ;; ANSWER SECTION:
nationwide.co.uk.       5648    IN      TXT     "v=spf1 mx a:mailhost.nationet.com a:mailhost2.nationet.com include:messagelabs.com ~all"

$ dig txt messagelabs.com

 ;; ANSWER SECTION:
messagelabs.com.        84771   IN      TXT     "v=spf1 +all"


As long as this is the case SPF cannot be used to filter Nationwide Bank phishing, and a whitelist_auth/whitelist_spf against nationwide.co.uk doesn't do what you want.

It would probably be a good idea to generate a lint warning on whitelist_auth or whitelist_spf where +all is present.

Do we want to revisit this for 3.3.x/3.4.x?
Comment 12 Karsten Bräckelmann 2010-07-05 16:04:22 UTC
(In reply to comment #11)
> Actual example of abuse in the wild:

Not abuse. Mis-configuration.

> As long as this is the case SPF cannot be used to filter Nationwide Bank
> phishing, and a whitelist_auth/whitelist_spf against nationwide.co.uk doesn't
> do what you want.

Nationwide and/or Messagelabs need to fix it. Not us.

> It would probably be a good idea to generate a lint warning on whitelist_auth
> or whitelist_spf where +all is present.

That's not what lint'ing is for...

Anyway, glimpsing over the old comments, I too agree that adding the SPF info as a Template Tag (and thus possibly header) would fix this issue quite nicely. Which doesn't make it any less of a mis-configuration on their side...
Comment 13 Kevin A. McGrail 2010-07-05 16:09:15 UTC
> Not abuse. Mis-configuration.

I view this as a misconfiguration, as well.  It's nice to point out errors when we see them, but this would be a low priority from my view.

KAM
Comment 14 Karsten Bräckelmann 2010-07-05 16:19:04 UTC
(In reply to comment #12)
> Mis-configuration.
> 
> > As long as this is the case SPF cannot be used to filter Nationwide Bank
> > phishing, and a whitelist_auth/whitelist_spf against nationwide.co.uk doesn't
> > do what you want.
> 
> Nationwide and/or Messagelabs need to fix it. Not us.

Confirmed on the dev list by Matt Sergeant. Nationwide is supposed to include spf.messagelabs.com, not messagelabs.com.

A simple mis-configuration.
Comment 15 D. Stussy 2010-11-15 16:27:59 UTC
I am of the opinion that this is a poor approach.  The condition should not be whether the SPF record contains a "+all" but that such a mechanism is the reason why a "pass" result is generated.

Consider:  "v=spf1 a mx all".

Clearly, if the "a" or "mx" mechanism is the reason for the passing result, there is nothing "funny" going on here, despite having a "+all" in the record.

Also, SPF is better done at the SMTP "MAIL FROM" state, not the "DATA"/eom state where the message has already been transferred.  SPF failures should be rejected earlier in SMTP.  However, SPF "none"* vs. "neutral" vs. "softfail" are still results for SA scoring.  (* - see bug 6490 for "none" being available to SA.)

RE - Comment #7:  I agree, and I also disallow by policy any passing result that came from a CIDR'ed mechanism where the mask is less than 8.  I chose "/8" because: 1)  It is the largest allocation to any single organization, and 2) it divides the IPv4 address space into 256 blocks - and for a spammer to cover enough of it to be worthwhile, he would violate SPF's mechanism limit and/or the DNS size limit on SPF or TXT RRs, even with nested included SPF records.

I call SA from MIMEDefang, so I perform my primary SPF check in MD's hook filter_sender() routine.  When it calls Mail::SPF, one of the returned fields is an explanation of the result.  When pass results, the explanation is usually "Mechanism '*' matches", and I parse the "*" for a CIDR or "all". Such "wide" CIDR-matching pass results are changed to the "policy" result per RFC 5451.

Such action makes this patch completely unnecessary, plus it targets the wrong criterion.
Comment 16 John Wilcock 2010-11-16 02:40:49 UTC
(In reply to comment #15)
> I am of the opinion that this is a poor approach.  The condition should not be
> whether the SPF record contains a "+all" but that such a mechanism is the
> reason why a "pass" result is generated.
> 
> Consider:  "v=spf1 a mx all".
> 
> Clearly, if the "a" or "mx" mechanism is the reason for the passing result,
> there is nothing "funny" going on here, despite having a "+all" in the record.

The mere fact of having a +all, whether or not it is the mechanism used to pass SPF, could be a spam sign anyway. I could imagine it being an indicator similar in impact to, say, FREEMAIL_FROM, and useful to SA at least as a component of metarules. But this is just a postulate; I've no idea whether current usage bears this out. 

Passing SPF only by virtue of a +all (or similarly all-inclusive pattern) could be another spam sign; I'd expect, again without having checked the actual data, that it would be a more discriminant spam sign. 

Adding the SPF data in a pseudo-header would be a nice first step towards collecting some real world statistics here...
Comment 17 D. Stussy 2010-11-16 13:49:29 UTC
Having "+all" can also be a sign of 1) A lazy admin., or 2) one who really doesn't care*.  Regardless, SPF is not a spam solution but a forgery solution.  Granted, much spam is also forged, but so can non-spam be forged.  Deferring action to the data phase where SA gets the message is too late in the SMTP transaction.

* - Especially if another anti-forgery method (DK/DKIM or PGP signatures) is always deployed in messages from that source, there should be no such conclusion, implied or not, that the source may be "spammy" in nature just because it has "+all" in its SPF record.  However, I personally would deny such a result (pass -> "+all") anyway at SMTP "MAIL FROM", thus such would never reach SA to be evaluated.

As far as statistics go as to how many sites have SPF records with "+all", that need not be done with a mail server.  In fact, some work has already been done - see http://spf-all.com/stats.html for details.  The survey at that web site identifies roughly a 10% infiltration of SPF across all domains sampled, and within that 10%, 1/100 have "+all" terminated records (26k of 2.6M out of 25.6M domains sampled).  A larger number of domains have SPF record errors than "+all".
Comment 18 Andrew Kinnard 2010-11-16 14:23:36 UTC
(In reply to comment #17)
Brother, it seems you have a disaffinity for processing SPF records with SA (more than a functional [vs philosophical] objection to the patch's intent).  I think that is different from having a relevant objection to the patch.

Does is break anything or change existing functionality?...No.  Does it force you to process SPF records with SA?...No.  Does it harshly penalize domains who unintentionally set up SPF with a "+all"?...No, not unless you score the rule with reckless abandon.

Does it address spam indirectly (through a mild perversion of other technology)?...Yes.  So do lots of standard SA rules.  Can it be bypassed (with a clever transmutation of "+all")?...Yes.  Does that really matter?...No, because during the time we were seeing this spam, I never had ANY at ANY site that came through with a "+all" written as a CIDR in the SPF record.  It's simple, effective, and addressed the problem at the time.  So, the patch did what it was supposed to.

You can object based on that, but you'd have to also take up arms over rules that look at rDNS and FCrDNS as well as a host of rules that don't examine anti-spam-specific technologies or filter based on email content.
> Having "+all" can also be a sign of 1) A lazy admin., or 2) one who really
> doesn't care*.  Regardless, SPF is not a spam solution but a forgery solution.
SA authors would disagree with you about the relevance of SPF to "spamminess" (or the SPF rules would not be in SA).
> Granted, much spam is also forged, but so can non-spam be forged.  Deferring
> action to the data phase where SA gets the message is too late in the SMTP
> transaction.
...according to your sensibilities.  If I want to do a minor manipulation of SA scores based on a used-in-the-wild misconfiguration that was aimed at getting more spam delivered, it's going to have to be after mail is spooled.  Otherwise the decision has to be binary.  This is much the same quandry one faces when implementing RBLs: Do you let your mta handle it (in a binary fashion) at the connect phase, or do you let SA apply a more moderated RBL effect on judging a spooled email's spamminess?
> * - Especially if another anti-forgery method (DK/DKIM or PGP signatures) is
> always deployed in messages from that source, there should be no such
> conclusion, implied or not, that the source may be "spammy" in nature just
> because it has "+all" in its SPF record.
True that it would be self-defeating for a spammer to identify him- or her-self by PGP or DKIM signing spam.  However, I'm willing to bet that there are VERY few domains implementing PGP and/or DKIM that have a misconfigured SPF record; the expertise required to implement either of the two anti-forgery schemes far outweighs that required to generate a decent SPF record.  In any case, the default score on this rule is very small, and should not push legitimate email into a false positive score.
> However, I personally would deny such
> a result (pass -> "+all") anyway at SMTP "MAIL FROM", thus such would never
> reach SA to be evaluated.
...which is a binary method of responding to "+all" SPF records; you are denying email delivery based on the outcome of this one test -- it's all or nothing.  You see it as superior, and I do not.  My patch just provides another option, one you would not be forced to use even if the code were merged.
> As far as statistics go as to how many sites have SPF records with "+all", that
> need not be done with a mail server.  In fact, some work has already been done
> - see http://spf-all.com/stats.html for details.  The survey at that web site
> identifies roughly a 10% infiltration of SPF across all domains sampled, and
> within that 10%, 1/100 have "+all" terminated records (26k of 2.6M out of 25.6M
> domains sampled).  A larger number of domains have SPF record errors than
> "+all".
...and, in the settings for which this patch was created, we felt that penalizing these domains just a little was an acceptible price for that same little bit of extra protection from those domains whose emails were just eeking through under our spam threshold scores.

(In reply to comment #16)
> (In reply to comment #15)
> > Consider:  "v=spf1 a mx all".
> > 
> > Clearly, if the "a" or "mx" mechanism is the reason for the passing result,
> > there is nothing "funny" going on here, despite having a "+all" in the record.
I don't consider that clear at all.
> The mere fact of having a +all, whether or not it is the mechanism used to pass
> SPF, could be a spam sign anyway. I could imagine it being an indicator similar
> in impact to, say, FREEMAIL_FROM, and useful to SA at least as a component of
> metarules. But this is just a postulate; I've no idea whether current usage
> bears this out. 
> Passing SPF only by virtue of a +all (or similarly all-inclusive pattern) could
> be another spam sign; I'd expect, again without having checked the actual data,
> that it would be a more discriminant spam sign. 
> Adding the SPF data in a pseudo-header would be a nice first step towards
> collecting some real world statistics here...

Since SA rules don't function as binary switches, the capability introduced by this patch is totally optional; you can score this rule however you like, or not use the functionality at all.

This patch was submitted because others incouraged me to do so.  It was created in the context of me (a non-programmer with almost zero Perl experience) and one other SA user meeting in a forum and responding to a specific pattern of spam we were seeing at the time.  The senders were using SPF records including "+all" in order to allow their Botnets to deliver their email while earning a SPF "Pass" and the associated negative SA score for SPF rules.  We were just trying to "undo" the little "bump" they were giving themselves (and we felt the penalty on domains with admins lazy or incompetent enough to actually create an SPF record that is totally worthless was worth paying and unlikely to create false positives).
Comment 19 D. Stussy 2010-11-16 19:06:01 UTC
Created attachment 4824 [details]
Alternative to proposed patch for SA - a MIMEDefang solution.

This is a complete routine, but not a complete filter program.
Comment 20 D. Stussy 2010-11-16 19:54:30 UTC
Actually, no.  I let the following SPF results into SA for scoring:  pass, neutral, softfail, and none; the latter three results receiving non-zero scores.  Failures, errors, or passing SPF records based on "+all" or wide CIDR masks ("/0" ... "/7") are taken care of during the SMTP "MAIL FROM" state (via my MIMEDefang filter routine), thus never reaching SA to be scored.

SPF "pass" receives a negative SA score?  Interesting, but misguided.

If per SPF, I know that a message is forged and therefore will be rejected, why should I continue to process it?  I save computational power and bandwidth by making a decision as soon as possible to terminate the SMTP transaction with an error.  All the information for an SPF result is available at "MAIL FROM:", so why should I consider anything that changes the result at any later step?

With your patch, you could be scoring a "+all" when SPF returns failure or other states if such strange SPF strings direct such an outcome.  Does that make sense?  Should "+all" be a factor when another mechanism produces the result?  I say no - and that's why I look for "+all" (or a wide CIDR match) only when the SPF result is pass - because that is the result beneficial to spammers and their trickery.


Not all forged messages need be spam.  Practical jokes and revenge messages (usually by former lovers or other close relationships) are good examples of non-spammy forgeries.  SPF is an anti-forgery solution, not an anti-spam one.  Granted that much spam is also forged, thus yielding a good reason for SA inclusion, I note that about half of the people writing about SPF mischaracterize it.  Also, your "+all" check in SPF will undoubtedly be used as an override to the SPF "pass" result with a positive SA score so as to cause message rejection, so why defer that to the SMTP message eom state when such information is available earlier?

You are correct that I consider the forged state as binary in nature.  Why shouldn't it be?  The characteristic itself is binary, with basically a tri-state result set:  Yes (2 results), no (1 result), or undetermined (4 results, including errors).

Since you asked -- I consider these operations as:
 FCrDNS:  Binary.  I reject failures before the "HELO" state.
 DNSBLs:  I consider 4 of them as binary and reject immediately. I score the rest because they have some false positives and are thus not as reliable.


I don't disagree with the overall intent of your patch.  I clearly disagree with the implementation of that intent.  I also note that your solution may cause the score to be altered when "+all" is present but UNUSED in determining the SPF result and call that inappropriate to the "corrective measure" of adding this functionality in this manner.  My solution may be a bit more strict, but it also generates a proper RFC 5451 answer (the "Authentication-Results:" header), and only penalizes those SPF results where "+all" is the determining mechanism, plus the bonus of it's not fooled by CIDR equivalents.
Comment 21 Andrew Kinnard 2010-11-17 11:42:52 UTC
(In reply to comment #20)
> Actually, no.  I let the following SPF results into SA for scoring...passing SPF records based on "+all"...
> are taken care of during the SMTP "MAIL FROM" state...
Fair enough, but, as I said before, we didn't want to respond to +all SPF matches in a binary fashion; so, it goes to SA.
> SPF "pass" receives a negative SA score?  Interesting, but misguided.
> If per SPF, I know that a message is forged and therefore will be rejected, why
> should I continue to process it?...
You don't have to; you have the OPTION.  See below about "philosophical objections".
> With your patch, you could be scoring a "+all" when SPF returns failure or
> other states if such strange SPF strings direct such an outcome.
So what?  If there is a +all SPF record AND an SPF failure, I'm going to be highly suspicious of that email.  We wanted to consider email from any domain with a +all as slightly suspect (regardless of other factors).
> Does that
> make sense?  Should "+all" be a factor when another mechanism produces the
> result?  I say no...
Ok...how does that make the patch's functionality irrelevant for others (e.g., amavisd-new users)?
> - and that's why I look for "+all" (or a wide CIDR match)
> only when the SPF result is pass - because that is the result beneficial to
> spammers and their trickery.
Fantastic!  Thank you for posting your MIMEdefang solution; it IS, in fact, a technically superior solution for MIMEdefang users who have the intent it serves.
> Not all forged messages need be spam...SPF is an anti-forgery solution, not an anti-spam one...
This is a philosophical objection of the type I pointed out earlier.  It is really irrelevant to the discussion unless you want to debate (elsewhere) whether SA should consider information not originally intended to be used in the fight against UCE and spam.  I am sure you know that many SA rules are derived from statistical analysis of email traffic and may leverage information that is, from a philosophical standpoint, neutral (e.g., HTML formatting); so, you'd have a lot of arguing to do.  And, for the most part, those arguments have been had and solved with SA's ability to score rules as you wish.
>Also, your "+all" check in SPF will undoubtedly be used as
> an override to the SPF "pass" result with a positive SA score so as to cause
> message rejection...
That was exactly the intent.  As I stated earlier: That would only create false positives if the user scored this rule recklessly.
> ...so why defer that to the SMTP message eom state when such
> information is available earlier?
Because we didn't use MIMEdefang then, and this patch fit our purposes.
> You are correct that I consider the forged state as binary in nature.
You misread (or I miscommunicated) my statement: I intended to point out your disaffinity for processing this particular piece of SPF record information in SA and your affinity for treating email from MTAs passing SPF via +all as fit for outright rejection before spooling.
> Why shouldn't it be?  The characteristic itself is binary...
I agree, but we weren't trying to answer a question about forgery with the patch.  The question we were asking was: does this domain have a "+all" in their SPF record?...that's all.  We were not considering this in a determination about forgery but (as a very small factor) in a determination about spamminess.  You don't have to like that, but it doesn't violate any guidelines about SA rules, nor did it result in any real problems in practice.
> Since you asked -- I consider these operations as:
>  FCrDNS:  Binary.  I reject failures before the "HELO" state.
>  DNSBLs:  I consider 4 of them as binary and reject immediately. I score the
> rest because they have some false positives and are thus not as reliable.
I take a completely opposite approach.  I prefer all these handled more "gently" (than an MTA can do) by SA.  For most of my sites, placing these checks at the MTA would result in too many false positives in a very real, immediate and business-impacting way.  Have you ever argued with the "other" email admin about how they need to fix their PTR record(s)?  Have you ever explained to your client that all email from a major ISP got bounced yesterday because a here-to-for reliable RBL listed the provider's MTAs?  Yeah, that usually doesn't go well.  Business owners don't care WHY you had a false positive, just that there WAS one and that it disrupted business.
So, again, I feel like this is the crux of your objection (i.e., preference for pre- or post-spool processing of certain information).  That's not an objection to the patch, it is stating that you don't like its function handled post-spool and that you wouldn't use it.  Yes, the patch can be improved.  Yes, it is not "pure" in its use of certain tech not as originally intended.  However, it does what we intended it to do, and that intent is an acceptible goal within the framework of what SA already does.
> I don't disagree with the overall intent of your patch.
You seem to AEB your continued assertion that we shouldn't consider SPF information in a determination of spamminess and therefore (and for efficiency reasons) should be handled by SA (post-spool).
> I clearly disagree
> with the implementation of that intent.
The only functional criticism has been that it's less efficient than pre-spool SPF parsing and has a very small chance of contributing to a false positive from SA.  Both of these shortcomings we found to be acceptible, and the first is a design/engineering choice, not a shortcoming but a trade-off for desired functionality.
> I also note that your solution may
> cause the score to be altered when "+all" is present but UNUSED in determining
> the SPF result and call that inappropriate to the "corrective measure" of
> adding this functionality in this manner.
We found that not only acceptible but intended by design.
> My solution may be a bit more
> strict, but it also generates a proper RFC 5451 answer (the
> "Authentication-Results:" header), and only penalizes those SPF results where
> "+all" is the determining mechanism, plus the bonus of it's not fooled by CIDR
> equivalents.
...and in those ways it IS superior (if you use MIMEdefang and have a different intent than we did [i.e., a small positive score in SA for domains with an SPF record including "+all"...we WANTED that]).  Also, any rejection from SA would have been based on an accumulated score; so, the patch should NOT return any RFC 5451 answer (even if it could, because the mail wouldn't have been rejected for any ONE reason).  That RFC 5451 answer is appropriate for your solution but not this one.