Bug 5842 - Should SPF rules be "tflags net" or not?
Status: RESOLVED WONTFIX
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Libraries
Version: SVN Trunk (Latest Devel Version)
Hardware: Other other
Importance: P5 normal
Target Milestone: 3.4.1
Assignee: SpamAssassin Developer Mailing List
Reported: 2008-02-29 11:37 UTC by Daryl C. W. O'Shea
Modified: 2015-04-12 14:14 UTC
Description Daryl C. W. O'Shea 2008-02-29 11:37:09 UTC
Now that bug 5239 enables the SPF plugin to reuse results from Received-SPF
headers, should SPF rules be "tflags net" or not?  The rules can work without
network tests enabled.

For 3.2 we generated scores without "tflags net".  My r588457.

This was reverted in jm's r596095.

Now I'm not sure what to do.  We need to generate scores for the rules for set0
(so they shouldn't have tflags net) but those scores probably aren't going to be
very accurate since I don't think many of the mass-check contributors have
Received-SPF headers in their mail.
Comment 1 Justin Mason 2008-03-01 13:33:24 UTC
I say yes, they should be tflags net.  it's exactly analogous to the DNSBL
rules; they are network lookups, which can record their results in a header (the
reuse data), which can then be reused in set 0 if --reuse is specified.
IMO that makes sense for the SPF rules (although the recorded results are
put in Received-SPF:).

what do we need to do to record Received-SPF: btw?
Comment 2 Daryl C. W. O'Shea 2008-03-01 21:16:57 UTC
(In reply to comment #1)
> I say yes, they should be tflags net.  it's exactly analogous to the DNSBL
> rules; they are network lookups, which can record their results in a header (the
> reuse data), which can then be reused in set 0 if --reuse is specified.
> IMO that makes sense for the SPF rules (although the recorded results are
> put in Received-SPF:).

I'm not sure I'd consider them exactly analogous.  With a mail system that adds
Received-SPF headers before SA sees the message, SPF results are used (in all,
even non-net, scoresets) without SA *ever* doing the actual network checks.
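
A minimal sketch of the behaviour described above, written in Python for
illustration only.  It assumes the upstream mail system writes the SPF result
keyword as the first token of the Received-SPF header, as RFC 4408 describes.
The function name and the lookup callback are hypothetical and are not part of
SpamAssassin.

```python
import re
from email import message_from_string

# Result keywords defined for SPF in RFC 4408.
SPF_RESULTS = {"none", "neutral", "pass", "fail",
               "softfail", "temperror", "permerror"}


def spf_result(raw_message, net_enabled, lookup=None):
    """Return an SPF result keyword, or None if no result is available.

    Hypothetical helper: a result recorded in Received-SPF is reused
    whether or not network tests are enabled; a live lookup is attempted
    only when network tests are on and no usable header is present.
    """
    msg = message_from_string(raw_message)
    header = msg.get("Received-SPF")
    if header is not None:
        match = re.match(r"\s*([A-Za-z]+)", header)
        if match and match.group(1).lower() in SPF_RESULTS:
            # No network access needed: the upstream system did the check.
            return match.group(1).lower()
    if net_enabled and lookup is not None:
        # Stand-in for a real DNS-based SPF check.
        return lookup(raw_message)
    return None
```

With a Received-SPF header present, the sketch returns the recorded result
even when net_enabled is False, which is the case a "tflags net" flag would
suppress.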

Reuse of the data later in mass-checks, etc, isn't of concern.  I'm not aware of
--reuse applying to spamassassin or spamd.

> what do we need to do to record Received-SPF: btw?

Record Received-SPF?  We don't.  This would be added by a mail system that
processes the mail before SA sees it.  For example, see the headers of any mail
I've sent to an ASF mailing list.
Comment 3 Justin Mason 2008-03-03 01:53:44 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > I say yes, they should be tflags net.  it's exactly analogous to the DNSBL
> > rules; they are network lookups, which can record their results in a header (the
> > reuse data), which can then be reused in set 0 if --reuse is specified.
> > IMO that makes sense for the SPF rules (although the recorded results are
> > put in Received-SPF:).
> 
> I'm not sure I'd consider them exactly analogous.  With a mail system that adds
> Received-SPF headers before SA sees the message, SPF results are used (in all,
> even non-net, scoresets) without SA *ever* doing the actual network checks.

ok, not exactly analogous.  Just partially ;)

> Reuse of the data later in mass-checks, etc, isn't of concern.  I'm not aware of
> --reuse applying to spamassassin or spamd.

true.

> > what do we need to do to record Received-SPF: btw?
> 
> Record Received-SPF?  We don't.  This would be added by a mail system that
> processes the mail before SA sees it.  For example, see the headers of any mail
> I've sent to an ASF mailing list.

I meant, what do we need to do *in our MTA configurations* ;)

Now that I think about it though, I don't do any SPF lookups in any of my MTAs;
I leave that to SpamAssassin.  So maybe we should add support for recording it
(if there isn't a header already there).  Then we *can* use this header as a
way to #reuse SPF lookups, *and* we are more standards-compliant (since I think
that is dictated in the std).  That would help with generating scores for set0
at least.

In the meantime I think it'd be acceptable to make these a special case.
Generate their scores for set2/3, then simply copy those to set0/1.    if we
don't have the data, we can't trust the GA, but we should be able to trust that
the S/O ratios will be the same (since it's the same domains and the same
lookup logic!).

However that doesn't solve the core issue -- "tflags net".  we need to keep the
network lookup code running with "tflags net". This is necessary for the
--reuse support, so that it knows to set the rule score to 0 when attempting to
reuse hits.   However as you note, this means the code doesn't run in set0 at
all.

What I did in the past with the ROUND_THE_WORLD test was to split it into two
rules, ROUND_THE_WORLD and ROUND_THE_WORLD_LOCAL; the latter was set0, the
former set2. What about doing that with the SPF rules -- adding a duplicate
ruleset for SPF_PASS_LOCAL, SPF_NEUTRAL_LOCAL etc.?  (better names welcome of
course.) We could even move the current set to __SPF_PASS_LOCAL, __SPF_PASS_NET
and combine them into a new SPF_PASS meta rule.  This would be
--reuse-compatible, and clearly delineate set0 and set2 rules.
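
The split-and-combine idea above can be sketched as follows.  This is a Python
model for illustration, not SpamAssassin rule syntax: the sub-rule names come
from the proposal, while the function and its boolean inputs are hypothetical.
It assumes the "net" sub-rule is skipped whenever network tests are disabled.

```python
def evaluate_spf_pass(header_says_pass, dns_says_pass, net_enabled):
    """Model the proposed SPF_PASS meta rule (illustrative only)."""
    # __SPF_PASS_LOCAL: no "tflags net", so it runs in every scoreset and
    # relies only on a result already recorded in the message.
    local_hit = bool(header_says_pass)
    # __SPF_PASS_NET: carries "tflags net", so it is skipped entirely
    # when network tests are off.
    net_hit = bool(net_enabled and dns_says_pass)
    return {
        "__SPF_PASS_LOCAL": local_hit,
        "__SPF_PASS_NET": net_hit,
        # The meta rule hits if either sub-rule hits.
        "SPF_PASS": local_hit or net_hit,
    }
```

In this model SPF_PASS can still hit with network tests off, but only through
the local sub-rule.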

Comment 4 Daryl C. W. O'Shea 2008-03-03 20:37:58 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > (In reply to comment #1)

> > > what do we need to do to record Received-SPF: btw?
> I meant, what do we need to do *in our MTA configurations* ;)

Pick a milter, any milter:

http://svn.perl.org/viewvc/qpsmtpd/trunk/plugins/sender_permitted_from?view=markup

http://www.openspf.org/Implementations

> Now that I think about it though, I don't do any SPF lookups in any of my MTAs;
> I leave that to SpamAssassin.  So maybe we should add support for recording it
> (if there isn't a header already there).  Then we *can* use this header as a
> way to #reuse SPF lookups, *and* we are more standards-compliant (since I think
> that is dictated in the std).  That would help with generating scores for set0
> at least.

"It is RECOMMENDED that SMTP receivers record the result of SPF processing in
the message header."

We could, I guess.  It'd give us a positive indication that the SPF checks were
actually done for domains that don't publish SPF records.

> In the meantime I think it'd be acceptable to make these a special case.
> Generate their scores for set2/3, then simply copy those to set0/1.

1/3 and 0/2 :)

> if we
> don't have the data, we can't trust the GA, but we should be able to trust that
> the S/O ratios will be the same (since it's the same domains and the same
> lookup logic!).

Yeah, we're probably going to have to do some copying.  I'm not sure whose
corpus was responsible for the scores that were generated for 3.2.
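
As a rough illustration of the copying discussed here, the Python sketch below
takes the four per-scoreset values of a rule and overwrites the non-net sets
(0 and 2) with the values from the matching net sets (1 and 3).  The helper
names and the example scores are hypothetical; only the four-value "score"
line layout follows SpamAssassin's configuration format.

```python
def copy_net_scores(scores):
    """Copy net-scoreset values onto the matching non-net scoresets.

    scores is (set0, set1, set2, set3): sets 1 and 3 have network tests
    enabled, sets 0 and 2 do not; sets 2 and 3 have Bayes enabled.
    """
    _set0, set1, _set2, set3 = scores
    return (set1, set1, set3, set3)


def score_line(rule, scores):
    """Format a four-value score line for a rule."""
    return "score %s %s" % (rule, " ".join("%.3f" % s for s in scores))
```

For example, score_line("SPF_PASS", copy_net_scores((0.0, -0.001, 0.0, -0.002)))
gives "score SPF_PASS -0.001 -0.001 -0.002 -0.002"; the input values here are
made up.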

> However that doesn't solve the core issue -- "tflags net".  we need to keep the
> network lookup code running with "tflags net". This is necessary for the
> --reuse support, so that it knows to set the rule score to 0 when attempting to
> reuse hits.

Really?  Isn't that what #reuse is supposed to be taking care of?  Why a dual
dependency on both #reuse and "tflags net"?

> However as you note, this means the code doesn't run in set0 at all.
> 
> What I did in the past with the ROUND_THE_WORLD test was to split it into two
> rules, ROUND_THE_WORLD and ROUND_THE_WORLD_LOCAL; the latter was set0, the
> former set2. What about doing that with the SPF rules -- adding a duplicate
> ruleset for SPF_PASS_LOCAL, SPF_NEUTRAL_LOCAL etc.?  (better names welcome of
> course.)

I *really* want to avoid different rules like above.  It'll confuse people and
cause those who aren't confused to write metas to combine the two versions.

> We could even move the current set to __SPF_PASS_LOCAL, __SPF_PASS_NET
> and combine them into a new SPF_PASS meta rule.  This would be
> --reuse-compatible, and clearly delineate set0 and set2 rules.

This might complicate score generation further. :(
Comment 5 Justin Mason 2008-03-04 01:21:57 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > (In reply to comment #2)
> > > (In reply to comment #1)
> > if we don't have the data, we can't trust the GA, but we should be able to
> > trust that the S/O ratios will be the same (since it's the same domains and
> > the same lookup logic!).
> 
> Yeah, we're probably going to have to do some copying.  I'm not sure whose
> corpus was responsible for the scores that were generated for 3.2.

well, we had no form of reuse for SPF, so the scores were based on whatever
SPF records were in place at the time of mass-check.  in my opinion, basing
scores on this is better than on no data at all.

> > However that doesn't solve the core issue -- "tflags net".  we need to keep
> > the network lookup code running with "tflags net". This is necessary for
> > the --reuse support, so that it knows to set the rule score to 0 when
> > attempting to reuse hits.
> 
> Really?  Isn't that what #reuse is supposed to be taking care of?  Why a dual
> dependency on both #reuse and "tflags net"?

We could modify the code so that it knows that #reuse implies "tflags net",
sure.  basically I didn't want the effects of #reuse to be too widespread
in the main Mail::SpamAssassin classes, since it's only supposed to affect
mass-checks.

> > What I did in the past with the ROUND_THE_WORLD test was to split it into two
> > rules, ROUND_THE_WORLD and ROUND_THE_WORLD_LOCAL; the latter was set0, the
> > former set2. What about doing that with the SPF rules -- adding a duplicate
> > ruleset for SPF_PASS_LOCAL, SPF_NEUTRAL_LOCAL etc.?  (better names welcome of
> > course.)
> 
> I *really* want to avoid different rules like above.  It'll confuse people and
> cause those who aren't confused to write metas to combine the two versions.

I see your point -- they are pretty messy.  other suggestions?
Comment 6 Daryl C. W. O'Shea 2008-03-04 11:09:59 UTC
(In reply to comment #5)
> well, we had no form of reuse for SPF, so the scores were based on whatever
> SPF records were in place at the time of mass-check.  in my opinion, basing
> scores on this is better than on no data at all.

Actually we did have #reuse for the SPF tests.  I was actually questioning where
the set0/2 scores for the SPF rules came from, but of course, they were
generated based on the results of a single set3 mass-check (just like any other
set0/2 rules).  So that's fine (assuming all/most of the corpus submitters have
SPF checks enabled).

> We could modify the code so that it knows that #reuse implies "tflags net",
> sure.  basically I didn't want the effects of #reuse to be too widespread
> in the main Mail::SpamAssassin classes, since it's only supposed to affect
> mass-checks.

I can't think of any reason why #reuse must or should only apply to net rules. 
If you wanted to #reuse bayes rules (which I've been considering for a while) I
think you should be able to.  Same goes for any other rule if it makes sense for
that rule.

> I see your point -- they are pretty messy.  other suggestions?

I don't see what the harm would be of *not* having "tflags net" for the SPF
rules and just having mass-check #reuse whatever we tell it to.
Comment 7 Matthew Schumacher 2009-11-03 18:07:52 UTC
I figured I would add a few thoughts since I'm working this issue in my mail system now.

I think that this should be set to "tflags net" even if SA doesn't need to go do the actual lookup.  The reason is that sometimes it's reasonable to omit the network rules for reasons other than performance.  Suppose you want to allow a user to authenticate and relay without worrying about RBLs, SPF, or other network tests, but you don't want to completely omit the message from SA either.  In this case turning off network tests works well, but only if SPF is a network test.  If it isn't a network test, then the authenticated user will get hit with SPF rules unless you whitelist your domain, which isn't desirable for other reasons.

schu
Comment 8 Daryl C. W. O'Shea 2009-11-12 18:41:43 UTC
(In reply to comment #7)
> Suppose you want to
> allow a user to authenticate and relay without worrying about RBLs, SPF, or
> other network tests, but you don't want to completely omit the message from SA
> either.  In this case turning off network tests works well, but only if SPF is
> a network test.  If it isn't a network test, then the authenticated user will
> get hit with SPF rules unless you whitelist your domain which isn't desirable
> for other reasons.

SA provides numerous ways to detect that a user is authenticated.  Turning off net tests is not the correct, nor accurate, way to handle this situation.
Comment 9 Justin Mason 2010-01-27 02:20:25 UTC
moving most remaining 3.3.0 bugs to 3.3.1 milestone
Comment 10 Justin Mason 2010-01-27 03:16:16 UTC
reassigning, too
Comment 11 Justin Mason 2010-03-23 16:33:22 UTC
moving all open 3.3.1 bugs to 3.3.2
Comment 12 Karsten Bräckelmann 2010-03-23 17:42:35 UTC
Moving back off of Security, which got changed by accident during the mass Target Milestone move.
Comment 13 Kevin A. McGrail 2013-06-21 16:14:21 UTC
Moving all open bugs where target is defined and 3.4.0 or lower to 3.4.1 target
Comment 14 Kevin A. McGrail 2015-04-12 14:14:30 UTC
Leaving as "tflags net" for SPF.