Bug 6668

Summary: DNSWL is lacking a rule to communicate excessive use to users
Product: Spamassassin Reporter: Darxus <Darxus>
Component: Rules Assignee: SpamAssassin Developer Mailing List <dev>
Status: RESOLVED FIXED    
Severity: normal CC: Darxus, dnswl, drmres, jhardin, joao.gouveia, kmcgrail, matthias, software+spamassassin, steve
Priority: P2    
Version: unspecified   
Target Milestone: Undefined   
Hardware: All   
OS: All   
Whiteboard:

Description Darxus 2011-10-03 03:26:24 UTC
In bug #6220 it was discussed that Spam Eating Monkey has a way to trigger SpamAssassin to intentionally cause false positives by returning a value of 127.0.0.255 in cases where people are abusing their service with excessive load.

DNSWL.org has had this kind of problem recently, with some folks who have been particularly difficult to contact about it, and has resorted to returning a trust value of "HI" to all queries from the problematic users.

I'd like to provide DNSWL with a better option, to handle a return value of 127.*.*.255, and instead of hitting "RCVD_IN_DNSWL_HI", hit a rule that explains that there is a problem with abusive levels of load on the DNSWL servers.

How was that implemented for Spam Eating Monkey?  There doesn't seem to be a rule to match *.255.


Should I create a rule like this?

score RCVD_IN_DNSWL_ABUSE -100 # I figure getting it noticed quick is best for everybody?

##{ RCVD_IN_DNSWL_ABUSE ifplugin Mail::SpamAssassin::Plugin::DNSEval

ifplugin Mail::SpamAssassin::Plugin::DNSEval
header  RCVD_IN_DNSWL_ABUSE        eval:check_rbl_sub('dnswl-firsttrusted', '^127\.0\.\d+\.255$')
describe RCVD_IN_DNSWL_ABUSE       You are using a DNS server that is placing too high a load on the DNSWL.org DNS servers without a subscription, please see https://subscription.dnswl.org/
tflags RCVD_IN_DNSWL_ABUSE         nice net
endif
##} RCVD_IN_DNSWL_ABUSE ifplugin Mail::SpamAssassin::Plugin::DNSEval
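
As a quick illustration of what the pattern in the proposed rule would and would not match, here is a minimal Python sketch. It only exercises the regular expression; it is not SpamAssassin code, the helper name `is_over_limit` is made up, and the sample answers are illustrative rather than observed DNSWL responses.

```python
import re

# Pattern copied from the proposed RCVD_IN_DNSWL_ABUSE rule above.
OVER_LIMIT = re.compile(r'^127\.0\.\d+\.255$')


def is_over_limit(answer: str) -> bool:
    """Return True if a DNS answer matches the proposed over-limit pattern."""
    return OVER_LIMIT.match(answer) is not None


# Sample answers are illustrative only, not taken from real DNSWL traffic.
for answer in ("127.0.0.255", "127.0.5.255", "127.0.5.3", "127.1.0.255"):
    print(answer, is_over_limit(answer))
```

Note that, as written, the pattern fixes the second octet at 0, so it is narrower than the 127.*.*.255 form mentioned above.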


Returning _HI for everything is resulting in many false negatives for the abusing users, and thinking about ideal scores for this kind of situation, I think maybe a large negative score should be used for things like SEM as well, because not filtering out spam is always a much better failure mode than filtering too much as spam.

Also, I think it's really irresponsible for SpamAssassin to expose users to this kind of punitive activity without actually warning them of the usage thresholds of the services involved, as Warren lists here:  http://www.spamtips.org/2011/01/usage-limits-of-spamassassin-network.html


The DNSWL folks who started making this use of _HI are probably not aware of this option, and I just heard this was happening for the first time, so I'm going to go point them to this bug now.  (For those who may be new, I'm a DNSWL admin.)
Comment 1 Kevin A. McGrail 2011-10-03 12:27:55 UTC
> ##{ RCVD_IN_DNSWL_ABUSE ifplugin Mail::SpamAssassin::Plugin::DNSEval
> 
> ifplugin Mail::SpamAssassin::Plugin::DNSEval
> header  RCVD_IN_DNSWL_ABUSE        eval:check_rbl_sub('dnswl-firsttrusted',
> '^127\.0\.\d+\.255$')
> describe RCVD_IN_DNSWL_ABUSE       You are using a DNS server that is placing
> too high a load on the DNSWL.org DNS servers without a subscription, please see
> https://subscription.dnswl.org/
> tflags RCVD_IN_DNSWL_ABUSE         nice net
> endif
> ##} RCVD_IN_DNSWL_ABUSE ifplugin Mail::SpamAssassin::Plugin::DNSEval

I would personally veto this immediately.  We are not an advertising service for RBLs.

If an RBL is submitted for inclusion for SA, it should not have policies that would affect anything but the most extreme cases.  Any URLs should point to an SA page such as a wiki letting them know to disable the rules.

> Also, I think it's really irresponsible for SpamAssassin to expose users to
> this kind of punitive activity without actually warning them of the usage
> thresholds of the services involved, as Warren lists here: 
> http://www.spamtips.org/2011/01/usage-limits-of-spamassassin-network.html

I agree.  Which RBLs have this issue?  I will immediately work to disable them in a default SA installation for the 3.4.0 release.

regards,
KAM
Comment 2 AXB 2011-10-03 12:58:45 UTC
(In reply to comment #1)
> > ##{ RCVD_IN_DNSWL_ABUSE ifplugin Mail::SpamAssassin::Plugin::DNSEval
> > 
> > ifplugin Mail::SpamAssassin::Plugin::DNSEval
> > header  RCVD_IN_DNSWL_ABUSE        eval:check_rbl_sub('dnswl-firsttrusted',
> > '^127\.0\.\d+\.255$')
> > describe RCVD_IN_DNSWL_ABUSE       You are using a DNS server that is placing
> > too high a load on the DNSWL.org DNS servers without a subscription, please see
> > https://subscription.dnswl.org/
> > tflags RCVD_IN_DNSWL_ABUSE         nice net
> > endif
> > ##} RCVD_IN_DNSWL_ABUSE ifplugin Mail::SpamAssassin::Plugin::DNSEval
> 
> I would personally veto this immediately.  We are not an advertising service
> for RBLs.
> 
> If an RBL is submitted for inclusion for SA, it should not have policies that
> would affect anything but the most extreme cases.  Any URLs should point to an
> SA page such as a wiki letting them know to disable the rules.
> 
> > Also, I think it's really irresponsible for SpamAssassin to expose users to
> > this kind of punitive activity without actually warning them of the usage
> > thresholds of the services involved, as Warren lists here: 
> > http://www.spamtips.org/2011/01/usage-limits-of-spamassassin-network.html
> 
> I agree.  Which RBLs have this issue?  I will immediately work to disable them
> in a default SA installation for the 3.4.0 release.

Warren forgot URIBL.com in his list - afaik, it also has a limit of 300k queries/day which, like the others, is usually enough for the average site.

If the BL query limits hit ISPs/service providers/huge corps, they're freeriding on (in most cases) donated resources and being cheap, so nobody should be surprised if their queries are blocked/filtered.
Comment 3 Darxus 2011-10-03 17:53:51 UTC
(In reply to comment #1)
> I would personally veto this immediately.  We are not an advertising service
> for RBLs.

I find that statement kind of interesting, when shutting off network tests, many of which require payment over some threshold (often around 100,000 hits a day), makes SpamAssassin five times less accurate.  5.35x the false positives, and 4.25x the false negatives, based on the 2011-03-24 score generation.  And that's if SA *knows* the network tests aren't working.  What if it's expecting the tests to work, and the major ones aren't because of going over their (free use) thresholds?  Probably bad.

I'm not happy about it, but SA seems pretty dependent on things like RBLs which, under some circumstances, charge money.


From the Ubuntu SpamAssassin 3.3.1 package:

/usr/share/doc/spamassassin/rules/STATISTICS-set0.txt.gz (no bayes, no net)
# SUMMARY for threshold 5.0:
# False positives:       238  1.12%
# False negatives:      9678  21.93%

/usr/share/doc/spamassassin/rules/STATISTICS-set1.txt.gz (no bayes, net enabled)
# SUMMARY for threshold 5.0:
# False positives:        30  0.14%
# False negatives:      1381  3.13%

7.93x the false positives, 7.01x the false negatives, without network tests.  
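
The ratios can be re-derived from the counts in the two summaries. A small Python check, using only the numbers quoted above (the variable names are just for illustration):

```python
# Counts from the STATISTICS-set0 (no net) and STATISTICS-set1 (net) summaries above.
no_net = {"fp": 238, "fn": 9678}
with_net = {"fp": 30, "fn": 1381}

# How many times more errors occur when network tests are disabled.
fp_ratio = no_net["fp"] / with_net["fp"]
fn_ratio = no_net["fn"] / with_net["fn"]

print(f"{fp_ratio:.2f}x the false positives")
print(f"{fn_ratio:.2f}x the false negatives")
```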

> If an RBL is submitted for inclusion for SA, it should not have policies that
> would affect anything but the most extreme cases.  Any URLs should point to an
> SA page such as a wiki letting them know to disable the rules.

I think the cases where DNSWL has done this are likely to qualify as "most extreme".

> > Also, I think it's really irresponsible for SpamAssassin to expose users to
> > this kind of punitive activity without actually warning them of the usage
> > thresholds of the services involved, as Warren lists here: 
> > http://www.spamtips.org/2011/01/usage-limits-of-spamassassin-network.html
> 
> I agree.  Which RBLs have this issue?  I will immediately work to disable them
> in a default SA installation for the 3.4.0 release.

According to Michael Scheidell, Spamhaus's (providers of ZEN, SBL, PBL, XBL, included in SA by default) policy of blocking queries results in "10 and 20 min delays in inbound email" - bug #6220.  You could call that DOSing email providers, instead of disabling spam filtration, both with the same goal of getting the provider to disable the relevant network tests.  Which is worse?

Should the Spamhaus rules be removed from the default SA rule set because they will DOS email providers for querying them for over 100,000 emails per day?

SEM (bug #6220) is the only one I know of that affects scores.  And by a mechanism that seemed to have the approval of SpamAssassin folks.  Should that bug be closed, and the rules not included in SA by default, because of that mechanism?


I think it would be great if SpamAssassin, by default, didn't include any network rules that have limits on free use.  Although it would probably require more work to improve the accuracy, which I don't really see happening.
Comment 4 Kevin A. McGrail 2011-10-03 18:24:58 UTC
(In reply to comment #3)
> (In reply to comment #1)
> > I would personally veto this immediately.  We are not an advertising service
> > for RBLs.
> 
> I find that statement kind of interesting, when shutting off network tests,
> many of which require payment over some threshold (often around 100,000 hits a
> day), makes SpamAssassin five times less accurate.  

IMO, ANY provider that gives FALSE positives under any circumstances should not be configured to be enabled by default with SA.

I have zero problem with them stopping their replies and zero problems with them charging for heavy usage.

> I'm not happy about it, but SA seems pretty dependent on things like RBLs
> which, under some circumstances, charge money.

RBLs have a good place in anti-spam work.  However, the concept that SA can be deployed out of the box with zero config and work well is likely unattainable due to the commercial realities of the world.

> According to Michael Scheidell, Spamhaus's (providers of ZEN, SBL, PBL, XBL,
> included in SA by default) policy of blocking queries results in "10 and 20 min
> delays in inbound email" - bug #6220.  You could call that DOSing email
> providers, instead of disabling spam filtration, both with the same goal of
> getting the provider to disable the relevant network tests.  Which is worse?

False positives are worse from an anti-Spam perspective.  

> Should the Spamhaus rules be removed from the default SA rule set because they
> will DOS email providers for querying them for over 100,000 emails per day?

I don't consider a delay a DoS. It's not keeping the sendmail/spamc process grinding for 10-20 minutes, is it?  It's just causing the mail to wait for a second delivery attempt.  That's "normal" for email, as it is not a method of IM.

The method to reduce the delay is simple: disable the RBL tests or pay for the RBL provider's services, etc.

> SEM (bug #6220) is the only one I know of that affects scores.  And by a
> mechanism that seemed to have the approval of SpamAssassin folks.  Should that
> bug be closed, and the rules not included in SA by default, because of that
> mechanism?

IMO, the mechanism should be changed to point to a URL controlled by SA.

> I think it would be great if SpamAssassin, by default, didn't include any
> network rules that have limits on free use.  Although it would probably require
> more work to improve the accuracy, which I don't really see happening.

That's unrealistic as there are great services that have reasonable thresholds for use. 

Regards,
KAM
Comment 5 John Hardin 2011-10-03 18:55:04 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > (In reply to comment #1)
> > > I would personally veto this immediately.  We are not an advertising service
> > > for RBLs.
> > 
> > I find that statement kind of interesting, when shutting off network tests,
> > many of which require payment over some threshold (often around 100,000 hits a
> > day), makes SpamAssassin five times less accurate.  
> 
> IMO, ANY provider that gives FALSE positives under any circumstances should not
> be configured to be enabled by default with SA.
> 
> I have zero problem with them stopping their replies and zero problems with
> them charging for heavy usage.

If the RBL provides a documented "You are overusing the free service" return code, what is the problem with recognizing that and hitting a non-scoring (0.001, neither FP nor FN) rule with an informative description? It doesn't need to contain a link to the RBL's TOS or subscription page (advertising), but telling the admin _why_ they're getting an unusable response from the RBL is polite.

I think that's a much better approach than either removing one of the most effective antispam techniques by default, or having the RBL suddenly mark _everything_ as spam because we don't interpret the "overuse" code correctly.
Comment 6 Kevin A. McGrail 2011-10-03 19:00:49 UTC
> If the RBL provides a documented "You are overusing the free service" return
> code, what is the problem with recognizing that and hitting a non-scoring
> (0.001, neither FP nor FN) rule with an informative description? It doesn't
> need to contain a link to the RBL's TOS or subscription page (advertising), but
> telling the admin _why_ they're getting an unusable response from the RBL is
> polite.
> 
> I think that's a much better approach than either removing one of the most
> effective antispam techniques by default, or having the RBL suddenly mark
> _everything_ as spam because we don't interpret the "overuse" code correctly.

I don't have a problem with this concept.  My veto statement above had to do with the actual URL and the description which were a direct link and advertisement for a vendor.

A more generic message such as this would be fine and +1'd by me:

The RBL responded with a failure code.  Visit www.spamassassin.org/rbl for more information.

Regards,
KAM
Comment 7 Darxus 2011-10-03 19:16:27 UTC
(In reply to comment #6)
> I don't have a problem with this concept.  My veto statement above had to do
> with the actual URL and the description which were a direct link and
> advertisement for a vendor.

Ah, that was just a rough guess at how it should be implemented.  A url to a spamassassin page certainly seems entirely appropriate to me.  

> A more generic message such as this would be fine and +1'd by me:
> 
> The RBL responded with a failure code.  Visit www.spamassassin.org/rbl for more
> information.

That sounds great to me as well.  Although I'd prefer something in the wiki for maintainability, I don't know, maybe http://wiki.apache.org/spamassassin/XBLAbuse ?


(In reply to comment #4)
> > getting the provider to disable the relevant network tests.  Which is worse?
> 
> False positives are worse from an anti-Spam perspective.  

You mis-read what I said.  I never suggested false positives (in fact I suggested it was bad that SEM intentionally caused false positives).  I was talking about causing false negatives (spam being marked as non-spam).  

> That's unrealistic as there are great services that have reasonable thresholds
> for use. 

So the question is, what are acceptable methods of enforcing those thresholds?  Blocking queries resulting in delay of email is acceptable to you.  I don't know how effective that is in getting people to stop querying, and it doesn't provide any feedback to indicate that there is a problem.  Is it acceptable to cause false-negatives, spam being marked as non-spam, with clear indication (via a matching rule and description) of what the problem is?
Comment 8 Darxus 2011-10-03 19:17:39 UTC
(In reply to comment #5)
> If the RBL provides a documented "You are overusing the free service" return
> code, what is the problem with recognizing that and hitting a non-scoring
> (0.001, neither FP nor FN) rule with an informative description?

The problem, from the perspective of DNSWL.org, is that it provides no incentive to stop sending millions of queries a day.
Comment 9 Kevin A. McGrail 2011-10-03 19:24:01 UTC
(In reply to comment #8)
> (In reply to comment #5)
> > If the RBL provides a documented "You are overusing the free service" return
> > code, what is the problem with recognizing that and hitting a non-scoring
> > (0.001, neither FP nor FN) rule with an informative description?
> 
> The problem, from the perspective of DNSWL.org, is that it provides no
> incentive to stop sending millions of queries a day.

Then the RBL should be disabled by SA by default and the RBL should consider a sign-up procedure for activation so they have an out of band contact method.
Comment 10 Kevin A. McGrail 2011-10-03 19:34:17 UTC
> That sounds great to me as well.  Although I'd prefer something in the wiki for
> maintainability, I don't know, maybe
> http://wiki.apache.org/spamassassin/XBLAbuse ?

Again, generic and I don't consider this necessarily "abuse".  That's a very strong word to many people. 

To me, it's an error requiring administrative attention with a landing page to help them try and resolve the issue.  Nothing more, nothing less.

> You mis-read what I said.  I never suggested false positives (in fact I
> suggested it was bad that SEM intentionally caused false positives).  I was
> talking about causing false negatives (spam being marked as non-spam).  

That is not the correct definition of a FN in my opinion.  By your definition, any email that got through SA for any reason is a False Negative.

We have to ship SA in a way that is safe for the vast majority of users which might not be the most effective for blocking all Spam.

 
> So the question is, what are acceptable methods of enforcing those thresholds? 
> Blocking queries resulting in delay of email is acceptable to you.  I don't
> know how effective that is in getting people to stop querying, and it doesn't
> provide any feedback to indicate that there is a problem.  Is it acceptable to
> cause false-negatives, spam being marked as non-spam, with clear indication
> (via a matching rule and description) of what the problem is?

PRIMARILY, I want to see a method which doesn't artificially change the SA scoring up or down substantially.  

An RBL that starts returning ALL true or ALL false for over-limit issues is artificially changing the scores.  No answer or an answer handled as an error would be acceptable.

At worst, the queries can be stopped by blackholing the requests from overlimit IPs. So this is really a matter for the RBL to handle.

However, some RBLs want to convert those over-limit users into customers and they do so through harmful techniques to get the admin's attention.

Regards,
KAM
Comment 11 Darxus 2011-10-03 20:01:09 UTC
(In reply to comment #10)
> > That sounds great to me as well.  Although I'd prefer something in the wiki for
> > maintainability, I don't know, maybe
> > http://wiki.apache.org/spamassassin/XBLAbuse ?
> 
> Again, generic and I don't consider this necessarily "abuse".  That's a very
> strong word to many people. 

I have no objection to using a word other than "abuse"; I just thought "rbl" was both too generic (there are lots of potential subjects relating to RBLs) and too specific to blacklists, whereas this bug relates specifically to a whitelist.  Although if you really want to over-generalize the term RBL to include whitelists (as I think a relevant RFC has done) I wouldn't argue.  Would you like to recommend another URL?

> To me, it's an error requiring administrative attention with a landing page to
> help them try and resolve the issue.  Nothing more, nothing less.

Yep.

> > You mis-read what I said.  I never suggested false positives (in fact I
> > suggested it was bad that SEM intentionally caused false positives).  I was
> > talking about causing false negatives (spam being marked as non-spam).  
> 
> That is not the correct definition of a FN in my opinion.  By your definition,
> any email that got through SA for any reason is a False Negative.

Er, yeah, that sounds like a pretty good definition to me.  Especially if SA actually slaps on a "X-Spam-Status: No" header, which would be the case here.

> We have to ship SA in a way that is safe for the vast majority of users which
> might not be the most effective for blocking all Spam.

I don't see how that is at all conflicting with what I have suggested.

> PRIMARILY, I want to see a method which doesn't artificially change the SA
> scoring up or down substantially.  
> 
> An RBL that starts returning ALL true or ALL false for over-limit issues is
> artificially changing the scores.  No answer or an answer handled as an error
> would be acceptable.  
> 
> At worst, the queries can be stopped by blackholing the requests from overlimit
> IPs. So this is really a matter for the RBL to handle.

That is certainly one option.
 
> However, some RBLs want to convert those over-limit users into customers and
> they do so through harmful techniques to get the admin's attention.

I don't claim to know the intentions of the owners of DNSWL and SEM.  But I'm not convinced that it's inappropriate to intentionally affect scores (preferably with false negatives instead of false positives) in order to get the attention of an administrator to explain the problem and get them to either stop sending millions of queries a day, or start sending money.


RCVD_IN_DNSWL_HI is currently scored -5.  Would you veto a rule that matched the return value of 127.0.0.255 with a score of -5 and a description that was helpful in resolving the situation that could not be construed as advertising?


Another possibility I brought up 6 months ago in bug #6220 was, when receiving a return value of 127.*.*.255, disabling that rule.  No more load on the provider, no skewed score for the user, no advertising.
Comment 12 Kevin A. McGrail 2011-10-03 21:11:59 UTC
> Would you like to recommend another URL?

The URL I wrote was PURELY a place-holder.  I could have and should have written something that implied more firmly that status such as www.sa.org/foobar.  

The actual link should be determined in the patch.  It likely should go to a wiki article discussing RBL errors.  But it definitely shouldn't be a link to a vendor, shouldn't say abuse and should be kept non-specific.  It likely should have information on disabling the RBL rules as well.


> Er, yeah, that sounds like a pretty good definition to me.  Especially if SA
> actually slaps on a "X-Spam-Status: No" header, which would be the case here.
> 
> > We have to ship SA in a way that is safe for the vast majority of users which
> > might not be the most effective for blocking all Spam.
> 
> I don't see how that is at all conflicting with what I have suggested.

I'm trying to keep my answers too short.  I'll rephrase:

You have suggested that disabling network tests causes FNs because emails then slip through unmarked. I don't consider those true FNs because I consider them FNs caused by a misconfiguration of SA.  SA works best with network tests and we aren't recommending they are disabled.  But some of them need consideration before they are enabled post-installation.


> I don't claim to know the intentions of the owners of DNSWL and SEM.  But I'm
> not convinced that it's inappropriate to intentionally affect scores
> (preferably with false negatives instead of false positives) in order to get
> the attention of an administrator to explain the problem and get them to either
> stop sending millions of queries a day, or start sending money.

We will have to agree to disagree.  

I am 100% convinced it is inappropriate to intentionally affect scores to get the attention of admins.  It is the very definition of collateral damage and something I would strongly advocate against.  But again, I am one vote and this is my opinion.  

> RCVD_IN_DNSWL_HI is currently scored -5.  Would you veto a rule that matched
> the return value of 127.0.0.255 with a score of -5 and a description that was
> helpful in resolving the situation that could not be construed as advertising?

This needs more thought but I would veto it unless the following points are met:

 - the NET result of the rules for the RBL in question in total add up to zero (or substantially similar e.g. 0.0001, etc.) So if there is a positive score and a negative score, the two together = 0.  In other words, an RBL can't issue a response that incorrectly affects scores on purpose due to limits, technical errors, etc.

 - The description in the Rule was generic, suitable for all RBLs and pointed to a URL under SA's control.  Perhaps even just one rule for all the RBLs that can give an error code response.

> Another possibility I brought up 6 months ago in bug #6220 was, when receiving
> a return value of 127.*.*.255, disabling that rule.  No more load on the
> provider, no skewed score for the user, no advertising.

You are mentioning ideas that need to be adopted by RBLs more so than SA but this sounds a bit like a DoS ready to happen AND it's a case where the rule that implemented this likely couldn't be on by default as shipped by SA.  If they are smart enough to turn on the feature, they likely know enough about RBL queries to perform local caching, rsync, etc.

I run quite a number of RBL public nameservers.  I don't consider the traffic to be that big a deal and I can blackhole queries quite easily.
Comment 13 Darxus 2011-10-03 21:39:19 UTC
(In reply to comment #12)
> > Would you like to recommend another URL?
> 
> The URL I wrote was PURELY a place-holder.  I could have and should have
> written something that implied more firmly that status such as
> www.sa.org/foobar.  
> 
> The actual link should be determined in the patch.  It likely should go to a
> wiki article discussing RBL errors.  But it definitely shouldn't be a link to a
> vendor, shouldn't say abuse and should be kept non specific.  It likely should
> have information on disabling the RBL rules as well.

Agreed.

> > I don't see how that is at all conflicting with what I have suggested.
> 
> I'm trying to keep my answers too short.  I'll rephrase:
> 
> You have suggested that disabling network tests causes FNs because emails then
> slip through unmarked. I don't consider those true FNs because I consider them
> FNs caused by a misconfiguration of SA.  

I disagree.  That's a false negative, even if it's due to configuration.  

> SA works best with network tests and
> we aren't recommending they are disabled.  But some of them need consideration
> before they are enabled post-installation.

Yep.

> We will have to agree to disagree.  
> 
> I am 100% convinced it is inappropriate to intentionally affect scores to get
> the attention of admins.  It is the very definition of collateral damage and
> something I would strongly advocate against.  

Okay.

> But again, I am one vote and this
> is my opinion.  

"Votes on code modifications follow a different model. In this scenario, a negative vote constitutes a veto , which cannot be overridden."
"...the proposal requires three positive votes and no negative ones in order to pass..."
- http://www.apache.org/foundation/voting.html

By our rules, it's enough on its own to make this not happen.

> > RCVD_IN_DNSWL_HI is currently scored -5.  Would you veto a rule that matched
> > the return value of 127.0.0.255 with a score of -5 and a description that was
> > helpful in resolving the situation that could not be construed as advertising?
> 
> This needs more thought but I would veto it unless the following points are
> met:
> 
>  - the NET result of the rules for the RBL in question in total add up to zero
> (or substantially similar e.g. 0.0001, etc.) So if there is a positive score and
> a negative score, the two together = 0.  In other words, an RBL can't issue a
> response that incorrectly affects scores on purpose due to limits, technical
> errors, etc.

I believe that requirement would eliminate dnswl.org's interest.  Since you're willing to veto without it, I think that's sufficient to consider this thread dead.

> > Another possibility I brought up 6 months ago in bug #6220 was, when receiving
> > a return value of 127.*.*.255, disabling that rule.  No more load on the
> > provider, no skewed score for the user, no advertising.
> 
> You are mentioning ideas that need to be adopted by RBLs more so than SA

I don't understand why you say that.  It's just another way of handling a 127.0.0.255 within spamassassin.  So as far as RBLs and WLs are concerned it's still just an implementation of providing a .255 response for users who are over limit.

As an example, say an email provider is using spamassassin to filter millions of emails a day.  Some of the rules (RCVD_IN_XBL, RCVD_IN_PBL, RCVD_IN_SBL) cause queries to zen.spamhaus.org.  That being over their free use threshold, they start returning (only) 127.0.0.255 for all queries, to indicate the over limit condition.  SpamAssassin notices the 127.0.0.255 value, and stops running all rules that hit zen.spamhaus.org.
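
To make the idea concrete, here is a minimal Python sketch of the behaviour described in that example. It is not SpamAssassin code and does not reflect any existing SpamAssassin feature; the class name `ZoneGate` and its methods are hypothetical, and treating any 127.*.*.255 answer as "over limit" is the assumption under discussion in this bug, not an agreed convention.

```python
import re

# Over-limit answers in the 127.*.*.255 form discussed in this bug (an assumption).
OVER_LIMIT = re.compile(r'^127\.\d+\.\d+\.255$')


class ZoneGate:
    """Hypothetical helper: remembers which DNS list zones reported over-limit."""

    def __init__(self):
        self.disabled = set()

    def should_query(self, zone: str) -> bool:
        """Return False once a zone has signalled the over-limit condition."""
        return zone not in self.disabled

    def record_answer(self, zone: str, answer: str) -> bool:
        """Return True if the answer may be used for scoring; disable the zone if not."""
        if OVER_LIMIT.match(answer):
            self.disabled.add(zone)
            return False
        return True


gate = ZoneGate()
print(gate.should_query("zen.spamhaus.org"))  # zone starts out enabled
gate.record_answer("zen.spamhaus.org", "127.0.0.255")
print(gate.should_query("zen.spamhaus.org"))  # zone is skipped from now on
```

A real implementation would also need some way to re-enable the zone later, which this sketch leaves out.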

> but
> this sounds a bit like a DoS ready to happen AND it's a case where the rule
> that implemented this likely couldn't be on by default as shipped by SA.  If
> they are smart enough to turn on the feature, they likely know enough about RBL
> queries to perform local caching, rsync, etc.

How is that a DoS ready to happen?  Are we having another misunderstanding here?
 
> I run quite a number of RBL public nameservers.  I don't consider the traffic
> to be that big a deal and I can blackhole queries quite easily.

Are they RBLs that spamassassin has enabled by default?  I run one dnswl.org mirror, and the only reason I can do that is my provider is willing to overlook my bandwidth limit due to a belief that dnswl is worth supporting.  Mirroring dnswl.org causes almost all of my bandwidth usage.
Comment 14 Kevin A. McGrail 2011-10-03 22:18:28 UTC
> > But again, I am one vote and this
> > is my opinion.  
> 
> "Votes on code modifications follow a different model. In this scenario, a
> negative vote constitutes a veto , which cannot be overridden."
> "...the proposal requires three positive votes and no negative ones in order to
> pass..."
> - http://www.apache.org/foundation/voting.html
>
> By our rules, it's enough on its own to make this not happen.

Good point.  Well I have not voted formally so I don't need to withdraw a vote. So let's continue the discussion and get more votes and I won't submarine it if others agree with you.

> >  - the NET result of the rules for the RBL in question in total add up to zero
> > (or substantially similar e.g. 0.0001, etc.) So if there is a positive score and
> > a negative score, the two together = 0.  In other words, an RBL can't issue a
> > response that incorrectly affects scores on purpose due to limits, technical
> > errors, etc.
> 
> I believe that requirement would eliminate dnswl.org's interest.  Since you're
> willing to veto without it, I think that's sufficient to consider this thread
> dead.

I would strongly try and convince others it is wrong to purposefully give wrong answers from an RBL that lead to skewed scoring.  If a patch you are proposing skews the scores plus or minus, expect me to request for it to be revised to a net 0.

If DNSWL only wants a case where the scores are skewed to gain attention from admins/users, then it seems they want SA to be a sales lead generator.  This is exactly what I want to prevent.

> I don't understand why you say that.  It's just another way of handling a
> 127.0.0.255 within spamassassin.  So as far as RBLs and WLs are concerned it's
> still just an implementation of providing a .255 response for users who are
> over limit.

Because to me 255 is a legitimate bit mask for a valid response. 

- Do older versions of SA contain code that considers .255 as an invalid response for an RBL?

- Is there agreement among RBLs that .255 is considered an error code?

I would support some standard for an error code but likely it should be something in a different class c such as 192.168.255.X or something similar.

And I have more ideas on it I'll add below.


 
> As an example, say an email provider is using spamassassin to filter millions
> of emails a day.  Some of the rules (RCVD_IN_XBL, RCVD_IN_PBL, RCVD_IN_SBL)
> cause queries to zen.spamhaus.org.  That being over their free use
> threshold, they start returning (only) 127.0.0.255 for all queries, to indicate
> the over limit condition.  SpamAssassin notices the 127.0.0.255 value, and
> stops running all rules that hit zen.spamhaus.org.

Zen, according to their docs, does not issue a .255. See http://www.spamhaus.org/faq/answers.lasso?section=DNSBL%20Usage#200

But assuming they did, your ISP uses an old version of SA, Zen responds with .255 and it's considered true and legitimate email gets blocked.

In short, an error bitmask will have YEARS of lag in getting an error code in place for RBLs.

The only way I see it could happen is to get an RBL to announce via alternate names, so querying zen.spamhaus.org would never give out .255 but querying zenv2.spamhaus.org could implement an error code response that APIs would know how to properly implement.

> > but
> > this sounds a bit like a DoS ready to happen AND it's a case where the rule
> > that implemented this likely couldn't be on by default as shipped by SA.  If
> > they are smart enough to turn on the feature, they likely know enough about RBL
> > queries to perform local caching, rsync, etc.
> 
> How is that a DoS ready to happen?  Are we having another misunderstanding
> here?

I just see that as an avenue to figure out how to trick your system into getting a DNS response that changes SA not to query an RBL in order to get all my Spam through.  With the number of DNS servers that change responses, this doesn't sound that hard.

> > I run quite a number of RBL public nameservers.  I don't consider the traffic
> > to be that big a deal and I can blackhole queries quite easily.
> 
> Are they RBLs that spamassassin has enabled by default?  I run one dnswl.org
> mirror, and the only reason I can do that is my provider is willing to overlook
> my bandwidth limit due to a belief that dnswl is worth supporting.  Mirroring
> dnswl.org causes almost all of my bandwidth usage.

If DNSWL needs another public mirror, have them email me.  The solution to me is to increase public mirrors not to harm the flow of email to try and get people to use the service less.
Comment 15 Darxus 2011-10-04 21:13:11 UTC
(In reply to comment #14)
> > I don't understand why you say that.  It's just another way of handling a
> > 127.0.0.255 within spamassassin.  So as far as RBLs and WLs are concerned it's
> > still just an implementation of providing a .255 response for users who are
> > over limit.
> 
> Because to me 255 is a legitimate bit mask for a valid response. 

I was providing an example (127.0.0.255), not suggesting that value always be treated this way.  I think it would be necessary to create another eval thing to define a regex for each RBL.
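
A minimal sketch of what such a per-RBL definition could look like, written in Python rather than as an actual SpamAssassin eval. The table, the function name `is_over_limit`, and the zone name `list.dnswl.org` are assumptions for illustration; the pattern is the one from the rule proposed in the description of this bug.

```python
import re

# Hypothetical per-list table: zone name -> pattern marking an "over limit" reply.
# Only one entry is shown; a list without an entry never matches, so a plain
# 127.0.0.255 from some other list is still treated as an ordinary answer.
OVER_LIMIT_PATTERNS = {
    "list.dnswl.org": re.compile(r"^127\.0\.\d+\.255$"),
}


def is_over_limit(zone: str, answer: str) -> bool:
    """Return True only if this zone has declared the answer an over-limit code."""
    pattern = OVER_LIMIT_PATTERNS.get(zone)
    return bool(pattern and pattern.match(answer))
```

Keeping the pattern per list means that a value such as .255 is only special for lists that have explicitly defined it that way.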

> > As an example, say an email provider is using spamassassin to filter millions
> > of emails a day.  Some of the rules (RCVD_IN_XBL, RCVD_IN_PBL, RCVD_IN_SBL)
> > cause queries to zen.spamhaus.org.  That being over their free use
> > threshold, they start returning (only) 127.0.0.255 for all queries, to indicate
> > the over limit condition.  SpamAssassin notices the 127.0.0.255 value, and
> > stops running all rules that hit zen.spamhaus.org.
> 
> Zen, according to their docs, does not issue a .255. See
> http://www.spamhaus.org/faq/answers.lasso?section=DNSBL%20Usage#200

Right, just providing an example.

> In short, an error bitmask will have YEARS of lag in getting an error code in
> place for RBLs.

For all of them, yes.

> > How is that a DoS ready to happen?  Are we having another misunderstanding
> > here?
> 
> I just see that as an avenue to figure out how to trick your system into
> getting a DNS response that changes SA not to query an RBL in order to get all
> my Spam through.  With the number of DNS servers that change responses, this
> doesn't sound that hard.

Sounds hard to me (to use this to cause a DoS).

> If DNSWL needs another public mirror, have them email me.  

I'll let them know.


If I don't get any positive responses within a couple days, I'll close this (or someone else can feel free).
Comment 16 D. Stussy 2011-10-05 20:18:14 UTC
What DNSBLs should do is return a result which is not within the 127.0.0.0/8 subnet to indicate an answer which doesn't constitute listing -- especially if they decide not to issue a DNS RC of "refused."  That way, there will be no confusion should some other DNSBL define "127.0.0.255" as a valid reply.  It also works in the case of a shut down DNSBL where a valid IP address from a domain squatter is returned (especially by use of a wildcarded DNS response).

As to detecting an "excessive query" condition and scoring it with a value sufficiently near zero (e.g. 0.001), I am in favor of such an approach.

Future queries to any DNS based list should not happen if a given DNS list returns a "REFUSED" answer (until SA is restarted).  For classic lists, a query returning an A record outside of 127/8 should also be interpreted as "refused."

If "127.0.0.255" is to be treated as a special case of "refused," it should be handled by a rule on a per DNSBL basis.  In other words, I suggest that this type of response is not preferred.

Since classic DNSBLs are all supposed to return "127.0.0.2" for a query for IPv4 address 127.0.0.2, maybe upon SA startup, each DNSBL should be tested for the value.  However, there is a good reason for not performing "unnecessary" queries.  If the entire world rebooted at the same time, would the DNSBLs be DOS'ed with a flood of queries?
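
The classification suggested above can be sketched roughly as follows. This is an illustration in Python, not SpamAssassin code; the function names and the three labels are assumptions, and the DNS lookup itself is left out so that only the decision logic is shown.

```python
import ipaddress

# Classic DNSBLs answer listings from within 127.0.0.0/8.
LOOPBACK = ipaddress.ip_network("127.0.0.0/8")


def probe_query_name(zone: str) -> str:
    """Query name for the 127.0.0.2 test entry: octets reversed, zone appended."""
    return "2.0.0.127." + zone


def classify_answer(answer):
    """Classify one A-record answer from a classic DNSBL (None means no record)."""
    if answer is None:
        return "not_listed"
    if ipaddress.ip_address(answer) in LOOPBACK:
        return "listed"
    # An address outside 127/8 (over-limit signal, domain squatter, wildcard)
    # is treated like a REFUSED answer rather than as a listing.
    return "refused"
```

Whether a "refused" result should then suppress further queries until restart is a separate policy decision that this sketch does not make.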
Comment 17 Darxus 2011-10-17 20:24:11 UTC
Closing, not going anywhere.

An additional bit of information from Matthias:  In these "abuse" cases, he is initially just blocking the queries.  "Due to the way some DNS resolvers work, this may result in a *higher* query rate, since the resolver just tries it again and again to get an answer."
Comment 18 John Hardin 2011-12-10 18:07:16 UTC
*** Bug 6718 has been marked as a duplicate of this bug. ***
Comment 19 Kevin A. McGrail 2011-12-11 15:45:58 UTC
As noted by Darxus, this is PURPOSEFUL behavior to return true statements for what they consider abuse.

"DNSWL announced this behavior here: 
http://www.dnswl.org/news/archives/24-Abusive-use-of-dnswl.org-infrastructure-enforcing-limits.html"

If they had chosen to add a time-delay, block answers or return a false answer that did not trigger the rules, I would support it.

1 - Do we have any other RBLs enabled by default that return False Positives once a threshold is hit?

2 - IMO they need to be disabled by default -OR- documented far more strongly.


Unless someone steps up with some ideas, I'm going to disable DNSWL by default very shortly.
Comment 20 Darxus 2011-12-11 18:58:23 UTC
(In reply to comment #19)
> 1 - Do we have any other RBLs enabled by default that return False Positives
> once a threshold is hit?

I believe we do not.

> Unless someone steps up with some ideas, I'm going to disable DNSWL by default
> very shortly.

I just forwarded this along to the DNSWL admins list so they are aware of the situation.
Comment 21 Matthias Leisi 2011-12-11 20:59:48 UTC
(speaking for dnswl.org)

(In reply to comment #19)
> As noted by Darxus, this is PURPOSEFUL behavior to return true statements for
> what they consider abuse.
> 
> "DNSWL announced this behavior here: 
> http://www.dnswl.org/news/archives/24-Abusive-use-of-dnswl.org-infrastructure-enforcing-limits.html"

Currently, the following nameservers are getting a "listed, high trust" answer (plus the reasons for blocking them):

* Google Public DNS servers (multi-million queries per 24 hours, no response from Google contacts)
* Some big hosting provider resolvers: softlayer.com, dimenoc.com, theplanet.com, bluehost.com, dyndns.com, netline.net.uk (multi-million queries per 24 hours, no response/action from abuse@ and similar contacts)
* Five single hosts with multi-million queries per 24 hours with no response/action from multiple contacts.

The reason for the special result code, as indicated in the posting referenced above, is that REFUSED rcode will result in triple the amount of queries in most cases. 

This is not used for those doing below one million queries per 24 hours (aggregated over those IPs that can be identified as belonging to the same organisation/user).
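
As a rough illustration of the aggregation described above, assuming per-IP query counts for a 24-hour window are already available: the function name, the input shapes, and the example organisation label are hypothetical and do not describe the actual dnswl.org tooling.

```python
from collections import Counter

DAILY_LIMIT = 1_000_000  # one million queries per 24 hours, as stated above


def over_limit_orgs(queries_by_ip, org_of_ip, limit=DAILY_LIMIT):
    """Sum per-IP counts by organisation and return those at or over the limit.

    IPs that cannot be attributed to an organisation are counted on their own.
    """
    totals = Counter()
    for ip, count in queries_by_ip.items():
        totals[org_of_ip.get(ip, ip)] += count
    return sorted(org for org, total in totals.items() if total >= limit)
```

For example, two resolvers of the same organisation at 600,000 and 500,000 queries each would be flagged together, while a single unattributed host at 900,000 would not.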
Comment 22 Matthias Leisi 2011-12-11 21:07:27 UTC
(self-correction)

> * Some big hosting provider resolvers: softlayer.com, dimenoc.com,
> theplanet.com, bluehost.com, dyndns.com, netline.net.uk (multi-million queries

dyndns.com was moved from the "listed, high trust" back in April 2011 to simple "refuse" again because they answered and promised to fix the situation.
Comment 23 João Gouveia 2011-12-11 21:09:39 UTC
Matthias,

If it helps, I can offer our support by adding DNSWL to our public mirrors.
Comment 24 Kevin A. McGrail 2011-12-12 16:31:32 UTC
> The reason for the special result code, as indicated in the posting referenced
> above, is that REFUSED rcode will result in triple the amount of queries in
> most cases. 

In the absence of a patch to implement your special return value (which I think needs to be outside of 127.X and should be discussed with other RBLs), I can only recommend that you simply blackhole the requests from servers in excess of 100K that you consider abusive.

Additionally, as with Joao, I am also happy to support your project with a public nameserver.

However, I can't support your policy that causes FPs in SA as I feel it is unrealistic to launch an RBL and not expect this type of problem.  

As of today, DNSWL will be disabled by default in SA's rules.  SA Admins wishing to use it, should add something like this to your local.cf:

#ENABLING DNSWL - BUG 6668
score RCVD_IN_DNSWL_NONE 0 -0.0001 0 -0.0001
score RCVD_IN_DNSWL_LOW 0 -0.7 0 -0.7
score RCVD_IN_DNSWL_MED 0 -2.3 0 -2.3
score RCVD_IN_DNSWL_HI 0 -5 0 -5

This disabling will be effective with the next rules update.
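
For readers unfamiliar with four-value score lines: the columns correspond to SpamAssassin's four score sets (no Bayes/no network, no Bayes/network, Bayes/no network, Bayes/network), so the lines above only apply a score when network tests are enabled. A small Python sketch of that selection, with a hypothetical helper name:

```python
def active_score(scores, use_bayes, use_net):
    """Pick the value for the active score set from a four-value score line."""
    index = (2 if use_bayes else 0) + (1 if use_net else 0)
    return scores[index]


# Values copied from the RCVD_IN_DNSWL_HI line above.
RCVD_IN_DNSWL_HI = (0, -5, 0, -5)
```

With these values the rule contributes -5 whenever network tests are on, with or without Bayes, and 0 otherwise.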

However, please note that we are *very* open to discussing policy changes that will help maintain your project and its success as a spam test without causing FPs, so that it could be re-enabled by default.

Regards,
KAM

svn commit -m 'Changing scores of DNSWL due to FPs caused by their nameservers anti-abuse policies - Bug 6668'
Sending        rules/50_scores.cf
Transmitting file data .
Committed revision 1213299.
Comment 25 Darxus 2011-12-12 17:27:10 UTC
Should these rules be put in a sandbox so they continue to be monitored?  They could also be left enabled with informational scores so reuse could be used, but I doubt that would be worthwhile.
Comment 26 Kevin A. McGrail 2011-12-12 17:36:21 UTC
(In reply to comment #25)
> Should these rules be put in a sandbox so they continue to be monitored?  They
> could also be left enabled with informational scores so reuse could be used,
> but I doubt that would be worthwhile.

I believe the rules have been in a sandbox since 3.3.0.  I am correct, they are in Theo's sandbox which is where they have been living for a while.

As of 3.3.0, I believe, we were publishing hand-generated scores that are higher than masscheck auto-determined.

Overall, the efficacy of DNSWL outside of the FP scores is well established and that's not a barrier to the re-enabling of the scores.

Regards,
KAM
Comment 27 Warren Togami 2011-12-12 19:41:17 UTC
(In reply to comment #26)
> (In reply to comment #25)
> > Should these rules be put in a sandbox so they continue to be monitored?  They
> > could also be left enabled with informational scores so reuse could be used,
> > but I doubt that would be worthwhile.
> 
> I believe the rules have been in a sandbox since 3.3.0.  I am correct, they are
> in Theo's sandbox which is where they have been living for a while.
> 
> As of 3.3.0, I believe, we were publishing hand-generated scores that are
> higher than masscheck auto-determined.

We were publishing hand-generated scores for DNSWL long before 3.3.0.

http://www.mail-archive.com/users@spamassassin.apache.org/msg69546.html
I led the charge to manually reduce these hard-coded scores prior to 3.3.0 due to this issue.  Subsequently I went even further to suggest that we should reduce DNSWL and IADB scores even further as they don't seem to have automatic means of enforcement in place and we see consistent FP's in our tests.  More recently I suggested that we should set all whitelists to -0.01 informational during GA scoring as they have nothing to do with the performance of positive scoring rules and thus can improperly throw off the scoring.

> 
> Overall, the efficacy of DNSWL outside of the FP scores is well established
> and that's not a barrier to the re-enabling of the scores.

http://www.mail-archive.com/users@spamassassin.apache.org/msg69546.html
Well established based on what?  Has anyone looked at the statistics to prove that this situation has not changed?
Comment 28 Matthias Leisi 2011-12-12 20:33:53 UTC
(In reply to comment #27)
> (In reply to comment #26)

> > Overall, the efficacy of DNSWL outside of the FP scores is well established
> > and that's not a barrier to the re-enabling of the scores.

It should be noted that the policy contested in this bug does not cause FPs. It does cause FNs for a small number of users (where other attempts to rectify an unaccepted situation failed). On the other hand, removing the rules will lead to a higher risk of FPs for 99.something % of users.
 
> http://www.mail-archive.com/users@spamassassin.apache.org/msg69546.html
> Well established based on what?  Has anyone looked at the statistics to prove
> that this situation has not changed?

Well established based on http://www.chaosreigns.com/dnswl/. While the stats have a lot of fluctuation which makes it sometimes hard to interpret individual data points, it generally shows the "usefulness" of dnswl.org rules/data.
Comment 29 Matthias Leisi 2011-12-12 20:37:39 UTC
(In reply to comment #26)

> I believe the rules have been in a sandbox since 3.3.0.  I am correct, they are
> in Theo's sandbox which is where they have been living for a while.

Since 3.2.0 (with an error first), http://www.dnswl.org/news/archives/1-dnswl.org-data-and-SpamAssassin-3.2.0.html
Comment 30 Kevin A. McGrail 2011-12-12 21:12:31 UTC
> It should be noted that the policy contested in this bug does not cause FPs. It
> does cause FNs for a small number of users (where other attempts to rectify an
> unaccepted situation failed). On the other hand, removing the rules will lead
> to a higher risk of FPs for 99.something % of users.

To clarify, the DNSWL policy of returning positive answers to gain the attention of administrators with SA installations sending DNS queries to DNSWL through over-quota IPs causes misfiring on the DNSWL Rules.  It's a FP on the Rule regardless of the negative or positive score the rule applies.  

The scoring effect (FP/FN or even a neutral) on the status of the email is another discussion.  

However, I agree, that a FP on a negative scoring rule is likely to cause a FN on an email that is spam.
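
A worked example of that last point, as a Python sketch. The two spam rule names and their scores are made up for illustration; the threshold is SpamAssassin's default required_score of 5.0 and the -5 is the RCVD_IN_DNSWL_HI score quoted earlier in this bug.

```python
REQUIRED_SCORE = 5.0  # SpamAssassin's default required_score


def is_spam(rule_scores, required=REQUIRED_SCORE):
    """A message is tagged as spam when its summed rule scores reach the threshold."""
    return sum(rule_scores.values()) >= required


# Hypothetical spam message hitting two made-up rules (total 6.5).
spam_hits = {"HYPOTHETICAL_SPAM_RULE_A": 3.5, "HYPOTHETICAL_SPAM_RULE_B": 3.0}

# The same message when an over-quota resolver is handed a forced "HI" answer.
forced = dict(spam_hits, RCVD_IN_DNSWL_HI=-5.0)
```

Here the message totals 6.5 and is tagged on its own, but totals 1.5 with the forced hit and passes: a false positive on the rule becomes a false negative on the email.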

Blackhole the requests instead or add more public NS to protect the infrastructure.  

And in case you missed it, two well-experienced RBL infrastructures (including myself) have offered to help with more public nameservers.

Regards,
KAM
Comment 31 Warren Togami 2011-12-12 22:03:48 UTC
Matthias, I join in asking you to please reconsider your approach to misuse prevention.  Causing FN's is hardly an effective means of making the sysadmin take notice, as they aren't losing any important mail like FP's would cause.

Blackholing is a superior approach as it causes DNS timeout delays in mail delivery, which legitimately causes problems for the sysadmin and is more likely to cause them to take notice that there is a problem.

Also, I too have the ability to host a high capacity mirror of DNSWL.  Would you allow us to help?
Comment 32 Steve Freegard 2011-12-12 23:49:14 UTC
Just to add my 2c....

KAM:  This was discussed before with regards to URIBL doing the same stuff, see https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6048

URIBL still has the ability to return 127.0.0.255 for all queries as per their 'abuse' page (see http://uribl.com/about.shtml#abuse), and they were returning positive answers for queries from Google DNS about 3-4 weeks ago (AXB can probably confirm this).

Personally I think it would be a shame to lose either list from the default rulesets despite these practices.
Comment 33 Kevin A. McGrail 2011-12-13 00:08:32 UTC
(In reply to comment #32)
> Just to add my 2c....
> 
> KAM:  This was discussed before with regards to URIBL doing the same stuff, see
> https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6048
> 
> URIBL still has the ability to return 127.0.0.255 for all queries as per their
> 'abuse' page.  See http://uribl.com/about.shtml#abuse and were returning
> positive for queries from Google DNS about 3-4 weeks ago (AXB can probably
> confirm this).
> 
> Personally I think it would be a shame to lose either list from the default
> rulesets despite these practices.

It's good to mention this because we need to implement it the same for URIBL.  My understanding back like 2 years ago was that URIBL changed to a block of the query and not to return false positives.

I can tell you that I have nothing on my public NS for URIBL that gives out FP answers.  I do have the rbldnsd ACL implemented which I believe does interfere but only in a blocking/pretend there is no data way.

Blocking/pretending no data for queries is considered acceptable, I believe.

AXB, can you confirm otherwise?
Comment 34 Steve Freegard 2011-12-13 11:46:39 UTC
(In reply to comment #33)
> I can tell you that I have nothing on my public NS for URIBL that gives out FP
> answers.  I do have the rbldnsd ACL implemented which I believe does interfere
> but only in a blocking/pretend there is no data way.

As the web site says - it uses 'Split Horizon' to do this, so the mirrors wouldn't see who was being blocked and when, as it's being done upstream by supplying different NS records to blocked senders, which in turn return the positive replies.
Comment 35 Kevin A. McGrail 2011-12-13 12:09:15 UTC
(In reply to comment #34)
> (In reply to comment #33)
> > I can tell you that I have nothing on my public NS for URIBL that gives out FP
> > answers.  I do have the rbldnsd ACL implemented which I believe does interfere
> > but only in a blocking/pretend there is no data way.
> 
> As the web site says - it uses 'Split Horizon' to do this, so the mirrors
> wouldn't see who was being blocked and when as it's being done upstream by
> supplying different NS records to blocked senders, which in turn return the
> positive replies.

I've asked the URIBL admins to comment.  But please open a different bug on the URIBL issue.  You've sort of hi-jacked this DNSWL bug.
Comment 36 drmres 2011-12-14 03:36:28 UTC
Please note: Comments by Darxus should be disregarded in this report as he is an active DNSWL admin with direct personal gain interest in this issue.
Comment 37 Darxus 2012-09-09 14:55:38 UTC
(In reply to comment #36)
> Please note: Comments by Darxus should be disregarded in this report as he
> is an active DNSWL admin with direct personal gain interest in this issue.

I have no "direct personal gain interest in this issue."  My relationship with dnswl.org was fully disclosed to everyone long ago.  It's no different from my relationship with spamassassin.