Bug 6048 - URIBL.com tests should be removed from the default SpamAssassin rulesets
Status: RESOLVED WORKSFORME
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Rules
Version: unspecified
Hardware: All All
Importance: P3 normal
Target Milestone: 3.2.5
Assignee: SpamAssassin Developer Mailing List
Reported: 2009-01-21 15:10 UTC by Steve Freegard
Modified: 2009-02-26 15:54 UTC
Description Steve Freegard 2009-01-21 15:10:20 UTC
URIBL.com is the only blacklist that returns *positive* results for *all queries* when a nameserver has been blocked from querying the list, so I think its rules should be removed from the default rulesets.

For example - I just brought up a host on Amazon EC2:

[root@mail ~]# host -t TXT 2.0.0.127.black.uribl.com
2.0.0.127.black.uribl.com descriptive text "174.129.177.176 has been block due to excessive queries."

Unfortunately I didn't check this first, so a score of 2.206 was added to every mail that contained a URI (because all three lists returned positive), and I didn't notice until I had a load of FPs reported.
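
The check described above could be scripted before going live. A minimal sketch in Python, assuming the refusal notice always mentions being blocked for excessive queries (only one sample message is shown here, so the wording match is a guess). `TEST_POINT` and `looks_blocked` are hypothetical names, and the DNS lookup itself is left to the caller:

```python
# Test point queried in the example above.
TEST_POINT = "2.0.0.127.black.uribl.com"


def looks_blocked(txt_record):
    """Return True if a TXT answer reads like a refusal notice.

    Assumption: the notice mentions both "block" and "excessive queries",
    as in the single sample message quoted above.
    """
    text = txt_record.lower()
    return "block" in text and "excessive queries" in text
```

Feeding it the TXT string returned for `TEST_POINT` from each resolver a mail host uses would flag a blocked resolver before any mail is scored.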

For a new or inexperienced SpamAssassin user this would give a bad impression of the software and will most likely cause numerous questions to the users list or tickets raised here, or they might stop using SA altogether.

I understand their reasons for returning positive results (to force people to use a local cache and/or to pay for use). However, that doesn't make it the right thing to do, particularly when URIBL_BLACK, URIBL_GREY and URIBL_RED are in the default SpamAssassin rulesets. They should be removed, and the end-user should make the choice to add them in if they want them.

Kind regards,
Steve.
Comment 1 Matt Kettler 2009-01-21 18:58:29 UTC
Better would be to add a rule that detects this situation.

AFAIK, URIBL returns 127.0.0.255 for all high-query loaders, so something like this might work: (warning - untested theoretical example)


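# Exact match on 127.0.0.255; a bare 255 would be read as a bitmask and match any listing
urirhssub	URIBL_SERVERBLOCKED	multi.uribl.com.        A   127.0.0.255
# The eval argument names this rule, not URIBL_BLACK
body		URIBL_SERVERBLOCKED	eval:check_uridnsbl('URIBL_SERVERBLOCKED')
describe	URIBL_SERVERBLOCKED	Local nameserver denied URIBL service
tflags		URIBL_SERVERBLOCKED	net
#reuse		URIBL_SERVERBLOCKED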

Probably need to co-ordinate with Alex and company.
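
To illustrate why a blocked resolver trips all three lists at once, here is a rough Python sketch of how the last octet of a multi.uribl.com answer could be decoded. The bit values (2, 4, 8) are assumed from the stock URIBL_BLACK/GREY/RED rules, and 127.0.0.255 is only the unconfirmed guess from this comment; `lists_hit` and `is_block_marker` are hypothetical helper names.

```python
# Bit values assumed from the stock URIBL_BLACK/GREY/RED rules.
LIST_BITS = {"URIBL_BLACK": 2, "URIBL_GREY": 4, "URIBL_RED": 8}
BLOCK_MARKER = "127.0.0.255"  # unconfirmed guess from the comment above


def lists_hit(answer):
    """Names of the lists whose bit is set in the last octet of `answer`."""
    last_octet = int(answer.rsplit(".", 1)[1])
    return [name for name, bit in LIST_BITS.items() if last_octet & bit]


def is_block_marker(answer):
    """Exact comparison, so an ordinary listing is never mistaken for a block."""
    return answer == BLOCK_MARKER
```

Because 255 has every low bit set, `lists_hit("127.0.0.255")` reports all three lists, which is why a detection rule would need an exact address match rather than a bitmask test.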
Comment 2 AXB 2009-01-21 23:15:28 UTC
1. If WHOIS contact address is correct/reachable, hosts get warned before they get blocked.

2. Inexperienced SA users will hardly hit URIBL.com public mirror infrastructure with hundreds of thousands/millions of queries/day, and get blocked without a warning.

3. You could have used some other NS for your queries.

4. You obviously didn't test your system thoroughly before going live, otherwise this would have jumped out real fast.

Matt's solution suggestion breaks the URIBL.com's intention of warning the admin that he's abusing the mirrors, even AFTER being contacted and the query hammering would continue.

If this were to happen on a wide scale, it would result in URIBL.com's public mirrors being taken down and becoming a pay-per-use service only, hardly in the interest of the wider user base.


Alex

Comment 3 Steve Freegard 2009-01-22 01:47:47 UTC
(In reply to comment #2)
> 1. If WHOIS contact address is correct/reachable, hosts get warned before they
> get blocked.

Yes - but in the vast majority of cases, you'll be notifying some ISP and not their end-users that are probably using their DNS server.

> 2. Inexperienced SA users will hardly hit URIBL.com public mirror
> infrastructure with hundreds of thousands/millions of queries/day, and get
> blocked without a warning.

Ok - so you haven't blocked the DNS servers of most major providers already then?  Including those dished out to users via the ISP's DHCP scope?

You're thinking a bit one-dimensionally - not everyone is using SA on a UNIX gateway.  What about SAProxy32 users who are running SA on Windows; they'll be doing URI lookups too and they won't be able to just install their own nameserver.

> 3. You could have used some other NS for your queries.

Yes - I should have.  But it got me thinking more about the ramifications of this for new installs - if we go by what you are saying, then the SA installation instructions need to be changed to get users to either 1) install a local nameserver cache and/or 2) make sure they aren't blocked.

You're also ignoring the many people who will have their own DNS servers but use their ISP's nameserver cache as forwarders - which is also a common configuration.

> 4. You obviously didn't test your system thoroughly before going live,
> otherwise this would have jumped out real fast.

Granted - I doubt that I'm alone in being bitten by this however.  But that's not the point.

> Matt's solution suggestion breaks the URIBL.com's intention of warning the
> admin that he's abusing the mirrors, even AFTER being contacted and the query
> hammering would continue.

Matt's solution is workable if the rules are to remain on in the default configuration.

> If this were to happen on a wide scale, it would result in URIBL.com's public
> mirrors being taken down and becoming a pay-per-use service only, hardly in
> the interest of the wider user base.

And returning a positive result for all lookups, designed to intentionally harm the user's results so they notice quickly, is pretty unfriendly too, especially when your rules are default on.

It would be far more sensible to actually firewall port 53 from these IP ranges so that it causes timeouts instead - that would be a far better way to get people to notice without the collateral damage.

Regards,
Steve.
Comment 4 Anthony Howe 2009-01-22 01:59:04 UTC
I strongly disagree with AXB's assessment.

1. No longer possible to rely on WHOIS info. to contact host operators given
privacy issues. Some legit sites may choose to use anonymisers or use a 3rd
party contact like lawyers, the host's ISP or data centre provider. Therefore
the chain by which uribl.com notifications might be made is too easily broken
such that the host operator may not get sufficient warning or none at all due
to a break in communication.

2. An inexperienced SA user could initially start out as low volume, but
increase over time to high volume. They may still remain ignorant or
inexperienced WRT SA and as already noted, they might not receive the notice of
being blocked, maybe because the notice itself was filtered or rejected.

3. Using some other NS assumes that the system builder is the same as the
system operator and is aware that the host is being blocked. Often data centres
and other 3rd parties will build machines made to order, without knowledge of
the previous history of the host and its operators.

4. If the system is high volume, then detecting and determining what is wrong
may not be within the system operator's abilities (eg. inexperienced operator).
The system may have been built by 3rd party and tested with different domains,
IPs, and DNS servers only to be changed on delivery to their customer.

Therefore one cannot assume that the system operator will have received the
uribl.com warning, or will know what to do to correct the problem if they
receive such a notice.

If uribl.com can identify a host as high volume so as to change the results
based on that host's IP address, then they could as easily just blacklist or
drop the request and NOT return a result that would do the mail server harm.
Comment 5 Oli Schacher 2009-01-22 02:14:06 UTC
This is becoming a discussion about URIBL's policies instead of a SA bugreport. What are the intentions behind SA's default configuration? Is it a good default for the majority of users (which includes URIBL) or is it a good default for everyone (which excludes URIBL) ?
Comment 6 AXB 2009-01-22 02:21:00 UTC
(In reply to comment #4)
> I strongly disagree with AXB's assessment.
> 
> 1. No longer possible to rely on WHOIS info. to contact host operators given
> privacy issues. Some legit sites may choose to use anonymisers or use a 3rd
> party contact like lawyers, the host's ISP or data centre provider. Therefore
> the chain by which uribl.com notifications might be made is too easily broken
> such that the host operator may not get sufficient warning or none at all due
> to a break in communication.
> 
> 2. An inexperienced SA user could initially start out as low volume, but
> increase over time to high volume. They may still remain ignorant or
> inexperienced WRT SA and as already noted, they might not receive the notice of
> being blocked, maybe because the notice itself was filtered or rejected.
> 
> 3. Using some other NS assumes that the system builder is the same as the
> system operator and is aware that the host is being blocked. Often data centres
> and other 3rd parties will build machines made to order, without knowledge of
> the previous history of the host and its operators.
> 
> 4. If the system is high volume, then detecting and determining what is wrong
> may not be within the system operator's abilities (eg. inexperienced operator).
> The system may have been built by 3rd party and tested with different domains,
> IPs, and DNS servers only to be changed on delivery to their customer.
> 
> Therefore one cannot assume that the system operator will have received the
> uribl.com warning, or will know what to do to correct the problem if they
> receive such a notice.
> 
> If uribl.com can identify a host as high volume so as to change the results
> based on that host's IP address, then they could as easily just blacklist or
> drop the request and NOT return a result that would do the mail server harm.

Uribl.com has these options and uses them regularly.
When the abuser never stops or hits sky-high levels, then the positive reply is applied.
It's not the generic/default case.


Comment 7 Anthony Howe 2009-01-22 02:37:45 UTC
>> If uribl.com can identify a host as high volume so as to change the results
>> based on that host's IP address, then they could as easily just blacklist or
>> drop the request and NOT return a result that would do the mail server harm.

> Uribl.com has these options and uses them regularly.
> When the abuser never stops or hits sky-high levels, then the positive reply is
> applied.
> It's not the generic/default case.

The above makes no sense. Blacklist in my comment means reject the IP, do NOT return a positive result for ALL queries, as this does harm. Dropping the packet would be better, because then the server sees lots of timeouts that might delay mail, but not cause it to be lost or filtered.

In response to Oli Schacher:

Knowing uribl.com policies or negative behaviour and the impact on SA users is important. I believe that enabling URIBL_* rules by default is BAD for everyone. If the user is sufficiently capable to enable them themselves, then they are sufficiently capable to deal with future fallout if necessary.
Comment 8 AXB 2009-01-22 02:58:32 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > 1. If WHOIS contact address is correct/reachable, hosts get warned before they
> > get blocked.
> 
> Yes - but in the vast majority of cases, you'll be notifying some ISP and not
> their end-users that are probably using their DNS server.

Possibly and there's no way around it.
End users hardly do 1.5 million queries/day, do they?
What NS they take is not our concern.
 
> > 2. Inexperienced SA users will hardly hit URIBL.com public mirror
> > infrastructure with hundreds of thousands/millions of queries/day, and get
> > blocked without a warning.
> 
> Ok - so you haven't blocked the DNS servers of most major providers already
> then?  Including those dished out to users via the ISP's DHCP scope?

Nope - DHCP scopes don't include heavy hitters - nobody sensible will use a hi traffic MTA on a DHCP assigned IP.
Amazingly, "major ISPs" haven't been a concern or a problem.
(whoever may be meant)


> You're thinking a bit one-dimensionally - not everyone is using SA on a UNIX
> gateway.  What about SAProxy32 users who are running SA on Windows; they'll be
> doing URI lookups too and they won't be able to just install their own
> nameserver. 

Somebody using SAproxy doesn't send millions of queries/day, EVER!
And if so, even Windows boxes can use local recursors... there are several out there, for free.


> > 3. You could have used some other NS for your queries.
> 
> Yes - I should have.  But it got me thinking more about the ramifications of
> this for new installs - if we go by what you are saying, then the SA
> installation instructions need to be changed to get users to either 1) install
> a local nameserver cache and/or 2) make sure they aren't blocked.

It's common practice that hi traffic MTAs use local recursors; if somebody doesn't, he lives with the consequences (eg: Spamhaus blocks)

> You're also ignoring the many people who will have their own DNS servers but
> use their ISP's nameserver cache as forwarders - which is also a common
> configuration.

SMEs do that, and these are hardly hi traffic. 
We're talking about corps/ISPs/vendors with hi traffic boxes, not the average 25 user SME box.
Uribl.com cannot be made responsible for lack of users' skills.

> > 4. You obviously didn't test your system thoroughly before going live,
> > otherwise this would have jumped out real fast.
> 
> Granted - I doubt that I'm alone in being bitten by this however.  But that's not
> the point.

The point is that you've been caught TWICE by the same issue and now you are offloading your repeated hurt to URIBL instead of kicking yourself in the butt :-)

> > Matt's solution suggestion breaks the URIBL.com's intention of warning the
> > admin that he's abusing the mirrors, even AFTER being contacted and the query
> > hammering would continue.
> 
> Matt's solution is workable if the rules are to remain on in the default
> configuration.
> 
> > If this were to happen on a wide scale, it would result in URIBL.com's public
> > mirrors being taken down and becoming a pay-per-use service only, hardly in
> > the interest of the wider user base.
> 
> And returning a positive result for all lookups, designed to intentionally
> harm the user's results so they notice quickly, is pretty unfriendly too,
> especially when your rules are default on.

It isn't - it's an effective FAST way to tell the user something is VERY wrong.
If someone gets the positive replies, he's been pushing it too far, for too long.

> It would be far more sensible to actually firewall port 53 from these IP ranges
> so that it causes timeouts instead - that would be a far better way to get
> people to notice without the collateral damage.

Full queues are often harder to debug; a hard fail makes the point very efficiently, which is our last resort to stop abuse and keep donated public mirrors alive.

Bottom line is that you, as an appliance/services vendor, should get a datafeed for your customers/services and not rely on public mirror infrastructure for your business. To your users' benefit, have them query your rbldnsd instances for all BLs you provide by default.

Steve, I consider you a friend, but your rant is not an SA issue.
It's your personal hurt and it has little to do with the hundreds of thousands of SA setups happily querying URIBL.com and other BLs and not getting blocked.

Alex
Comment 9 Steve Freegard 2009-01-22 03:27:17 UTC
(In reply to comment #8)
> Bottom line is that you as an appliance/services vendor should get a datafeed
> for your customers/services and not rely on public mirror infrastructure for
> your business, to your user's benefit, have them query your rbldnsd instances
> for all BLs you provide per default.
> 
> Steve, I consider you a friend, but your rant is not an SA issue.
> It's your personal hurt and it has little to do with the hundreds of thousands of
> SA setups happily querying URIBL.com and other BLs and not getting blocked.

Whoa! You've got me completely wrong here.  I do not have any ulterior motives.  It was with careful consideration that I raised this ticket.  I didn't want it to turn into a flame-war or to start flinging accusations around....

The whole point of the ticket isn't to debate URIBL policies on how the list is handled - that is entirely up to the list admins and the mirrors.  The question is whether the SA devs feel that URIBL is appropriate to be enabled by default, given the practice of returning positive results for all queries in some cases, whatever those cases may be, as they are not within the control or knowledge of the SA devs.

Regards,
Steve.
Comment 10 Anthony Howe 2009-01-22 03:39:56 UTC
This ticket is digressing into a mud slinging match. The issue of using public vs. commercial lists is irrelevant here. The issue is how uribl.com's current query limit policies (blacklist, drop, always positive, nose pick, etc.) can affect SA users negatively should they ever exceed the limits. As they stand now, if a host makes excessive queries and uribl.com puts it on some internal no-service list, then all the host's queries return "always positive" results that score excessively high against a scale of 5, which in turn can cause mail to be pushed over the threshold that much more easily.

SA should consider one or more of the following actions:

a) reconsider making URIBL_* rules default on, by making them default off, allowing capable users the choice to use them and their potential future fallout.

b) add code to somehow detect the "no-service" situation and subsequently discontinue querying uribl.com

c) give a much lower score to URIBL_* rules.
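
Option (b) could look roughly like the following Python sketch, which stops querying a list once every one of the last N lookups came back positive. This is only an illustration of the idea, not SpamAssassin code: `ListBreaker` and the window size are hypothetical, and the state lives in a single process.

```python
from collections import deque


class ListBreaker:
    """Disable a list once every one of the last `window` lookups was a hit."""

    def __init__(self, window=20):
        self.recent = deque(maxlen=window)  # most recent lookup results
        self.disabled = False

    def record(self, listed):
        """Store one lookup result and trip the breaker if all were positive."""
        self.recent.append(bool(listed))
        if len(self.recent) == self.recent.maxlen and all(self.recent):
            self.disabled = True

    def should_query(self):
        return not self.disabled
```

A run of genuine spam could also trip such a breaker, so the window would need to be large enough that an unbroken series of hits is implausible for normal mail.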
Comment 11 Matt Kettler 2009-01-22 03:45:39 UTC
Alex, I take a bit of exception to this statement:

"Matt's solution suggestion breaks the URIBL.com's intention of warning the
admin that he's abusing the mirrors, even AFTER being contacted and the query
hammering would continue."

My solution doesn't break that intention, it makes it clearer that they are being blocked by URIBL.

I'm not suggesting that this over-ride the scores of URIBL_BLACK, or otherwise disable that rule. I'm suggesting it be added so admins quickly know why URIBL_BLACK starts matching all messages.

In fact, URIBL_SERVERBLOCKED probably should have a quite significant score on its own. (ie: +20)

So a hypothetical system which is blocked would start seeing both URIBL_BLACK and URIBL_SERVERBLOCKED match. At that point it should, theoretically, be very clear to them why it's going on.

In the current situation, they'll see URIBL_BLACK match all messages, but there's nothing telling them why.

Let's work together on a solution that does make it abundantly clear. This will be better for URIBL, as you'll get quicker reactions from the folks you're blocking (if they're using SpamAssassin).

Of course, we'll have to deal with "Why doesn't it just offset the score of URIBL_BLACK so my mail keeps working" type questions, but that wouldn't work as the intent of the rule is to raise awareness, not dull the pain.
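
The scoring arithmetic behind this can be sketched in Python. The 2.206 figure comes from the original description, +20 is the value floated above, and 5.0 is assumed to be the tagging threshold (the SpamAssassin default); `is_tagged` is a hypothetical helper.

```python
REQUIRED_SCORE = 5.0      # assumed tagging threshold (SpamAssassin default)
URIBL_FALSE_HITS = 2.206  # total added when all three lists answer positive
SERVERBLOCKED = 20.0      # score floated for URIBL_SERVERBLOCKED


def is_tagged(base_score, extra):
    """A message is tagged once its total reaches the threshold."""
    return base_score + extra >= REQUIRED_SCORE


# Without the new rule only borderline mail tips over, which is easy to miss.
borderline_only = (is_tagged(1.0, URIBL_FALSE_HITS), is_tagged(3.0, URIBL_FALSE_HITS))

# With the new rule even a message scoring 0 is tagged, so the block is obvious.
everything = is_tagged(0.0, URIBL_FALSE_HITS + SERVERBLOCKED)
```

Under these assumptions a blocked host today sees only a partial rise in tagged mail, whereas a positively scored rule would make every URI-bearing message stand out.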





Comment 12 Matt Kettler 2009-01-22 03:53:51 UTC
> SA should consider one or more of the following actions:
> 
> a) reconsider making URIBL_* rules default on, by making them default off,
> allowing capable users the choice to use them and their potential future
> fallout.
> 
> b) add code to somehow detect the "no-service" situation and subsequently
> discontinue querying uribl.com
> 
> c) give a much lower score to URIBL_* rules.
> 

I've already proposed:

d) add rules that detect the "no-service" situation and subsequently increase attention so the admins can take their own action to stop querying it.

And I think that's a valid option.

b) would work too, but that's a bit tricky since SpamAssassin would lose this information every time a new instance was created, and have to relearn it. That might throttle the problem a bit, reducing the query load to only happen once each time a new spamd child is created, but it wouldn't solve it for Alex.

a) has already been discussed at length, and as far as I'm concerned, it is off the table. We've discussed the whole "free for most but not high load" topic at length regarding both Uribl and SpamHaus. No service, free or paid, can withstand infinite load, so all services need a cutoff somewhere, written or otherwise. One man's daily traffic is another man's DoS attack.

c) defeats the purpose of using URIBL in SA in the first place.
Comment 13 Anthony Howe 2009-01-22 03:59:27 UTC
Matt,

The problem with your solution is that, should uribl.com neglect to return 127.0.0.255 (fat or forgetful fingers) or deliberately choose to return 127.0.0.2 (or other) to circumvent your URIBL_SERVERBLOCKED, the SA rules will continue to behave in such a way as to do the mail server harm.
Comment 14 Anthony Howe 2009-01-22 04:10:44 UTC
Matt,

Not sure I like the idea of a URIBL_SERVERBLOCKED creating an even higher score. Having the token appear in logs and/or headers is fine, but you're assuming that the system operator will be made aware of the issue quickly.

Using milter-spamc or BarricadeMX, the recipient would never know unless they read the message headers. The SA reports are never added to the message body, only headers. The sender may or may not get a DSN with the report summary as some sites might choose not to disclose the information out of fear it might aid spammers.

So it isn't clear that a really high score will get administrator attention sufficiently quickly. Better to disable queries to the list and proceed with other SA tests, there are certainly enough other ones to fall back on.
Comment 15 Anthony Howe 2009-01-22 04:22:31 UTC
> b) would work too, but that's a bit tricky since SpamAssassin would lose this
> information every time a new instance was created, and have to relearn it. That
> might throttle the problem a bit, reducing the query load to only happen once
> each time a new spamd child is created, but it wouldn't solve it for Alex.

Surely there must be some way for an SA child to return information to the parent via shared memory or a pipe, so that the SA parent could then remove uribl.com before spawning new instances.

Comment 16 Justin Mason 2009-01-22 04:38:50 UTC
each child is stateless in that respect; this is a design choice, since it allows "farms" of ~independent spamd instances.  shared state is done via on-disk persistent files, or at scale, SQL databases.  so it's definitely non-trivial and unlikely in this case.
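
For illustration only, the on-disk approach could be as small as a marker file that any worker touches when it detects a block and that every worker checks before querying. The Python sketch below is not spamd code; `mark_blocked`, `is_blocked` and the one-hour expiry are hypothetical choices.

```python
import os
import time


def mark_blocked(path):
    """Create or refresh a marker file; its mtime records when the block was seen."""
    open(path, "a").close()
    os.utime(path, None)


def is_blocked(path, max_age=3600.0, now=None):
    """True if a marker exists and is younger than `max_age` seconds."""
    now = time.time() if now is None else now
    try:
        stamp = os.path.getmtime(path)
    except OSError:
        return False  # no marker file means no known block
    return now - stamp < max_age
```

Letting the marker expire means lookups resume on their own once the block is lifted, at the cost of a few wasted queries per expiry period. This only helps instances that share a filesystem, which is part of why it is non-trivial for independent farms.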
Comment 17 Matt Kettler 2009-01-22 05:25:28 UTC
(In reply to comment #14)

> So it isn't clear that a really high score will get administrator attention
> sufficiently quickly. 

If all mail being tagged doesn't get their attention quickly, I'm not sure what will. I can't think of anything SA can do that would get attention faster.

Personally, this is also a lot better than the current situation, where merely "lots more than normal" gets tagged. That's a lot more subtle than a 100% cutoff. That tends to get attention rather slowly, and the problem could persist for weeks before being detected.

Perhaps a score lower than 20 might be better, due to some folks doing auto-delete on high scores. However, my basic point is that the rule should not be scored negative or otherwise offset URIBL_BLACK. 

I suspect Alex believed I was suggesting a negative scoring compensation rule to avoid the mail being tagged. That is not the case, and if it has any nonzero score, it should be positive to make the situation more visible.


> Better to disable queries to the list and proceed with
> other SA tests, there are certainly enough other ones to fall back on.

Disabling may be better, but it's not going to be easy to implement in SpamAssassin's architecture.

Of course, patches welcome if you want to implement that solution.




Comment 18 AXB 2009-01-22 05:31:07 UTC
(In reply to comment #17)
> (In reply to comment #14)
> 
> > So it isn't clear that a really high score will get administrator attention
> > sufficiently quickly. 
> 
> If all mail being tagged doesn't get their attention quickly, I'm not sure what
> will. I can't think of anything SA can do that would get attention faster.
> 
> Personally, this is also a lot better than the current situation, where merely
> "lots more than normal" gets tagged. That's a lot more subtle than a 100%
> cutoff. That tends to get attention rather slowly, and the problem could
> persist for weeks before being detected.
> 
> Perhaps a score lower than 20 might be better, due to some folks doing
> auto-delete on high scores. However, my basic point is that the rule should not
> be scored negative or otherwise offset URIBL_BLACK. 
> 
> I suspect Alex believed I was suggesting a negative scoring compensation rule
> to avoid the mail being tagged. 

Yep.. that was what I (mis)understood.

> That is not the case, and if it has any nonzero
> score, it should be positive to make the situation more visible.

Which is the point/desired effect.




Comment 19 Justin Mason 2009-01-22 07:35:27 UTC
(In reply to comment #17)
> (In reply to comment #14)
> 
> > So it isn't clear that a really high score will get administrator attention
> > sufficiently quickly. 
> 
> If all mail being tagged doesn't get their attention quickly, I'm not sure what
> will. I can't think of anything SA can do that would get attention faster.

I don't know if that is appropriate at all.  Bear in mind that in some situations, a sufficiently high score would result in the mail being bounced!  If going over a URIBL query limit results in all mail coming in to your site bouncing, that's a very serious problem. :(

I would be in favour of a well-known test endpoint: "blocked.multi.uribl.com". for most queries that would return "0.0.0.0" with a long TTL.  for sites blocked due to too many queries, that would return "255.255.255.255" with a long TTL.  These are very cacheable and would be extremely low-load.  This provides a way for clients like SA to query and determine if a caller is overloading the servers; in that situation we can issue warnings, fire informational rules, log stuff to the syslogs etc.
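
A client-side reading of that proposal might look like the Python sketch below. Note that "blocked.multi.uribl.com" and its two answers are only the proposal made in this comment, not an existing service, and `classify_status` is a hypothetical name.

```python
# Proposed (not existing) status endpoint and its two proposed answers.
STATUS_NAME = "blocked.multi.uribl.com"
ANSWER_OK = "0.0.0.0"
ANSWER_BLOCKED = "255.255.255.255"


def classify_status(answer):
    """Map the A record returned for STATUS_NAME to a status string."""
    if answer == ANSWER_OK:
        return "ok"
    if answer == ANSWER_BLOCKED:
        return "blocked"
    # Anything else (including None for no answer) is treated as unknown,
    # so a lookup failure never changes how mail is scored.
    return "unknown"
```

A caller could log a warning or fire an informational rule on "blocked" while leaving message scores untouched.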
Comment 20 Steve Freegard 2009-01-22 07:51:22 UTC
(In reply to comment #19)
> (In reply to comment #17)
> > (In reply to comment #14)
> > 
> > > So it isn't clear that a really high score will get administrator attention
> > > sufficiently quickly. 
> > 
> > If all mail being tagged doesn't get their attention quickly, I'm not sure what
> > will. I can't think of anything SA can do that would get attention faster.
> 
> I don't know if that is appropriate at all.  Bear in mind that in some
> situations, a sufficiently high score would result in the mail being bounced! 
> If going over a URIBL query limit results in all mail coming in to your site
> bouncing, that's a very serious problem. :(

Yes - this is exactly the reason I raised this bug.  The behaviour is unique to URIBL and I have AlexB's assurances that deliberate positive results for all queries is rarer than I think it is (as I've been hit by this collateral damage several times now).
 
> I would be in favour of a well-known test endpoint: "blocked.multi.uribl.com".
> for most queries that would return "0.0.0.0" with a long TTL.  for sites
> blocked due to too many queries, that would return "255.255.255.255" with a
> long TTL.  These are very cacheable and would be extremely low-load.  This
> provides a way for clients like SA to query and determine if a caller is
> overloading the servers; in that situation we can issue warnings, fire
> informational rules, log stuff to the syslogs etc.

There is already a BCP proposal for this:

http://www.ietf.org/internet-drafts/draft-irtf-asrg-bcp-blacklists-05.txt

   Note: In Section 3.4 it is noted that some DNSBLs have shut down in
   such a way to list all of the Internet.  Further, in Section 3.5,
   DNSBL operators MUST NOT list 127.0.0.1.  Therefore, a positive
   listing for 127.0.0.1 SHOULD be interpretable as an indicator that
   the DNSBL has started listing the world and is non-functional.

Although this paragraph is about shutting down a DNSBL; it is in essence exactly what URIBL are trying to achieve on a querying IP level - so I believe the same applies.
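
As a sketch of how a client could apply that paragraph, the Python below builds the query name for the 127.0.0.1 sentinel and treats a positive answer as "do not trust this list". The zone name is just an example, the helper names are hypothetical, and whether URIBL would ever answer this way is an assumption.

```python
import ipaddress

SENTINEL_IP = "127.0.0.1"  # must never be listed, per the draft quoted above


def dnsbl_query_name(ip, zone):
    """Reverse the IPv4 octets and append the zone (the usual DNSBL query form)."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone.strip(".")


def list_is_usable(sentinel_is_listed):
    """A positive answer for the sentinel means results cannot be trusted."""
    return not sentinel_is_listed
```

For example, `dnsbl_query_name(SENTINEL_IP, "multi.uribl.com")` gives `1.0.0.127.multi.uribl.com`, which a client could look up once and cache.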
Comment 21 Anthony Howe 2009-01-22 08:33:55 UTC
The idea of querying 127.0.0.1 as a means of identifying whether the list is dead or you're banned is nice. However, polling for it once in a while seems problematic for SA (maybe less so in other clients).

So I have a question. RFC 1035 allows the query packet question section to contain more than one question (in theory). So would it not be possible to place BOTH the query, be it domain or IP, and the 127.0.0.1 question into the question section of the same packet? This would optimise the DNS queries. The only question is whether DNS servers support multiple questions in a single packet, as implied by RFC 1035.
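
For reference, here is a minimal Python sketch of what such a packet would look like on the wire: the RFC 1035 header carries a question count (QDCOUNT), followed by that many question entries. The query names are examples only, and as far as I know common servers reject or ignore packets with QDCOUNT greater than 1, so this shows the format rather than a workable optimisation.

```python
import struct


def encode_qname(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.strip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(names, query_id=0x1234):
    """Build a DNS query with one question (type A, class IN) per name."""
    # Header fields: ID, flags (RD bit set), QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    header = struct.pack("!HHHHHH", query_id, 0x0100, len(names), 0, 0, 0)
    questions = b"".join(
        encode_qname(n) + struct.pack("!HH", 1, 1) for n in names
    )
    return header + questions
```

Calling `build_query` with a list of two names yields a header whose QDCOUNT field is 2, which is the case the question above is about.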

Comment 22 Jeff Chan 2009-01-22 09:57:47 UTC
All of this is a lot of inappropriate hand waving for a technical
solution to what is a policy problem with the list.  Tagging all
messages is not much different from (monkeys?) blacklisting the
entire Internet as a way to get attention.  Both are bad list
policies.

The list would be much better off blocking DNS queries than
returning a deliberately misleading result.

I realize this is probably not an appropriate response for a bug
ticket, but it's probably unwise to spend a lot of work on a
technical solution to a problem that's fundamentally an error of
policy.  Despite that, I hope this observation is useful feedback
which results in constructive solutions.
Comment 23 Dallas Engelken 2009-01-22 11:03:50 UTC
(In reply to comment #3)
> (In reply to comment #2)
> 
> It would be far more sensible to actually firewall port 53 from these IP ranges
> so that it causes timeouts instead - that would be a far better way to get
> people to notice without the collateral damage.
> 

We have no management of most of the mirrors as they are set up by the owner, and many of them are not just serving zones for uribl, so filtering heavy users from querying *.uribl.com at the packet level is not possible.

rbldnsd ACLs actually have an 'ignore' option, which is the next closest thing to packet level filtering, and we initially went with that option.  Shortly after, we found the mirrors had a 300% increase in traffic, as the non-response actually caused a client side timeout and the DNS retry features in the resolver code caused resends of the query multiple times.

So we've settled on the 'empty' option, which results in NXDOMAIN being returned to all queries.   We also make every attempt to notify the end user.   If no action is taken, only then would it change to a positive response.  We have over 40k unique IPs hitting our mirrors, and just 120 positive ACLs for the heaviest users who never took action on the negative ACL.
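
The retry effect described for the 'ignore' option can be modelled very roughly in Python. This is a simplification under stated assumptions: `ignored_share` (the fraction of traffic coming from ignored clients) and a fixed number of resends per unanswered query are both hypothetical parameters, not measured values.

```python
def traffic_factor(ignored_share, resends_per_query):
    """Traffic relative to normal when ignored clients resend unanswered queries.

    Answered queries are sent once; each ignored query is sent once and then
    resent `resends_per_query` times after timing out.
    """
    return 1.0 + ignored_share * resends_per_query


# If all traffic came from ignored clients resending 3 times, load would be 4x,
# i.e. a 300% increase.
worst_case = traffic_factor(1.0, 3)
```

Answering with NXDOMAIN removes the timeout, so in this model the factor stays at 1.0 regardless of how many clients are on the ACL.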

I'm okay with whatever SA wants to do.   I don't think URIBL ACL policy needs to change.  With the public DNS infrastructure we have, I don't see any other effective way to stem the abuse, unless we take all the donated public mirrors offline and only serve mirrors which are controlled by us.  Then we could put packet filtering in place.   If we did that, I know there are some donated mirrors that would be upset to lose those public queries.  Can't make everyone happy I suppose.

D

Comment 24 Steve Freegard 2009-01-22 11:48:09 UTC
(In reply to comment #23)

Thanks for the technical explanation.

> I'm okay with whatever SA wants to do.   I don't think URIBL ACL policy needs to
> change.  With the public DNS infrastructure we have, I don't see any other
> effective way to stem the abuse.

Just an idea - unless I'm missing something - why don't you simply move the ACLs up a level instead, to the uribl.com zone, so that if you blacklist an IP it prevents the IP address from being able to query the NS records for black/grey/red.uribl.com (e.g. the nameserver returns 'REFUSED'; although NXDOMAIN might work better for negative caching)? Granted, you'll have to wait up to 24 hours before the host will actually be prevented from querying, but it would still do what you need it to.  That way the traffic stops dead at Prolexic nameservers instead of the public mirrors and everyone's happy.
Comment 25 Dallas Engelken 2009-01-22 11:58:46 UTC
(In reply to comment #24)
> (In reply to comment #23)
> 
> Thanks for the technical explanation.
> 
> > I'm okay with whatever SA wants to do.  I don't think URIBL ACL policy needs to
> > change.  With the public DNS infrastructure we have, I don't see any other
> > effective way to stem the abuse.
> 
> Just an idea - unless I'm missing something - why don't you simply move the
> ACLs up a level instead to the uribl.com zone so that if you blacklist an IP
> then it prevents the IP address from being able to query the NS records for
> black/grey/red.uribl.com (e.g. the nameserver returns 'REFUSED'; although
> NXDOMAIN might work better for negative caching) granted you'll have to wait up
> to 24 hours before the host will actually be prevented from querying; but it
> would still do what you need it to.  That way the traffic stops dead at
> Prolexic nameservers instead of the public mirrors and everyone's happy.
> 


I've asked Prolexic this in the past and they say they can't do it, as the ACLs apply globally and it would be static entries and a named reload anyhow, so there is no way for us to manage or automate that flow.

We would have to move our primary nameservers off Prolexic onto hardware we can manage ourselves, which we have discussed doing.

Comment 26 Anthony Howe 2009-01-22 12:28:52 UTC
Dallas,

(In reply to comment #25)
> I've asked Prolexic this in the past and they say they can't do it as the ACLs
> apply globally, and it would be static entries and a named reload anyhow, so
> no way for us to manage or automate that flow.

I don't understand how they could NOT provide ACLs that work on a source/destination combination. My understanding is that most firewalls and packet filters allow source, destination, or the pair, along with additional parameters like port, protocol, sex, etc., to be blended into packet filter rules. Given you chose this data centre host for their ability to prevent DDoS, which should include sufficient means to block source/destination pairs on demand of the client (or provide a client interface), they should be able to provide that service at their gateways. 

From what I understand here, the level of service uribl.com gets from its provider is insufficient for uribl.com's function. And surely they know what you are doing, such that they should be able to help you with DDoS issues and high volume disputes through some form of ACL or automated firewall rule creation. 

To me it sounds like your data centre is not capable or unwilling to provide the level of service required to allow custom(er) rules. 
Comment 27 Dallas Engelken 2009-01-22 12:40:11 UTC
(In reply to comment #26)
> Dallas,
> 
> (In reply to comment #25)
> > I've asked Prolexic this in the past and they say they can't do it as the ACLs
> > apply globally, and it would be static entries and a named reload anyhow, so
> > no way for us to manage or automate that flow.
> 
> I don't understand how they could NOT provide ACLs that work on a
> source/destination combination. My understanding of most firewalls and packet
> filters allows for source, destination, or the pair along with additional
> parameters like port, protocol, sex, etc. to be blended into packet filter
> rules. Given you chose this data centre host for their abilities to prevent
> DDoS, which should include sufficient means to block source/destination pairs
> on demand of the client (or provide a client interface), they should be able to
> provide service at their gateways. 
> 
> From what I understand here, the level of uribl.com's service with their
> provider is insufficient to uribl.com's function. And surely they know what you
> are doing such that they should be able to help aid you in DDoS issues and high
> volume disputes through some form of ACL or automated firewall rule creation. 
> 
> To me it sounds like your data centre is not capable or unwilling to provide
> the level of service required to allow custom(er) rules. 
> 


I'll be the last to throw Prolexic under the bus, seeing that when we were getting drilled by several 400+ Mbit/s DDoS attacks last year they stepped up to help us for free.

So I suppose, "You get what you pay for", right?

Comment 28 Matthias Leisi 2009-01-23 05:39:59 UTC
(In reply to comment #23)

> We have no management of most of the mirrors as they are set up by the owner,
> and many of them are not just serving zones for uribl, so filtering heavy users
> from querying *.uribl.com at the packet level is not possible.

Just for the record - at dnswl.org we have similar issues. We do not operate most of the nameserver resources donated to us ourselves -- these servers are generally managed by the bodies donating the resource to us (and, more often than not, are used for multiple DNSxLs, both public and private). 

Sometimes we manage to make contact with high volume[*] users, but other times they either cannot be found or just ignore our emails. 

[*] Which we define as "> 100'000 queries/24 hours". 
Comment 29 Justin Mason 2009-01-23 06:51:21 UTC
this is getting OT.  but we could use the "admin's email address" field that
we set during Makefile.PL, and periodically (or randomly) perform a DNS query
to "jm.at.jmason.org.admin-address.multi.uribl.com", hence notifying the URIBL of the admin address for that querying IP ;)
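
A minimal sketch of how such a query name could be built, in Python for illustration only. The "admin-address.multi.uribl.com" suffix is just the example from this comment, not an existing URIBL zone, and the function name is hypothetical.

```python
def admin_query_name(email, zone="admin-address.multi.uribl.com"):
    """Encode an admin address as a DNS name, e.g. user@example.org ->
    user.at.example.org.<zone>. The zone is the hypothetical one from this
    comment; nothing is known to answer queries for it."""
    local, sep, domain = email.partition("@")
    if not sep or not local or not domain:
        raise ValueError("not an email address: %r" % email)
    return "%s.at.%s.%s" % (local, domain, zone)
```

Note that a local part which itself contains ".at." would make the encoding ambiguous; a real scheme would need to escape it.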
Comment 30 Anthony Howe 2009-01-23 11:32:47 UTC
(In reply to comment #29)
> this is getting OT.  but we could use the "admin's email address" field that

Half this ticket is off-topic, what's a little more.

You neglect package installs, which do not pose the question concerning admin address.

> we set during Makefile.PL, and periodically (or randomly) perform a DNS query
> to "jm.at.jmason.org.admin-address.multi.uribl.com", hence notifying the URIBL
> of the admin address for that querying IP ;)

Meh.

I think Steve Freegard's suggestion of testing for 127.0.0.1, based on the Internet draft for DNSBL BCP, is better. Your suggestion would have to work its way into the Internet draft to make sure all lists adopted it. The 127.0.0.1 test is workable now IMO.



Comment 31 Sidney Markowitz 2009-01-23 11:53:16 UTC
(In reply to comment #29 and comment #30)

If the problem is a small percentage of installations that are super high volume and not getting or ignoring attempts to reach them, I don't see how they can count on the admin email address being set correctly or the admin responding to the email. And that still leaves the policy question at the heart of this bug report, which is whether we allow an RBL that has this particular policy to be enabled by default.

Regarding the 127.0.0.1 suggestion, given that each child process is independent and all the DNS queries are sent out at once to be checked asynchronously as the replies come in, how would that work? Won't we have to have an additional 127.0.0.1 query per RBL and then wait for the reply before sending the actual queries? Wouldn't that make every process wait the extra time for the replies to that query? And wouldn't that only reduce the load from those very high volume users by a factor of the average number of URLs queried per message (because they still get one 127.0.0.1 query per message), which might not be enough to solve the problem from them anyway? Or is that last issue not a problem because you can make the response to 127.0.0.1 have a very large TTL?
Comment 32 Anthony Howe 2009-01-23 12:04:40 UTC
(In reply to comment #31)
> (In reply to comment #29 and comment #30)
> Regarding the 127.0.0.1 suggestion, given that each child process is
> independent and all the DNS queries are sent out at once to be checked
> asynchronously as the replies come in, how would that work? Won't we have to
> have an additional 127.0.0.1 query per RBL and then wait for the reply before
> sending the actual queries, and wouldn't that make every process have to wait
> the extra time for the replies to that query, and wouldn't that only reduce the
> load from those very high volume users by a factor of the average number of
> URLs queried per message (because they still get one 127.0.0.1 query per
> message) which might not be enough to solve the problem from them anyway? Or is
> that last issue not a problem because you can make the response to 127.0.0.1
> have a very large TTL?

As I mentioned before, DNS allows for multiple questions in a single query. So one could query both "some.domain.black.uribl.com" and "1.0.0.127.black.uribl.com" in the same packet. If you get a result back that includes the 127.0.0.1 check, then you drop a file into the spamassassin local config dir or into /var/tmp like uribl.com.OFF or some such. Each child can check for a blacklist .OFF file before proceeding.

The above assumes that the DNS software supports multiple questions per query. The alternative is to periodically poll 127.0.0.1, which could be done by the parent process and so disable a blacklist before spawning new children.

Comment 33 mouss 2009-01-25 01:39:57 UTC
(In reply to comment #30)
> (In reply to comment #29)
> > this is getting OT.  but we could use the "admin's email address" field that
> 
> Half this ticket is off-topic, what's a little more.
> 
> You neglect package installs, which do not pose the question concerning admin
> address.
> 
> > we set during Makefile.PL, and periodically (or randomly) perform a DNS query
> > to "jm.at.jmason.org.admin-address.multi.uribl.com", hence notifying the URIBL
> > of the admin address for that querying IP ;)
> 
> Meh.
> 
> I think Steve Freegard's suggestion testing for 127.0.0.1 based on the Internet
> draft for DNSBL BCP is better. Your suggestion would have to work its way into
> the Internet draft to make sure all lists adopted it. The 127.0.0.1 test is
> workable now IMO.
> 

and this or alternatives can be implemented in a cron job, instead of SA. After all, such issues are not specific to SA.

That said, maybe one or two lines in INSTALL and other docs are worth the pain?
Comment 34 AXB 2009-02-26 14:07:12 UTC
FYI:

Uribl.com's Positive ACLs have now been disabled and replaced with split-horizon dns filtering.

For more info, see http://www.uribl.com/index.shtml
Comment 35 Steve Freegard 2009-02-26 14:44:54 UTC
(In reply to comment #34)
> FYI:
> 
> Uribl.com's Positive ACLs have now been disabled and replaced with
> split-horizon dns filtering.
> 
> For more info, see http://www.uribl.com/index.shtml
> 

Thanks and well done to everyone involved in making this happen.
Comment 36 Justin Mason 2009-02-26 15:54:30 UTC
BLOCKED - SPLIT-HORIZON DNS FILTER

  # host -tA blocked.uribl.com
  blocked.uribl.com has address 127.0.0.255

  * A 'ping' instead of 'host -tA' will also work. 
  * A negative response means the NS is not blocked at this level.



cool.  Would it be worthwhile adding code to SA to randomly query that hostname to check to see if queries are blocked?
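
A rough sketch of what such a check might look like, in Python for illustration rather than as SA code. It relies only on what the pasted text states: blocked.uribl.com resolves to 127.0.0.255 when the querying nameserver is blocked, and a negative response means it is not. The function name and the injectable resolver are assumptions made for the example.

```python
import socket

BLOCKED_HOST = "blocked.uribl.com"
BLOCKED_ADDR = "127.0.0.255"


def uribl_blocked(resolve=socket.gethostbyname):
    """Return True if the nameserver we query through is blocked by URIBL."""
    try:
        return resolve(BLOCKED_HOST) == BLOCKED_ADDR
    except OSError:
        # Negative response (or lookup failure): not blocked at this level.
        return False
```

A caller could run this occasionally and skip the URIBL rules while it returns True.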