SA Bugzilla – Bug 370
Better RBL handling
Last modified: 2002-06-09 15:34:16 UTC
As promised, here's the patch in Bugzilla, along with the announcement email.

So, why did I have to write this patch?

1) I use MAPS DUL and relays.osirusoft.com, which also has a DUL section. The problem is that I had machines that were being penalized twice for being on more than one DUL. This actually led to some non-spam being reported as spam more than once (and my threshold is 7, not even 5).
2) Furthermore, we shouldn't overly penalize people for sending mail from a dialup IP if they properly relayed through their ISP.
3) There was no support for querying multi RBL zones like relays.osirusoft.com. Let's say an IP is flagged with a score of 2.0 as an open relay in orbs. Because there is already a match for set relay, osirusoft checks won't run against it, even if you would have had a match (2.0) plus a return code of 127.0.0.6, which would have given you another 3.0, so you lose a perfect 5.0 score.
4) Probably a few other problems of that sort.

Over the last 7-10 days, I tried different ways to fix this, some of them rather misguided: trying not to run tests if other ones had run, and having overrides to ignore the first IP for dul, which gets interesting if you compare checks in set dialup with checks in set relay that can return a match of dialup. Needless to say, this went nowhere; I couldn't understand my own code before long.

The next idea, to change the score of some rules to 0 if other ones had already matched, seemed misguided too, especially since I wasn't sure whether it would cause problems with spamd.

I eventually came up with this: adding rules to counter other ones, and using a function called check_two_rbl_results to add a negative score if two RBLs matched on the same thing. Putting osirusoft in set relay was also a mistake; I've put it in its own osirusoft set since it can have many different meanings.
Last, but not least, an RBL rule whose name ends with -firsthop is magic: it only matches on the originating IP, provided there is a relay in the middle.

The rest should make sense if you look at the diff and the example RBL rules in the docs.
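To make the counter-rule idea concrete, here is a minimal Python sketch of the scoring effect described in this comment. It is not the code from the patch: only the name check_two_rbl_results comes from the description, while its signature, the rule names, the scores, and the firsthop_candidate helper are assumptions made up for illustration.

```python
from typing import Iterable, List, Optional, Set

# Hypothetical rule names and scores, chosen only for this illustration.
SCORES = {
    "RCVD_IN_MAPS_DUL": 2.0,
    "RCVD_IN_OSIRUSOFT_DUL": 2.0,
    "DUL_COUNTER": -2.0,  # hypothetical counter rule with a negative score
}


def check_two_rbl_results(hits: Set[str], first: str, second: str) -> bool:
    """Return True when both named RBL rules matched (assumed signature)."""
    return first in hits and second in hits


def total_score(hits: Iterable[str]) -> float:
    """Sum the scores of the matched rules, then apply the counter rule."""
    matched = set(hits)
    score = sum(SCORES[name] for name in matched)
    if check_two_rbl_results(matched, "RCVD_IN_MAPS_DUL", "RCVD_IN_OSIRUSOFT_DUL"):
        # Both DULs report the same fact, so cancel the duplicate penalty.
        score += SCORES["DUL_COUNTER"]
    return score


def firsthop_candidate(relay_ips: List[str]) -> Optional[str]:
    """One reading of the -firsthop behaviour (an assumption): relay_ips is
    ordered from the originating host onward, and the originating IP is only
    eligible when at least one further relay follows it."""
    return relay_ips[0] if len(relay_ips) >= 2 else None
```

With these made-up scores, a machine listed on both DULs ends up with 2.0 rather than 4.0, and both matches are still recorded individually.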
Created attachment 129 [details] patch file
I don't really like "counteract" type rules. I don't know what works best for the GA, but what I've done for some other rules (future and past dates, relay tests, etc.) is make similar rules not overlap, with the expectation that the spammier versions will get higher scores.

Also, the intent is for machines to be penalized more than once! The more places a machine is reported, the more we can believe that the RBL is correct.

Would it be feasible to separate the DUL tests as follows?

# rules for mail sent through DUL and not relayed through ISP
RCVD_IN_DUL_1 - machine appears in one DUL list
RCVD_IN_DUL_2 - machine appears in two DUL lists
RCVD_IN_DUL_3 - machine appears in three DUL lists
RCVD_IN_DUL_4_MORE - machine appears in four or more DUL lists

RCVD_IN_DUL_ISP_1 - machine appears in one DUL list and is relayed through ISP
RCVD_IN_DUL_ISP_2 - machine appears in two DUL lists and is relayed through ISP
...

A machine tests positive for exactly one of the above rules, or for none; never for more than one. This is very similar to MSG_ID_ADDED_BY_MTA_2 and MSG_ID_ADDED_BY_MTA_3. Assign differing scores to each.

It may be feasible to stop testing DUL lists after the first three positive results, so you could just have the last test be "three or more".
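The non-overlapping scheme proposed above could be sketched roughly as follows, in Python. The function name dul_rule and its parameters are hypothetical. The comment only spells out the ISP variants up to RCVD_IN_DUL_ISP_2, so the ISP names beyond that are extrapolated here by analogy with the non-ISP set and are an assumption.

```python
from typing import Optional


def dul_rule(dul_hits: int, relayed_through_isp: bool) -> Optional[str]:
    """Map a DUL hit count to at most one rule name (hypothetical helper)."""
    if dul_hits <= 0:
        return None  # not on any DUL list: no rule fires
    prefix = "RCVD_IN_DUL_ISP_" if relayed_through_isp else "RCVD_IN_DUL_"
    # Counts of four or more share one bucket. For the ISP variants this
    # bucket name is an extrapolation, not something stated in the comment.
    suffix = str(dul_hits) if dul_hits < 4 else "4_MORE"
    return prefix + suffix
```

Because the function returns a single name or None, a message can never trigger two of these rules at once, which is the property the proposal relies on.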
You talk about the GA, but it's not relevant here: the GA doesn't run against RBLs.

If you look at my code and rules more closely, you'll notice that you don't get penalized twice for being an open relay or a dialup IP (although you can actually have each give you a score of 2, and counteract with just -1 to penalize a bit more). You should be penalized for being a confirmed spammer (127.0.0.6 or 8 on osirusoft) or being on the MAPS RBL (the confirmed spammer list).

However, if you query 3 RBLs and they all tell you that:
- it's a dialup IP
- it's an open relay
do you give a score of 6 right away?

My scheme lets you choose. In case #1, you probably only penalize with 2, no matter how many DULs you're on. In case #2, you can do the same, or give a slightly higher score if you're on 2 or 3 open relay lists, but again, do you really want to mark as spam, outright, a mail that's on 3 open relay RBLs?

My patch is pretty small, and yet it took me more than a week to come up with it. It's not because I can't code; it's because I tried different approaches and gave this a lot of thought. Mind you, what I propose is not infinitely flexible, but it takes care of most cases a lot better than the current code (which is inconsistent).

What you propose with RCVD_IN_DUL_ISP_1 and RCVD_IN_DUL_4_MORE is one of the things I tried to do initially. You'll notice, however, that there are *many* combinations, and it gets non-trivial once you deal with multiple-blacklist RBLs like relays.osirusoft.com or RBL+. Are we going to have

RCVD_IN_2DUL_1RSS_1SPAMIP
RCVD_IN_2DUL_1RSS_2SPAMIP
RCVD_IN_2DUL_2RSS_1SPAMIP
...

If you want to be thorough, it just gets very complex. I think my scheme offers reasonable flexibility while not introducing lots of new complex code.
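To give a feel for how quickly the combined names multiply, here is a small Python sketch that enumerates them. The helper combined_rule_names, the cap on the per-category count, and the choice of three categories are assumptions for illustration; only the naming pattern is taken from the examples above.

```python
from itertools import product
from typing import List, Sequence


def combined_rule_names(categories: Sequence[str], max_count: int) -> List[str]:
    """List every rule name of the form RCVD_IN_<n><CATEGORY>_... where each
    category contributes a count from 0 to max_count (hypothetical helper)."""
    names = []
    for counts in product(range(max_count + 1), repeat=len(categories)):
        if not any(counts):
            continue  # no list matched at all, so no rule is needed
        parts = [f"{n}{cat}" for n, cat in zip(counts, categories) if n]
        names.append("RCVD_IN_" + "_".join(parts))
    return names
```

For k categories capped at n hits each, this yields (n + 1)^k - 1 names, so three categories capped at three hits already need 63 separate rules, each with its own score.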
Oh, another thing I forgot: the other reason I went with my scheme is that you don't get:
Is on 2 DULs and 1 RSS
You get:
Is on MAPS DUL, OSIRUSOFT DUL, and ORBS RSS
You know exactly which RBLs matched, and with which return IP.
Has been merged into #399.
*** This bug has been marked as a duplicate of 399 ***