SA Bugzilla – Bug 3234
RCVD_IN_SBL, RCVD_IN_XBL: missing hits
Last modified: 2004-04-27 15:13:34 UTC
70.464 85.6872 0.6750 0.992 1.00 0.00 __RCVD_IN_SBL_XBL
 0.000  0.0000 0.0000 0.500 0.11 1.00 RCVD_IN_XBL
 9.139 11.0924 0.1830 0.984 0.88 1.27 RCVD_IN_SBL

now I can believe the RCVD_IN_SBL figures, but _XBL seems broken, and the subrule is *way* too high AFAICS.
Subject: Re: New: RCVD_IN_SBL, RCVD_IN_XBL: missing hits

Justin Mason <jm@jmason.org> writes:

> 70.464 85.6872 0.6750 0.992 1.00 0.00 __RCVD_IN_SBL_XBL
>  0.000  0.0000 0.0000 0.500 0.11 1.00 RCVD_IN_XBL
>  9.139 11.0924 0.1830 0.984 0.88 1.27 RCVD_IN_SBL
>
> now I can believe the RCVD_IN_SBL figures, but _XBL seems broken, and
> the subrule is *way* too high AFAICS.

The __RCVD_IN_SBL_XBL results are actually correct. RCVD_IN_XBL is not being hit because spamhaus.org changed the format of the TXT record returned by the SBL-XBL blacklist. It used to include "/xbl", but now it doesn't:

  63.137.169.67.sbl-xbl.spamhaus.org => "http://www.spamhaus.org/query/bl?ip=67.169.137.63"

Our rule looks for (case-insensitive) "/xbl" in the TXT result. That let us use a TXT query to get the informative URLs *and* cover both lists with a single SBL-XBL query, reducing network traffic and processing. SBL still does include "/sbl", so that rule continues to work:

  32.55.114.82.sbl-xbl.spamhaus.org => "http://www.spamhaus.org/SBL/sbl.lasso?query=SBL13063"

I can think of several possible solutions:

1. See if we can get SpamHaus to include "/xbl" (or something similarly definitive) in the TXT result and hope it doesn't change again.
2. Query with type ANY and use TXT if we get it, but use the A for the SBL and XBL rules.
3. Revert to using A queries only, no TXT.
4. Do separate TXT queries for SBL and XBL.

My inclination was option 2, which I proceeded to implement, but before I got too far, I received some weird results from SBL-XBL for an IP in their databases where some of the answers were missing. Worse, I ultimately realized that it's impossible to attach the correct TXT record to the correct A record result so the log entry appears in the right place. If we could reliably recognize the TXT record, then there would be no issue to begin with. I wish I had figured this out before coding myself into an inevitable logical corner (*).

So, we're left with options 1, 3, or 4.
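A minimal sketch of the mismatch described above, written in Python rather than SpamAssassin's Perl. The function names and the two patterns are hypothetical stand-ins for the real subtests, which may differ in detail. It builds the reversed-octet query name and checks the two TXT strings quoted above against "/sbl" and "/xbl":

```python
import re

# Hypothetical stand-ins for the TXT subtests; SpamAssassin's real rules are
# written in Perl and may differ in detail.
SBL_TXT = re.compile(r"/sbl", re.IGNORECASE)
XBL_TXT = re.compile(r"/xbl", re.IGNORECASE)


def dnsbl_query_name(ip: str, zone: str = "sbl-xbl.spamhaus.org") -> str:
    """Reverse the IPv4 octets and append the blacklist zone."""
    return ".".join(reversed(ip.split("."))) + "." + zone


def txt_hits(txt: str) -> dict:
    """Report which TXT-based subtests would fire for one TXT string."""
    return {
        "RCVD_IN_SBL": bool(SBL_TXT.search(txt)),
        "RCVD_IN_XBL": bool(XBL_TXT.search(txt)),
    }


# TXT strings quoted in the comment above.
xbl_txt = "http://www.spamhaus.org/query/bl?ip=67.169.137.63"
sbl_txt = "http://www.spamhaus.org/SBL/sbl.lasso?query=SBL13063"

print(dnsbl_query_name("67.169.137.63"))
print(txt_hits(xbl_txt))  # neither pattern matches the new URL format
print(txt_hits(sbl_txt))  # "/SBL/sbl.lasso" still matches "/sbl"
```

With the new URL format nothing in the TXT string identifies the listing as XBL, which is consistent with RCVD_IN_XBL dropping to zero while the combined subrule keeps hitting.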
Oh, here's the weirdness:

------- start of cut text --------------
$ host -a 122.140.64.218.sbl-xbl.spamhaus.org
Trying "122.140.64.218.sbl-xbl.spamhaus.org"
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 38331
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 10, ADDITIONAL: 0

;; QUESTION SECTION:
;122.140.64.218.sbl-xbl.spamhaus.org. IN ANY

;; ANSWER SECTION:
122.140.64.218.sbl-xbl.spamhaus.org. 2406 IN TXT "http://www.spamhaus.org/query/bl?ip=218.64.140.122"
122.140.64.218.sbl-xbl.spamhaus.org. 2406 IN TXT "http://www.spamhaus.org/SBL/sbl.lasso?query=SBL15322"

;; AUTHORITY SECTION:
sbl-xbl.spamhaus.org. 85206 IN NS n.ns.spamhaus.org.
sbl-xbl.spamhaus.org. 85206 IN NS q.ns.spamhaus.org.
sbl-xbl.spamhaus.org. 85206 IN NS t.ns.spamhaus.org.
sbl-xbl.spamhaus.org. 85206 IN NS w.ns.spamhaus.org.
sbl-xbl.spamhaus.org. 85206 IN NS x.ns.spamhaus.org.
sbl-xbl.spamhaus.org. 85206 IN NS y.ns.spamhaus.org.
sbl-xbl.spamhaus.org. 85206 IN NS z.ns.spamhaus.org.
sbl-xbl.spamhaus.org. 85206 IN NS a.ns.spamhaus.org.
sbl-xbl.spamhaus.org. 85206 IN NS c.ns.spamhaus.org.
sbl-xbl.spamhaus.org. 85206 IN NS e.ns.spamhaus.org.

Received 344 bytes from 127.0.0.1#53 in 102 ms
------- end ----------------------------

and a few minutes later...

------- start of cut text --------------
$ host -a 122.140.64.218.sbl-xbl.spamhaus.org
Trying "122.140.64.218.sbl-xbl.spamhaus.org"
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42032
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 10, ADDITIONAL: 0

;; QUESTION SECTION:
;122.140.64.218.sbl-xbl.spamhaus.org. IN ANY

;; ANSWER SECTION:
122.140.64.218.sbl-xbl.spamhaus.org. 3555 IN A 127.0.0.2
122.140.64.218.sbl-xbl.spamhaus.org. 3555 IN A 127.0.0.4
122.140.64.218.sbl-xbl.spamhaus.org. 2344 IN TXT "http://www.spamhaus.org/query/bl?ip=218.64.140.122"
122.140.64.218.sbl-xbl.spamhaus.org. 2344 IN TXT "http://www.spamhaus.org/SBL/sbl.lasso?query=SBL15322"

;; AUTHORITY SECTION:
sbl-xbl.spamhaus.org. 86355 IN NS n.ns.spamhaus.org.
sbl-xbl.spamhaus.org. 86355 IN NS q.ns.spamhaus.org.
sbl-xbl.spamhaus.org. 86355 IN NS t.ns.spamhaus.org.
sbl-xbl.spamhaus.org. 86355 IN NS w.ns.spamhaus.org.
sbl-xbl.spamhaus.org. 86355 IN NS x.ns.spamhaus.org.
sbl-xbl.spamhaus.org. 86355 IN NS y.ns.spamhaus.org.
sbl-xbl.spamhaus.org. 86355 IN NS z.ns.spamhaus.org.
sbl-xbl.spamhaus.org. 86355 IN NS a.ns.spamhaus.org.
sbl-xbl.spamhaus.org. 86355 IN NS c.ns.spamhaus.org.
sbl-xbl.spamhaus.org. 86355 IN NS e.ns.spamhaus.org.

Received 376 bytes from 127.0.0.1#53 in 91 ms
------- end ----------------------------

(*) We could use a subrule regexp looking for the IP or the "/xbl" string and it would "work" if we changed the logging code to ignore IPs if there is already a log entry and have TXT logs overwrite IP logs, but it's really too horrible to contemplate.
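For illustration, a rough Python sketch of classifying by A record (the idea behind options 2 and 3), run against the two answer sections above. The helper names are hypothetical, and the mapping of 127.0.0.2 to SBL and 127.0.0.4 to XBL is an assumption; the thread shows both codes but does not say which list each belongs to.

```python
# Assumed mapping of A-record return codes to rules. Only the two codes that
# appear in the output above are listed; which code means which list is an
# assumption, not something stated in this thread.
A_CODE_TO_RULE = {
    "127.0.0.2": "RCVD_IN_SBL",
    "127.0.0.4": "RCVD_IN_XBL",
}


def parse_answers(lines):
    """Split 'name ttl IN type rdata' answer lines into A and TXT values."""
    a_records, txt_records = [], []
    for line in lines:
        fields = line.split(None, 4)
        if len(fields) < 5 or fields[2] != "IN":
            continue  # not a resource record line
        rtype, rdata = fields[3], fields[4].strip()
        if rtype == "A":
            a_records.append(rdata)
        elif rtype == "TXT":
            txt_records.append(rdata.strip('"'))
    return a_records, txt_records


def rules_from_a(a_records):
    """Map A-record values to rule names, ignoring unknown codes."""
    return sorted({A_CODE_TO_RULE[a] for a in a_records if a in A_CODE_TO_RULE})


name = "122.140.64.218.sbl-xbl.spamhaus.org."
txt_lines = [
    name + ' 2406 IN TXT "http://www.spamhaus.org/query/bl?ip=218.64.140.122"',
    name + ' 2406 IN TXT "http://www.spamhaus.org/SBL/sbl.lasso?query=SBL15322"',
]
first_answer = txt_lines  # first ANY response: TXT only
second_answer = [         # second ANY response: A records present as well
    name + " 3555 IN A 127.0.0.2",
    name + " 3555 IN A 127.0.0.4",
] + txt_lines

for answer in (first_answer, second_answer):
    a_records, txt_records = parse_answers(answer)
    print(rules_from_a(a_records), len(txt_records))
```

Under these assumptions the first response yields no rule hits at all, since it carries no A records, while the second yields both rules. Nothing in either response ties a particular TXT string to a particular A record, which is the logging problem described above.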
holy crap, that's a good hit-rate then ;) I'd suggest #1 preferred, #3 second-best.
hmm. I think this is fixed, right Dan?
72.228 82.6655 0.5057 0.994 1.00 0.00 __RCVD_IN_SBL_XBL
65.624 75.1177 0.3886 0.995 0.99 1.00 RCVD_IN_XBL
 7.920  9.0556 0.1171 0.987 0.88 1.27 RCVD_IN_SBL

quite fixed