SA Bugzilla – Bug 5421
Please don't use SURBLs to check headers, etc.
Last modified: 2011-05-05 11:47:08 UTC
We seem to be seeing cases where SpamAssassin is resolving header domains and checking them against SURBLs. This has caused some arguable FPs where, for example, a mail server's IP address is on the ph.surbl.org phishing list because the phishers specified the URI that way. It's also possible that *unresolved* header domains are being checked against SURBLs.

While these uses may correctly help identify some minority of spam, they also can and apparently do FP. They're also not a recommended or intended use of the data.

As a side effect some (formerly) compromised mail or web servers are having some difficulty delivering mail. In the big picture this may have some benefits in mitigating or cleaning up exploits, but responding to these issues is not something we'd like to be doing. SURBL does not want to do mitigation or cleaning of compromised servers. It does want to blacklist spammed hosts.

Compounding the issue somewhat is that some of our phishing data sources don't remove sites quickly enough when the phishing sites are gone. Again this causes some FPs when the data are used as described above.

Therefore we recommend that SpamAssassin not use SURBLs to check anything other than message body URI hosts.
hi Jeff -- a sample mail demoing this would really be useful ;) thanks.
That's certainly a reasonable request, but most of the folks reporting these don't have samples, and usually their IPs get removed from the SURBL. (Usually they just say "our mail is getting blocked". We do ask that every removal request include a sample, but it's usually the people in the sending business who have samples. The administrators of minor, cracked systems almost never do, nor do they seem to see the reasons why we might want to see one, which makes some sense since they're generally not professional mail sending services.) So there tend not to be samples, and if there were, they would tend not to hit after we remove the listing. That said, the next one we come across we'll try to get a sample for. Might take a while though.
But as a matter of principle, we feel that our data should only be used to check unresolved URIs, and that unintended and unexpected results can (and do) happen if they're used in other ways. (FWIW many of the cracked IP removal requests seem to come from South America, and some of the administrators seem barely able to communicate, either in English or Spanish.)
As far as I know, we don't do this. The main thing that gets the list of domains to check is:

  my $uris = $scanner->get_uri_detail_list();

and get_uri_detail_list() gets data from body (text-parsed and HTML) URIs. No headers. We really need to see a sample message.
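To illustrate the body-only behaviour described above, here is a minimal Python sketch. It is not SpamAssassin's Perl code, and it makes no claim about how get_uri_detail_list() works internally. The function name body_uri_hosts, the URI regex, and the sample message are all assumptions made up for illustration. It only shows the principle that hosts appearing in headers (such as Received:) never reach the list of names that would be checked against a URI blacklist.

```python
import re
from email import message_from_string
from urllib.parse import urlsplit

# Hypothetical, deliberately simple URI pattern (an assumption, not SpamAssassin's parser).
URI_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def body_uri_hosts(raw_message):
    """Return the set of URI hosts found in text body parts only.

    Headers are parsed by the email module but never searched,
    so header hosts and relay IPs cannot end up in the result.
    """
    msg = message_from_string(raw_message)
    hosts = set()
    for part in msg.walk():
        # Skip multipart containers and non-text attachments.
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        # Assumes the declared charset is one Python knows.
        charset = part.get_content_charset() or "utf-8"
        text = payload.decode(charset, errors="replace")
        for uri in URI_RE.findall(text):
            host = urlsplit(uri).hostname
            if host:
                hosts.add(host.lower())
    return hosts


# Made-up sample: the relay host appears only in the headers.
SAMPLE = (
    "Received: from mail.example.net (mail.example.net [192.0.2.10])\n"
    "From: sender@example.net\n"
    "Subject: test\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "\n"
    "Visit http://phish.example.com/login now.\n"
)

if __name__ == "__main__":
    print(body_uri_hosts(SAMPLE))
```

With this sample, only the host from the body URI is collected; mail.example.net and 192.0.2.10 from the Received: header are not. A caller that fed header hosts into the same lookup would be doing something this sketch, like the default scan path described above, does not.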
No examples in 4 years, close?
Yup I don't think this is a problem.
If the code doesn't pull them out for submission and we have no samples, my only other guess is perhaps a program using SA as an API rather than with the default scanning or spamc/spamd? I agree that closing a bug with no examples years later is the only course of action, though.