Bug 5421 - Please don't use SURBLs to check headers, etc.
Status: RESOLVED WORKSFORME
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: spamassassin
Version: 3.1.8
Hardware: Other All
Importance: P3 normal
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
URL: http://www.surbl.org/implementation.html
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-04-18 00:05 UTC by Jeff Chan
Modified: 2011-05-05 11:47 UTC

Description Jeff Chan 2007-04-18 00:05:07 UTC
We seem to be seeing cases where SpamAssassin is resolving header domains and
checking them against SURBLs.  This has caused some arguable FPs where, for
example, a mail server's IP address is on the ph.surbl.org phishing list due to
the phishers specifying the URI that way.  It's also possible that *unresolved*
header domains are being checked against SURBLs.  While these uses may correctly
help identify some minority of spam, they also can and apparently do FP. 
They're also not a recommended or intended use of the data.

As a side effect some (formerly) compromised mail or web servers are having some
difficulty delivering mail.  In the big picture this may have some benefits in
mitigating or cleaning up exploits, but responding to these issues is not
something we'd like to be doing.  SURBL does not want to do mitigation or
cleaning of compromised servers.  It does want to blacklist spammed hosts. 
Compounding the issue somewhat is that some of our phishing data sources don't
remove sites quickly enough when the phishing sites are gone.  Again this causes
some FPs when the data are used as described above.

Therefore we recommend that SpamAssassin not use SURBLs to check anything other
than message body URI hosts.
Comment 1 Justin Mason 2007-04-18 02:04:48 UTC
hi Jeff -- a sample mail demoing this would really be useful ;)  thanks.
Comment 2 Jeff Chan 2007-04-18 02:56:41 UTC
That's certainly a reasonable request, but most of the folks reporting these
don't have samples, and usually their IPs get removed from the SURBL.  (Usually
they just say "our mail is getting blocked".  We do ask that every removal request
include a sample, but it's usually the people in the sending business who have
samples.  The administrators of minor, cracked systems almost never do, nor do
they seem to see the reasons why we might want to see one, which makes some
sense since they're generally not professional mail sending services.)  So there
tend not to be samples and if there were, they would tend not to hit after we
remove.

That said, the next one we come across we'll try to get a sample for.  Might
take a while though.
Comment 3 Jeff Chan 2007-04-18 03:02:02 UTC
But as a matter of principle, we feel that our data should only be used to check
unresolved URIs, and that unintended and unexpected results can (and do) happen
if they're used in other ways.

(FWIW many of the cracked IP removal requests seem to come from South America, and
some of the administrators seem barely able to communicate, either in English or
Spanish.)
Comment 4 Theo Van Dinter 2007-04-18 10:00:35 UTC
As far as I know, we don't do this.  The main thing that gets the list of
domains to check is:

  my $uris = $scanner->get_uri_detail_list();

and get_uri_detail_list() gets data from body (text-parsed and HTML) URIs.  No
headers.

We really need to see a sample message.
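
For readers following along, the body-only behaviour described above can be sketched roughly as follows. This is an illustration in Python, not SpamAssassin's Perl code: body_uri_hosts and surbl_query_name are hypothetical helper names, the URI regex is deliberately simplistic, and reducing hostnames to their registered domain is omitted. The reversed-octet form for numeric IPs in URIs and the multi.surbl.org zone are assumed from the SURBL implementation guidelines linked in the URL field.

```python
import re
from email import message_from_string

# Hypothetical helpers for illustration only; not SpamAssassin code.
URI_RE = re.compile(r"https?://([A-Za-z0-9.-]+)", re.IGNORECASE)
IPV4_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def body_uri_hosts(raw_message):
    """Return hosts of URIs found in text body parts, never in headers."""
    msg = message_from_string(raw_message)
    hosts = []
    for part in msg.walk():
        # Only text parts are scanned; header fields are never inspected.
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True) or b""
        text = payload.decode("utf-8", errors="replace")
        for host in URI_RE.findall(text):
            host = host.lower().rstrip(".")
            if host not in hosts:
                hosts.append(host)
    return hosts


def surbl_query_name(host, zone="multi.surbl.org"):
    """Build the DNS name to look up, without resolving the host first."""
    if IPV4_RE.match(host):
        # A numeric IP written in a URI is queried with its octets reversed.
        host = ".".join(reversed(host.split(".")))
    return f"{host}.{zone}"
```

With a message whose Received and From headers name a mail server, only the hosts written in the body URIs would be turned into query names. The relay's hostname and IP address are never looked up, which is the behaviour this bug asks for.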
Comment 5 Darxus 2011-05-04 20:57:03 UTC
No examples in 4 years, close?
Comment 6 Henrik Krohns 2011-05-05 07:25:37 UTC
Yup, I don't think this is a problem.
Comment 7 Kevin A. McGrail 2011-05-05 11:47:08 UTC
If the code doesn't pull them out for submission and we have no samples, my only other guess is that some program is using SA as an API rather than through the default scanning or spamc/spamd?

I agree that closing a bug with no examples years later is the only course of action, though.