Bug 728 - Integrating an SMTP-callback (like Mail::CheckUser)
Status: RESOLVED WONTFIX
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: spamassassin
Version: unspecified
Hardware: All other
Importance: P5 enhancement
Target Milestone: ---
Assignee: SpamAssassin Developer Mailing List
Reported: 2002-08-22 14:33 UTC by Patrick von der hagen
Modified: 2002-12-13 20:54 UTC

Description Patrick von der hagen 2002-08-22 14:33:10 UTC
I have very good experience with the smtp-callback-option provided by Exim.
It works like this:
When Exim receives the envelope-from "a@dom.ain" in the smtp-dialogue, it opens
a new smtp-connection to one of the appropriate mx-servers and tries
"mail from: <>" followed by "rcpt to: <a@dom.ain>". If Exim receives a success
code, the mail is accepted; if it receives a 5xx-response, it is permanently
rejected; in case of a 4xx-response, it is temporarily rejected.
The reasoning: if "a" does not exist at "dom.ain", it is most likely a spammer
using a fake email address. If I get a temporary error, it might be a
spam-account which hit its quota due to too many error-messages. Some big
domains like "yahoo.com" will even tell you "5xx - deactivated due to abuse" or
something like that.
In my experience it works very well with Exim and without too much overhead.
Sadly, there are sometimes badly configured domains, or mx-servers that are not
up-to-date and give wrong answers. This is why I don't want to reject such
mails but would prefer a test (and scoring) done by SpamAssassin.
Using SpamAssassin, even From, Sender and Reply-To might be evaluated. There
will be performance-issues, no doubt about that, that might impose limitations
for big sites, but for small installations or personal use those
performance-issues should not be too bad. Like razor and realtime-blacklists a
callback-check should of course be made optional.

In my eyes a "perfect" solution would be to test for envelope-sender,
Header-Sender, Header-From and Header-Reply-To, assigning different scores for
permanent or temporary rejection. If the error-strings could be checked it might
even be better. However, due to performance-issues, one test on envelope-sender
or Header-From should usually be sufficient.
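
A minimal sketch of the scoring idea above, not taken from SpamAssassin or Exim: it maps the reply code a callback receives for "rcpt to" onto the three outcomes described (accepted, temporary rejection, permanent rejection) and looks up a score for each. The helper names classify_callback_reply and callback_score are hypothetical, and the score values are placeholders that would have to come from corpus testing.

```perl
use strict;
use warnings;

# Hypothetical helper: classify the SMTP reply code returned for RCPT TO
# during a callback. Only the first digit (2xx/4xx/5xx) is considered.
sub classify_callback_reply {
    my ($code) = @_;
    return 'unknown' unless defined $code && $code =~ /^([245])\d\d$/;
    return 'accepted'  if $1 eq '2';
    return 'temporary' if $1 eq '4';
    return 'permanent';
}

# Placeholder scores (assumption): not real SpamAssassin values.
my %score_for = (
    accepted  => 0,
    temporary => 0.5,
    permanent => 2.0,
    unknown   => 0,
);

# Hypothetical helper: score for a given reply code.
sub callback_score {
    my ($code) = @_;
    return $score_for{ classify_callback_reply($code) };
}
```

For example, callback_score(550) yields the "permanent" score and callback_score(451) the "temporary" one; anything that is not a 2xx, 4xx or 5xx code is treated as "unknown" and scores nothing.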
Comment 1 Patrick von der hagen 2002-08-28 11:58:34 UTC
Well, I had hoped for some more reaction to my proposal.

However, I modified "EvalTests.pm" to perform an SMTP-callback-check, using
Mail::CheckUser.
Basically it's just

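use Mail::CheckUser qw(check_email);

sub smtp_callback_from {
  my ($self) = @_;
  my $from = $self->get ('From:addr');
  # check_email() returns true when the address passes verification,
  # so the test fires (returns 1) only when verification fails.
  return check_email($from) ? 0 : 1;
}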

I assigned a Score in "50_scores.cf" and set it up in 20_head_tests.cf

Is there anything else I should do? Are you interested in a patch?
Load seems to be no issue on my system. The CheckUser-Module has a
standard-timeout of 60s, so right now messages stay up to a minute longer in the
queue than before. But that can of course be modified.
Comment 2 Daniel Quinlan 2002-08-28 14:25:32 UTC
Well, Justin did comment very positively on the mailing list.

Anyway, I think it's a great idea, but I do have some concerns.  Aside from
the performance issues, I am concerned that this test could be used by spammers
to verify email address reception.  Also, they could even mistake an attempted
SMTP connection as interest!

If you're a spammer, all you have to do is assign a unique From:addr to each
spam recipient.  If that address is ever used, then you know someone or
something is on the other end.  Anti-spam people basically do the same exact
thing with honeypots.

I've tried to think of a way for a proxy server to perform the test and if the
same address is used > N times by different users, you could assume that it was
not a unique address, but I think spammers could actually use such a server to
perform address verification!  Argh.

Maybe it's worth the tradeoff, though.  My address is already in many spammer
email address lists anyway.

Comment 3 Patrick von der hagen 2002-08-28 14:49:31 UTC
Hmmm, I would not worry about address-verification too much. Usually spammers try
to be as anonymous as possible: they forge headers, use non-existing addresses,
etc. So basically they can't set up a domain, set up mx-records and receive (and
analyse) email, since annoyed recipients could kick ass (since a domain has to
be registered, you have a name and an address to sue).

Regarding performance: I'm not an expert, but I believe it should not require
much more computing power than querying two or three RBLs. You will probably get
more tests resulting in timeout but that value can of course be adjusted. If it
really causes problems one could try to query only "the big freemail ones".
Yahoo, MSN, Hotmail etc. are often abused, but in my experience they are quite
fast in disabling accounts (and thus making a callback work) and have enough
well-administered mx-servers that callbacks are handled very fast, almost never
resulting in timeouts.
Comment 4 Justin Mason 2002-08-28 15:18:29 UTC
good points Dan.

BTW another point: a *lot* of spam nowadays comes from addrs@hotmail.com,
@yahoo.com and other freemail providers, to avoid the "test for valid
return MX" tests many MTAs (and SA) do.

There's 2 sides to this:

1. Dan's suggested issue is not a problem when the spammer does this,
at least. good news.

2. those systems may not allow username verification until the DATA stage
is finished -- at which stage a mail has been sent ;)

In fact, the latter may be the case for a lot of mailservers out there,
especially ones belonging to paranoid antispam folks.

hmm... that could be a problem...  it could be that this action will
be considered quite anti-social on today's internet. :(

Comment 5 Justin Mason 2002-09-03 09:58:53 UTC
http://www.uwsg.iu.edu/hypermail/linux/kernel/0209.0/0340.html
is a mail to l-k from Marc Merlin (who's an SA user too ;)
detailing sf.net's back-probing setup.

http://www.uwsg.iu.edu/hypermail/linux/kernel/0209.0/0308.html
is a mail from vger's admin giving out about these probes.

If SA *is* to include this code, it needs to be pretty well
cached to avoid the latter problem.
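
A minimal sketch of what such caching could look like, assuming a simple in-memory hash with a time-to-live: the names cached_callback, %callback_cache and $CACHE_TTL are hypothetical, the one-day lifetime is an arbitrary placeholder, and the actual probe is passed in as a code reference so the sketch does not depend on any particular SMTP module. Lower-casing the whole address for the cache key is a simplification.

```perl
use strict;
use warnings;

# Hypothetical cache: address => { result => ..., expires => epoch seconds }
my %callback_cache;
my $CACHE_TTL = 24 * 60 * 60;    # placeholder lifetime of one day (assumption)

# Return the cached result for $addr if it is still fresh; otherwise run
# $probe (a code ref that performs the real callback) and remember its result.
# $now is optional and defaults to the current time.
sub cached_callback {
    my ($addr, $probe, $now) = @_;
    $now = time unless defined $now;
    my $key = lc $addr;          # simplification: case-insensitive key

    my $hit = $callback_cache{$key};
    return $hit->{result} if $hit && $hit->{expires} > $now;

    my $result = $probe->($addr);
    $callback_cache{$key} = { result => $result, expires => $now + $CACHE_TTL };
    return $result;
}
```

With this in place, repeated mail from the same address triggers at most one probe per lifetime, which is the property that would keep the remote MX from being hit once per message. A real deployment would need the cache to persist across processes.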
Comment 6 Daniel Quinlan 2002-09-06 13:18:48 UTC
[ some comments from a related email ]

Mario Salzer <mario17@web.de> writes:

> Hi everybody,
> 
> I know there is an eval:check_for_from_mx() in the current rules, but
> I didn't find a sub utilizing SMTP to check if a sender's address
> really exists.

Unfortunately, most mail servers are configured to lie and say all
addresses exist.  However, it may be the case that only clueful
postmasters configure things that way, so it may still be a useful
test.

Have a look at this Bugzilla ticket.  Someone needs to submit a full
patch that can be tested on a corpus.

  http://bugzilla.spamassassin.org/show_bug.cgi?id=728

The test should really be a part of check_for_from_mx().  Do the test
inside that routine if there is a MX/A record and set a variable to
true if it passes (and set it to false if there is no MX/A record or
the callback fails).  Then, just have a second function that just
checks for that variable.  (There are several tests that work that
way, for example, the MIME attachment tests and the HTML percentage
tests.)
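
A minimal sketch of that two-function arrangement, under stated assumptions: the message object is a plain hash reference, the DNS lookup and the SMTP probe are injected as code references (mx_lookup and probe, both hypothetical stand-ins), and check_for_from_callback and the callback_ok field are hypothetical names. It does not reproduce the real check_for_from_mx() in EvalTests.pm.

```perl
use strict;
use warnings;

# First test: fires (returns 1) when the From domain has no MX/A record.
# As a side effect it records whether the callback passed.
sub check_for_from_mx {
    my ($self) = @_;
    my $addr   = $self->{from_addr};
    my $has_mx = $self->{mx_lookup}->($addr);   # stand-in for the DNS lookup

    # True only if there is an MX/A record AND the callback succeeds;
    # the probe is skipped entirely when there is no record.
    $self->{callback_ok} = ($has_mx && $self->{probe}->($addr)) ? 1 : 0;

    return $has_mx ? 0 : 1;
}

# Second test: only inspects the stored variable, running the first
# test once if it has not been run yet for this message.
sub check_for_from_callback {
    my ($self) = @_;
    check_for_from_mx($self) unless exists $self->{callback_ok};
    return $self->{callback_ok} ? 0 : 1;
}
```

The point of the split is that the expensive work happens once per message, while each rule can still carry its own score.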
Comment 7 Duncan Findlay 2002-12-13 19:46:34 UTC
Anyone want to actually _do_ something about this enhancement request? Or do we
mark it WONTFIX?
Comment 8 Justin Mason 2002-12-14 05:54:09 UTC
I think WONTFIX is appropriate.  I reckon the load it
places on mail sending MXes would be unappreciated...