SA Bugzilla – Bug 728
Integrating an SMTP-callback (like Mail::CheckUser)
Last modified: 2002-12-13 20:54:09 UTC
I have had very good experience with the SMTP callback option provided by Exim. It works like this: when Exim receives the envelope-from "a@dom.ain" in the SMTP dialogue, it opens a new SMTP connection to one of the appropriate MX servers and tries "mail from: <>" followed by "rcpt to: <a@dom.ain>". If Exim receives a success code, the mail is accepted; if it receives a 5xx response, the mail is permanently rejected; on a 4xx response, it is temporarily rejected.

The reasoning: if "a" does not exist at "dom.ain", the sender is most likely a spammer using a fake email address. If I get a temporary error, it might be a spam account which hit its quota due to too many error messages. Some big domains like "yahoo.com" will even tell you "5xx - deactivated due to abuse" or something like that.

In my experience it works very well with Exim and without too much overhead. Sadly, some domains are badly configured, or their MX servers are not up to date and give wrong answers. This is why I don't want to reject such mails but would prefer a test (and scoring) done by SpamAssassin. Using SpamAssassin, even From, Sender and Reply-To could be evaluated.

There will be performance issues, no doubt about that, and they might impose limitations for big sites, but for small installations or personal use they should not be too bad. Like Razor and realtime blacklists, a callback check should of course be optional.

In my eyes a "perfect" solution would test the envelope sender, header Sender, header From and header Reply-To, assigning different scores for permanent and temporary rejection. If the error strings could be checked, it might be even better. However, due to the performance issues, one test on the envelope sender or header From should usually be sufficient.
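As a rough sketch of the decision rule described above (accept on success, permanent reject on 5xx, temporary reject on 4xx), the reply to the callback's "rcpt to" could be classified like this. The helper name callback_verdict and the verdict strings are made up for illustration; this is not Exim's or SpamAssassin's code.

```perl
use strict;
use warnings;

# Hypothetical helper: map the first line of the reply to the callback's
# "rcpt to" onto a verdict. Anything that is not a 2xx/4xx/5xx reply is
# reported as 'unknown' so that a caller can decline to score it.
sub callback_verdict {
    my ($reply) = @_;
    return 'unknown' unless defined $reply;

    # An SMTP reply starts with a three-digit code, followed by a space,
    # a hyphen (continuation line) or the end of the line.
    my ($class) = $reply =~ /^([245])\d\d(?:[ -]|$)/;
    return 'unknown' unless defined $class;

    my %verdict = (2 => 'accept', 4 => 'defer', 5 => 'reject');
    return $verdict{$class};
}

print callback_verdict('250 OK'), "\n";                  # prints "accept"
print callback_verdict('550 5.1.1 User unknown'), "\n";  # prints "reject"
```

Inside SpamAssassin the 'reject' and 'defer' verdicts would presumably feed two separately scored rules rather than an accept/reject decision.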
Well, I had hoped for some more reaction to my proposal. However, I modified "EvalTests.pm" to perform an SMTP callback check, using Mail::CheckUser. Basically it's just

    use Mail::CheckUser qw(check_email);

    # Returns 1 (rule hits) when the callback cannot verify the From: address.
    sub smtp_callback_from {
      my ($self) = @_;
      my $from = $self->get('From:addr');
      if (!check_email($from)) { return 1; }
      return 0;
    }

I assigned a score in "50_scores.cf" and set the test up in "20_head_tests.cf". Is there anything else I should do? Are you interested in a patch?

Load seems to be no issue on my system. The CheckUser module has a default timeout of 60s, so right now messages stay in the queue up to a minute longer than before. But that can of course be modified.
Well, Justin did comment very positively on the mailing list. Anyway, I think it's a great idea, but I do have some concerns.

Aside from the performance issues, I am concerned that this test could be used by spammers to verify that an email address receives mail. They could even mistake an attempted SMTP connection for interest! If you're a spammer, all you have to do is assign a unique From:addr to each spam recipient. If that address is ever probed, then you know someone or something is on the other end. Anti-spam people basically do the same exact thing with honeypots.

I've tried to think of a way for a proxy server to perform the test, so that if the same address is checked > N times by different users, you could assume it was not a unique address. But I think spammers could actually use such a server to perform address verification! Argh.

Maybe it's worth the tradeoff, though. My address is already in many spammer email address lists anyway.
Hmmm, I would not worry about address verification too much. Usually spammers try to be as anonymous as possible: they forge headers, use non-existent addresses, etc. So basically they can't set up a domain, set up MX records and receive (and analyse) email, since annoyed recipients could kick ass (a domain has to be registered, so you have a name and an address to sue).

Regarding performance: I'm not an expert, but I believe it should not require much more computing power than querying two or three RBLs. You will probably get more tests resulting in a timeout, but that value can of course be adjusted. If it really causes problems, one could try to query only "the big freemail ones". Yahoo, MSN, Hotmail etc. are often abused, but in my experience they are quite fast in disabling accounts (thus making a callback work) and have enough well-administered MX servers that callbacks are handled very fast, almost never resulting in timeouts.
Good points, Dan. BTW, another point: a *lot* of spam nowadays comes from addrs@hotmail.com, @yahoo.com and other freemail providers, to avoid the "test for valid return MX" tests many MTAs (and SA) do. There are two sides to this:

1. Dan's suggested issue is not a problem when the spammer does this, at least. Good news.

2. Those systems may not allow username verification until the DATA stage is finished -- at which stage a mail has been sent ;)

In fact, the latter may be the case for a lot of mailservers out there, especially ones belonging to paranoid antispam folks. Hmm... that could be a problem: this action could be considered quite anti-social on today's internet. :(
http://www.uwsg.iu.edu/hypermail/linux/kernel/0209.0/0340.html is a mail to l-k from Marc Merlin (who's an SA user too ;) detailing sf.net's back-probing setup. http://www.uwsg.iu.edu/hypermail/linux/kernel/0209.0/0308.html is a mail from vger's admin giving out about these probes. If SA *is* to include this code, it needs to be pretty well cached to avoid the latter problem.
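To make the caching point concrete, here is a minimal in-memory sketch, assuming a one-day lifetime per result and a probe passed in as a code reference. The names cached_verdict, %cache and $TTL are hypothetical; a real setup would need a store shared between processes, and a sensible lifetime is an open question.

```perl
use strict;
use warnings;

my %cache;                 # address => [verdict, expiry time in epoch seconds]
my $TTL = 24 * 60 * 60;    # assumed lifetime of a cached result: one day

# Return the cached verdict for $addr, calling $probe (a code reference that
# performs the actual SMTP callback) only when no unexpired entry exists.
# $now is optional and only there to make the sketch easy to exercise.
sub cached_verdict {
    my ($addr, $probe, $now) = @_;
    $now = time unless defined $now;

    my $hit = $cache{$addr};
    return $hit->[0] if $hit && $hit->[1] > $now;

    my $verdict = $probe->($addr);
    $cache{$addr} = [$verdict, $now + $TTL];
    return $verdict;
}
```

With this in place, a burst of mail claiming the same sender costs the remote MX one probe per day instead of one probe per message.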
[ some comments from a related email ]

Mario Salzer <mario17@web.de> writes:
> Hi everybody,
>
> I know there is a eval:check_for_from_mx() in current rules, but
> I didn't find a sub utilizing SMTP to check if a sender's address
> really exists.

Unfortunately, most mail servers are configured to lie and say all addresses exist. However, it may be the case that only clueful postmasters configure things that way, so it may still be a useful test.

Have a look at this Bugzilla ticket. Someone needs to submit a full patch that can be tested on a corpus.

http://bugzilla.spamassassin.org/show_bug.cgi?id=728

The test should really be a part of check_for_from_mx(). Do the test inside that routine if there is an MX/A record and set a variable to true if it passes (and set it to false if there is no MX/A record or the callback fails). Then have a second function that just checks that variable. (There are several tests that work that way, for example, the MIME attachment tests and the HTML percentage tests.)
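The two-function arrangement suggested above might look roughly like the following. This is only a sketch: the sub names ending in _sketch, the from_callback_ok key, and passing the DNS result and the probe in as arguments are all assumptions made for illustration, and the real check_for_from_mx() is not reproduced here.

```perl
use strict;
use warnings;

# First test: stands in for the MX/A check. $has_mx is the (assumed) outcome
# of the DNS lookup; $callback is a code reference doing the SMTP probe.
# The probe only runs when there is an MX/A record, and its result is
# remembered on the object for the second test.
sub check_for_from_mx_sketch {
    my ($self, $has_mx, $callback) = @_;
    $self->{from_callback_ok} = ($has_mx && $callback->()) ? 1 : 0;
    return $has_mx ? 0 : 1;    # assumed: rule hits when there is no MX/A record
}

# Second test: does no network work, it only reads the stored variable.
sub check_for_from_callback_failed_sketch {
    my ($self) = @_;
    return 0 unless exists $self->{from_callback_ok};   # first test never ran
    return $self->{from_callback_ok} ? 0 : 1;
}
```

The point of the split is that the expensive lookup and probe happen once per message, while each rule can still carry its own score.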
Anyone want to actually _do_ something about this enhancement request? Or do we mark it WONTFIX?
I think WONTFIX is appropriate. I reckon the load it places on mail sending MXes would be unappreciated...