SA Bugzilla – Bug 3530
ALL_TRUSTED false positive
Last modified: 2004-10-03 06:41:35 UTC
to be attached momentarily
Created attachment 2058 [details] ok, so it's not exactly low scoring spam
Received: from mail (helo=afzhg1892.com) by proton.pathname.com with local-smtp (Exim 3.35 #1 (Debian)) id 1BCDmr-0006WP-00 for <quinlan@proton.pathname.com>; Sat, 10 Apr 2004 01:24:26 -0700
What's "local-smtp"? Seems very weird...
moving accuracy and some bugs to 3.1.0 milestone
more accuracy and performance bugs going to 3.1.0 milestone
With 3.0.0-rc5, I'm still seeing ALL_TRUSTED coming up when it's clear that this isn't the case. Mostly, what happens is that SA fails to detect _any_ relays, so num_relays_trusted == num_relays_untrusted == 0. Anyhow, the current logic says that ALL_TRUSTED is set unless num_untrusted > 0. I believe that this is flawed and that we should count the total received lines and not flag ALL_TRUSTED unless we can definitely say that we have recognised all the received headers? i.e. that num_relays_received == num_relays_trusted and num_relays_untrusted == 0? What do you think? Example spam attached that flags ALL_TRUSTED (IMHO) incorrectly. Cheers, Al.
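For comparison, the current rule and the stricter rule proposed in the comment above can be written side by side. This is only a minimal sketch in Python, not SpamAssassin's actual code: the `RelayCounts` class and both function names are hypothetical, and the `received` field assumes SA could report the total number of Received lines it saw.

```python
from dataclasses import dataclass


@dataclass
class RelayCounts:
    """Hypothetical container for the counters discussed in this bug."""
    received: int   # total Received lines seen (assumed to be available)
    trusted: int    # relays parsed and classified as trusted
    untrusted: int  # relays parsed and classified as untrusted


def all_trusted_current(c: RelayCounts) -> bool:
    # Rule as described in the comment: set unless an untrusted relay was found.
    return c.untrusted == 0


def all_trusted_strict(c: RelayCounts) -> bool:
    # Proposed rule: every Received line must have been recognised as trusted.
    return c.received == c.trusted and c.untrusted == 0


# The failure case from the report: three Received lines, none of them parsed.
unparsed = RelayCounts(received=3, trusted=0, untrusted=0)
print(all_trusted_current(unparsed), all_trusted_strict(unparsed))
```

With nothing parsed, the current rule fires and the strict rule does not, which is the difference the report is about.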
Created attachment 2348 [details] Spam headers
Subject: Re: ALL_TRUSTED false positive
> Anyhow, the current logic says that ALL_TRUSTED is set unless
> num_untrusted > 0. I believe that this is flawed and that we
> should count the total received lines and not flag ALL_TRUSTED
> unless we can definitely say that we have recognised all the
> received headers? i.e. that num_relays_received ==
> num_relays_trusted and num_relays_untrusted == 0?
>
> What do you think?
unfortunately many things produce "noise" Received lines. e.g. check these from this msg's headers:
Received: (qmail 38657 invoked by uid 99); 15 Sep 2004 20:25:10 -0000
Received: (qmail 38670 invoked by uid 500); 15 Sep 2004 20:25:10 -0000
Received: (qmail 80326 invoked from network); 15 Sep 2004 20:25:13 -0000
Received: from localhost [127.0.0.1] by localhost with IMAP (fetchmail-6.2.5) for jm@localhost (single-drop); Thu, 16 Sep 2004 03:00:17 -0700 (PDT)
all are not relay lines and should be ignored, and this proposed algorithm would work incorrectly in their presence.
but fundamentally our code should be able to parse any reasonable Received line; ALL_TRUSTED is a good way to flag error cases where we can't.
--j.
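To make the "noise" point concrete, here is a rough Python sketch that separates the four header values quoted above from real relay lines. The two patterns are illustrative guesses written for these examples only; they are not the patterns SpamAssassin uses, and `is_noise` is a hypothetical helper.

```python
import re

# Illustrative patterns only, covering the examples quoted in this comment.
NOISE_PATTERNS = [
    # qmail bookkeeping lines: "(qmail <pid> invoked by uid <n>)" / "invoked from network"
    re.compile(r"^\(qmail \d+ invoked (?:by uid \d+|from network)\)"),
    # fetchmail pulling mail over IMAP: a local pickup, not an SMTP hop
    re.compile(r"\bwith IMAP \(fetchmail-[\d.]+\)"),
]


def is_noise(received_value: str) -> bool:
    """Return True if the Received value matches one of the noise patterns."""
    return any(p.search(received_value) for p in NOISE_PATTERNS)


headers = [
    "(qmail 38657 invoked by uid 99); 15 Sep 2004 20:25:10 -0000",
    "(qmail 38670 invoked by uid 500); 15 Sep 2004 20:25:10 -0000",
    "(qmail 80326 invoked from network); 15 Sep 2004 20:25:13 -0000",
    "from localhost [127.0.0.1] by localhost with IMAP (fetchmail-6.2.5) "
    "for jm@localhost (single-drop); Thu, 16 Sep 2004 03:00:17 -0700 (PDT)",
]

# Only lines that are not noise should be counted as relay handovers.
relay_lines = [h for h in headers if not is_noise(h)]
print(len(headers), len(relay_lines))
```

A rule that compares the raw count of Received lines against the number of trusted relays would see four lines and zero relays here, so it would never fire for this message even though nothing untrusted is involved.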
Yes, I can see that my proposal is flawed in this respect. I wonder if it's a good thing to have ALL_TRUSTED trigger when we haven't been able to parse any of the Received lines. It just seems to me to be an unsafe assumption.
How about the following: ALL_TRUSTED is flagged where (num_received_untrusted == 0 && num_received_trusted > 0). Maybe that would better cover the cases that I see, where ALL_TRUSTED is triggered only because SA isn't able to parse any of the headers at all. Maybe it would also be good to add the corner case where SA is called and there just aren't any Received lines at all, for whatever reason.
If you agree that SA should be able to parse headers that a respectable MTA would add, then this might be a solution to the issue, i.e. that I'm seeing a fair amount of spam coming through where ALL_TRUSTED is flagged when it really shouldn't be. What do you think?
Cheers, Al.
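The revised condition above is small enough to check by hand. A minimal Python sketch follows, reusing the counter names from the comment; the function name `all_trusted_revised` is hypothetical and this is not how SA implements the rule.

```python
def all_trusted_revised(num_received_trusted: int, num_received_untrusted: int) -> bool:
    # Fire only when at least one relay was parsed as trusted and none as untrusted.
    return num_received_untrusted == 0 and num_received_trusted > 0


# (trusted, untrusted) combinations of interest
cases = {
    "nothing parsed": (0, 0),
    "two trusted relays": (2, 0),
    "one untrusted relay": (2, 1),
}
for label, (trusted, untrusted) in cases.items():
    print(label, all_trusted_revised(trusted, untrusted))
```

The "nothing parsed" row is where this differs from the current behaviour: with zero trusted and zero untrusted relays the rule stays silent instead of firing.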
'I wonder if it's a good thing to have ALL_TRUSTED trigger when we haven't been able to parse any of the Received lines. It just seems to me to be an unsafe assumption.'
A mail sent and received on the same machine using qmail would fit this description. It should hit ALL_TRUSTED by definition.
I can see your point: there are also cases where the current behaviour is correct. How do you think we can improve the behaviour of ALL_TRUSTED? Maybe we need to work on the parsing of the Received lines. It strikes me that there could be some improvement there, which would then let ALL_TRUSTED show up when it should, and not when it shouldn't. Most of the spam that makes it past my "definitely-spam" mailbox and into my "possibly-spam" mailbox is there because it gets a big negative score from ALL_TRUSTED. This wouldn't be so bad, except that in the vast majority of these cases, ALL_TRUSTED is flagged when it shouldn't be.
Subject: Re: ALL_TRUSTED false positive
So an unrecognised handover should count as untrusted rather than being ignored? Maybe with some check for only bracketed terms as opposed to completely unrecognised. Are there any MTAs that don't put a "Received: ... from ... by" header with the incoming host details?
Nick
Chiming in with Al, I've got some folks who've talked to me about their SA 3 setups, and everything works great... except ALL_TRUSTED is firing on nearly everything. I see both sides that have been expressed, but it looks to me like this can be a very dangerous rule in some environments.
AFAIK, ALL_TRUSTED fires when it can't find the client IP in the Received: header. I had this problem as well because SpamAssassin couldn't parse the correct info from the Received header on my XMail server. Since I added a new Perl regexp to give it the correct info, I've not had this problem. In short, if it can't find an IP, it assumes you didn't put one in the Received header because it came from an internal source... hence trusted.
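The behaviour described above can be illustrated with a rough Python sketch that looks for a bracketed IPv4 address in the "from" part of a Received value. The regex and the `client_ip` helper are assumptions made for illustration and are far simpler than SA's real parser; the first sample header is invented (192.0.2.10 is a documentation address), the second is the Exim header from this bug.

```python
import re
from typing import Optional

# Assumed format: an IPv4 address wrapped in [] or () somewhere before " by ".
IPV4_IN_BRACKETS = re.compile(r"[\[(](\d{1,3}(?:\.\d{1,3}){3})[\])]")


def client_ip(received_value: str) -> Optional[str]:
    """Return the client IP from the 'from ...' part, or None if none is found."""
    from_part = received_value.split(" by ", 1)[0]
    match = IPV4_IN_BRACKETS.search(from_part)
    return match.group(1) if match else None


# Invented example of a header that carries the client IP.
with_ip = "from mx.example.net (mx.example.net [192.0.2.10]) by mail.example.org with ESMTP"
# Header from this bug report: no IP address anywhere in the 'from' part.
without_ip = ("from mail (helo=afzhg1892.com) by proton.pathname.com "
              "with local-smtp (Exim 3.35 #1 (Debian)) id 1BCDmr-0006WP-00")

print(client_ip(with_ip), client_ip(without_ip))
```

If a header that yields no IP is simply skipped, as this comment suggests happens, it never adds to the untrusted count, and a rule of the form "no untrusted relays" fires by default.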
TBH, I was expecting to hear some bug reports of SA not parsing some people's Received headers correctly. Being able to parse those headers is essential for a number of other tests too: DNSBL lookups, SPF, future IP/HELO-based rules. Anyone seeing that behaviour should feel free to open a bug so we can get them all fixed in 3.0.1 ;)
Subject: Re: ALL_TRUSTED false positive
We should not be trusting headers because of a failure to parse. That's the greater bug, I think.
Al -- that message is parsed correctly in 3.1.0, so that's fixed afaics.
Daniel -- the Received hdr from your mail is:
Received: from mail (helo=afzhg1892.com) by proton.pathname.com with local-smtp (Exim 3.35 #1 (Debian)) id 1BCDmr-0006WP-00 for <quinlan@proton.pathname.com>; Sat, 10 Apr 2004 01:24:26 -0700
I don't think we can do anything with this. Where's the IP address?! (In fact, I'd go looking to see what this "local-smtp" deal is on proton.)
So WORKSFORME.