SA Bugzilla – Bug 913
INVALID_MSGID low-performing rule pruned
Last modified: 2002-09-24 18:37:24 UTC
Removed INVALID_MSGID from HEAD cvs.

hit frequencies:

  OVERALL%  SPAM%  NONSPAM%  S/O   RANK  SCORE  NAME
  4.745     8.776  3.958     0.69  0.37  0.00   INVALID_MSGID

test code from all files in rules dir:

  header INVALID_MSGID Message-Id !~ /^<(?:[a-zA-Z0-9.!\#$%&'*\+\/=?\^_{}|~-]+|\".+\")\@(?:[a-zA-Z0-9.-]+|\[\d{1,3}(?:\.\d{1,3}){3}\])>(?:\s*\(.*\))?\s*$/ [if-unset: <NO@MSGID>]
  describe INVALID_MSGID Message-Id is not valid, according to RFC 2822
  lang de describe INVALID_MSGID Message-Id ist laut RFC-2822 nicht gueltig
  lang es describe INVALID_MSGID Message-Id no válido, de acuerdo al RFC-2822
  lang fr describe INVALID_MSGID L'entête Message-ID: ne suit pas la norme RFC-2822
  lang pl describe INVALID_MSGID Message-Id jest nie zgodne ze standardem RFC2822

If you want to re-add this test to SpamAssassin, please follow up this bug entry, improving the code until the S/O ratio goes above 0.7 (or below 0.3 for nice tests).

(automated submission)
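For reference, the S/O figure in the hit frequencies above can be reproduced from the SPAM% and NONSPAM% columns. The following is a minimal Python sketch, assuming S/O is SPAM% / (SPAM% + NONSPAM%). That formula is not stated in this bug; it is an assumption that happens to reproduce the 0.69 shown. The function names are made up for illustration.

```python
def s_o_ratio(spam_pct, nonspam_pct):
    """Share of a rule's hits that land on spam (assumed S/O definition)."""
    total = spam_pct + nonspam_pct
    if total == 0:
        raise ValueError("rule has no hits, S/O is undefined")
    return spam_pct / total


def passes_threshold(ratio, high=0.7, low=0.3):
    """Thresholds quoted in this bug: above 0.7, or below 0.3 for nice tests."""
    return ratio > high or ratio < low


ratio = s_o_ratio(8.776, 3.958)
print(round(ratio, 2))          # 0.69
print(passes_threshold(ratio))  # False
```

Under that assumption the rule sits just under the 0.7 cut-off, which is consistent with it having been pruned.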
Info from bug 824:

--------------------------------------------------------------------------------

Unfortunately, backslashing the dollar sign kills the effectiveness of the rule. Before just fixing this, someone with a large corpus should really try to figure out exactly what is going on and which characters are really allowed. Only mess with this in HEAD, I think.

------- Additional Comments From Albert Meltzer 2002-09-11 10:50 -------

The fix seems to be as follows: change

  [a-zA-Z0-9.!\#$%&'*\+\/=?\^_{}|~-]

to

  [-a-zA-Z0-9.!\#%&'*\+\/=?\^_{}|~$]

'$' only seems to work when at the end of the set; however, "~-$" would be considered a range, and so '-' is moved to the front. I tested this with 5.6.1.

------- Additional Comments From Daniel Quinlan 2002-09-11 12:06 -------

Subject: Re: INVALID_MSGID - dollar sign in rule is not backslashed

> '$' only seems to work when at the end of the set; however, "~-$" would
> be considered a range, and so '-' is moved to the front. I tested
> this with 5.6.1.

You can just backslash the $. That works fine.
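As a sanity check on the character classes discussed in these comments, here is a small Python sketch that compares the original class, the reordered class, and a variant with the dollar sign backslashed. Note the limits: Python's `re` module does not interpolate variables, so this cannot reproduce the Perl-specific behaviour described in this thread. It only checks that the three spellings describe the same set of characters, and that "~-$" is rejected as a range. The names `ORIGINAL`, `REORDERED`, `ESCAPED`, `accepted` and `is_bad_range` are illustrative only.

```python
import re

ORIGINAL = r"[a-zA-Z0-9.!\#$%&'*\+\/=?\^_{}|~-]"    # class as in the pruned rule
REORDERED = r"[-a-zA-Z0-9.!\#%&'*\+\/=?\^_{}|~$]"   # '-' first, '$' last
ESCAPED = r"[a-zA-Z0-9.!\#\$%&'*\+\/=?\^_{}|~-]"    # '$' backslashed in place


def accepted(char_class):
    """Printable ASCII characters matched by a one-character class."""
    pattern = re.compile(char_class)
    return {c for c in map(chr, range(32, 127)) if pattern.fullmatch(c)}


def is_bad_range(char_class):
    """True if the regex engine refuses to compile the class."""
    try:
        re.compile(char_class)
    except re.error:
        return True
    return False


print(accepted(ORIGINAL) == accepted(REORDERED) == accepted(ESCAPED))  # True
print(is_bad_range(r"[~-$]"))                                          # True
```

So, at least as far as the set of accepted characters goes, moving the characters around and backslashing the `$` are equivalent; the difference between them in Perl comes from how the pattern is parsed, not from what the class is meant to match.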
*** Bug 824 has been marked as a duplicate of this bug. ***
These characters are allowed in Message-Ids, according to RFC 2822:

  atext = ALPHA / DIGIT / ; Any character except controls,
          "!" / "#" /     ;  SP, and specials.
          "$" / "%" /     ;  Used for atoms
          "&" / "'" /
          "*" / "+" /
          "-" / "/" /
          "=" / "?" /
          "^" / "_" /
          "`" / "{" /
          "|" / "}" /
          "~"

I escaped the dollar sign, so this rule should be fixed now.
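To see how the rule with the escaped dollar sign behaves against the atext list above, here is a rough Python port. It is a sketch, not the SpamAssassin implementation: the pattern is copied from this bug with only the `$` backslashed, `invalid_msgid` mimics the `!~` test and the `[if-unset: <NO@MSGID>]` default, and `rejected_atext_chars` and the example.com addresses are made up for the check.

```python
import re
import string

# atext from RFC 2822, as quoted in this bug.
ATEXT = set(string.ascii_letters + string.digits + "!#$%&'*+-/=?^_`{|}~")

# Python port of the rule's pattern, with the dollar sign backslashed.
MSGID_RE = re.compile(
    r"^<(?:[a-zA-Z0-9.!\#\$%&'*\+\/=?\^_{}|~-]+|\".+\")"
    r"\@(?:[a-zA-Z0-9.-]+|\[\d{1,3}(?:\.\d{1,3}){3}\])>"
    r"(?:\s*\(.*\))?\s*$"
)


def invalid_msgid(message_id=None):
    """Return True when the rule would fire, i.e. the pattern does NOT match."""
    if message_id is None:
        message_id = "<NO@MSGID>"  # mirrors [if-unset: <NO@MSGID>]
    return MSGID_RE.search(message_id) is None


def rejected_atext_chars():
    """atext characters that make an otherwise plain Message-Id fire the rule."""
    return sorted(c for c in ATEXT if invalid_msgid(f"<a{c}b@example.com>"))


print(invalid_msgid("<abc$def@example.com>"))  # False
print(invalid_msgid())                         # False
print(invalid_msgid("<foo bar@example.com>"))  # True
print(rejected_atext_chars())                  # ['`']
```

In this port a `$` in the local part no longer trips the rule, and a missing header does not fire it either. The backtick is the one atext character the class does not accept; whether that matters on a real corpus is something this sketch cannot tell.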