Bug 7865

Summary: FUZZY_CREDIT rule fires for word Accreditation and all its translations
Product: Spamassassin
Reporter: info
Component: Translations and Languages
Assignee: SpamAssassin Developer Mailing List <dev>
Status: RESOLVED FIXED    
Severity: normal
CC: info, jhardin, rwmaillists
Priority: P2    
Version: unspecified   
Target Milestone: Undefined   
Hardware: PC   
OS: Mac OS X   
Whiteboard:
Attachments: Spam Test with different clients

Description info 2020-10-23 13:41:01 UTC
Created attachment 5727
Spam Test with different clients

I have a European client that works on the accreditation and quality assurance of universities, colleges and educational programs, a very serious and legitimate practice in the field of education. The word is also in the name of the organization in English, French, German and Italian (Akkreditierung, Accréditation, etc.). All their newsletters go to the spam folder when SpamAssassin is in use, or don't even arrive. Please create an exception for these words, if possible! Thanks in advance and best wishes.
Comment 1 RW 2020-10-23 13:58:22 UTC
score FUZZY_CREDIT 1.699 1.413 0.601 1.678

If everything is treated as spam then FUZZY_CREDIT is the tip of the iceberg.
Comment 2 info 2020-10-23 14:15:49 UTC
When one uses commercial bulk mailers, one always gets caught on a couple of issues, but FUZZY_CREDIT definitely gives it the final blow, and we cannot change the activity or the name of the company, for that matter ...
Comment 3 AXB 2020-10-24 05:38:00 UTC
Please provide a related sample message.
A screenshot showing some application's output is not enough to warrant further investigation.
Comment 4 RW 2020-10-24 15:52:49 UTC
I found that, of the words mentioned, only "Accréditation" produced the hit, as it contains "crédit". The rule contains an exception for "é", but only if it is encoded in ISO-8859-1, not UTF-8.
Comment 5 RW 2020-10-26 16:48:17 UTC
(In reply to RW from comment #4)
> The rule contains an exception for "é", but only if it is encoded in ISO-8859-1, not UTF-8.

By which I mean that "crédit", along with "kredit" and "credit", is supposed to be treated as non-obfuscated.

Whether or not anyone cares about Accréditation, it's worth fixing this to prevent FPs on "crédit" in Spanish and French.


/<inter W1>(?![ck]r(?:[e\xe9]|\xc3\xa9)dit)<C><R><E><D><I><T>/i
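For illustration only, here is a minimal Python sketch of what the extra `\xc3\xa9` alternative in the lookahead changes. It is not the actual rule: `FUZZY_BODY` is a hypothetical, much-simplified stand-in for the `<C><R><E><D><I><T>` replace tags, the `<inter W1>` prefix is left out, and the shape of the old exemption (`[e\xe9]` only) is an assumption based on comment 4. Matching is done on raw bytes, so the Latin-1 byte 0xE9 and the UTF-8 sequence 0xC3 0xA9 are distinct inputs.

```python
import re

# Hypothetical, simplified stand-in for the <C><R><E><D><I><T> replace tags:
# each position accepts the plain letter or one look-alike.
FUZZY_BODY = rb"[ck]r(?:[e3\xe9]|\xc3\xa9)d[i1]t"

# Assumed shape of the old exemption: plain "e" or Latin-1 e-acute (0xE9) only.
OLD_RULE = re.compile(rb"(?![ck]r[e\xe9]dit)" + FUZZY_BODY, re.IGNORECASE)

# Proposed exemption: additionally accepts UTF-8 e-acute (0xC3 0xA9).
NEW_RULE = re.compile(
    rb"(?![ck]r(?:[e\xe9]|\xc3\xa9)dit)" + FUZZY_BODY, re.IGNORECASE
)


def hits(rule, raw):
    """Return True if the rule matches anywhere in the raw message bytes."""
    return rule.search(raw) is not None


if __name__ == "__main__":
    samples = {
        "Accréditation (UTF-8)": "Accréditation".encode("utf-8"),
        "Accréditation (Latin-1)": "Accréditation".encode("latin-1"),
        "cr3dit (obfuscated)": b"cheap cr3dit now",
    }
    for label, raw in samples.items():
        print(label, "old:", hits(OLD_RULE, raw), "new:", hits(NEW_RULE, raw))
```

Under these assumptions, only the UTF-8 spelling behaves differently between the two patterns: the old lookahead does not recognise it as a plain spelling, so the fuzzy body matches, while the new lookahead exempts it. The obfuscated sample still hits with both.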
Comment 6 John Hardin 2020-10-29 17:08:22 UTC
Sending        svn/trunk/rules/25_replace.cf
Transmitting file data .done
Committing transaction...
Committed revision 1882973.