Bug 7865 - FUZZY_CREDIT rule fires for word Accreditation and all its translations
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Translations and Languages
Version: unspecified
Hardware: PC Mac OS X
Importance: P2 normal
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
Depends on:
Reported: 2020-10-23 13:41 UTC by info
Modified: 2020-10-29 17:08 UTC

Attachment Type Modified Status Actions Submitter/CLA Status
Spam Test with different clients image/png None info@picnic-terminal.ch [NoCLA]

Description info 2020-10-23 13:41:01 UTC
Created attachment 5727
Spam Test with different clients

I have a European client that works on the Accreditation and Quality Assurance of universities, colleges and educational programs, a serious and legitimate practice in the field of education. The word is also part of the organization's name in English, French, German and Italian (Akkreditierung, Accréditation, etc.). All their newsletters go to the spam folder when SpamAssassin is in use, or don't even arrive. Please create an exception for these words, if possible! Thanks in advance and best wishes.
Comment 1 RW 2020-10-23 13:58:22 UTC
score FUZZY_CREDIT 1.699 1.413 0.601 1.678

If everything is treated as spam then FUZZY_CREDIT is the tip of the iceberg.
Comment 2 info 2020-10-23 14:15:49 UTC
When you use commercial bulk mailers, you always get caught by a couple of rules, but FUZZY_CREDIT definitely deals the final blow, and we cannot change the activity or the name of the company, for that matter ...
Comment 3 AXB 2020-10-24 05:38:00 UTC
Please provide a related sample message.
A screenshot showing some application's output is not enough to warrant further investigation.
Comment 4 RW 2020-10-24 15:52:49 UTC
I found that, out of the words mentioned, only "Accréditation" produced the hit, as it contains "crédit". The rule contains an exception for "é", but only if it is encoded in ISO-8859-1, not UTF-8.
Comment 5 RW 2020-10-26 16:48:17 UTC
(In reply to RW from comment #4)
> The rule contains an exception for "é", but only if it's in ISO-8859-1, not UTF-8.

By which I mean that "crédit", along with "kredit" and "credit", is supposed to be treated as non-obfuscated.

Whether or not anyone cares about Accréditation, it's worth fixing this to prevent FPs on "crédit" in Spanish and French.

/<inter W1>(?![ck]r(?:[e\xe9]|\xc3\xa9)dit)<C><R><E><D><I><T>/i
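
To illustrate why the extra `\xc3\xa9` alternative matters, here is a minimal Python sketch that runs the old and new negative lookahead against raw bytes. It is not the real rule: `E` and `FUZZY` are hypothetical, heavily simplified stand-ins for the `<C><R><E><D><I><T>` replace tags (whose actual expansions in 25_replace.cf are much broader), `<inter W1>` is omitted, and `hits` is a helper invented for this example.

```python
import re

# Toy stand-in for the <E> replace tag (an assumption, not the real expansion):
# plain "e", "3", or an accented e as one Latin-1 byte or two UTF-8 bytes.
E = rb"(?:[e3\xe8-\xeb]|\xc3[\xa8-\xab])"
# Toy stand-in for <C><R><E><D><I><T>.
FUZZY = rb"[ck]r" + E + rb"dit"

# Old exception: only knows "é" as the single Latin-1 byte \xe9.
OLD = re.compile(rb"(?![ck]r[e\xe9]dit)" + FUZZY, re.I)
# New exception: also knows "é" as the UTF-8 byte pair \xc3\xa9.
NEW = re.compile(rb"(?![ck]r(?:[e\xe9]|\xc3\xa9)dit)" + FUZZY, re.I)

def hits(rule, text, encoding):
    """Return True if the rule matches the text encoded as raw bytes."""
    return rule.search(text.encode(encoding)) is not None

for enc in ("latin-1", "utf-8"):
    print(enc, "old:", hits(OLD, "Accréditation", enc),
          "new:", hits(NEW, "Accréditation", enc))
```

Under these assumptions the old pattern skips the Latin-1 form but still fires on the UTF-8 form, because the byte after "cr" is `\xc3` rather than `\xe9`; the new pattern skips both while still matching a genuinely obfuscated spelling such as "cr3dit".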
Comment 6 John Hardin 2020-10-29 17:08:22 UTC
Sending        svn/trunk/rules/25_replace.cf
Transmitting file data .done
Committing transaction...
Committed revision 1882973.