Bug 7366 - Further tests for message claiming to be but isn't
Summary: Further tests for message claiming to be but isn't
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Rules
Version: 3.4.1
Hardware: PC Linux
Importance: P2 normal
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-10-31 18:39 UTC by Philip Prindeville
Modified: 2020-01-14 22:42 UTC
CC List: 6 users



Attachment | Type | Status | Submitter/CLA Status
Supposedly 7bit message with ISO-8859-1 characters in it | message/rfc822 | None | Philip Prindeville [HasCLA]
Rule to match text/plain message w/ 7bit CTE and 8bit bodies | text/plain | None | Philip Prindeville [HasCLA]

Description Philip Prindeville 2016-10-31 18:39:37 UTC
Created attachment 5417
Supposedly 7bit message with ISO-8859-1 characters in it

This is a follow-on to bug 7063, for the degenerate case of a non-multipart message.

Attaching a rule to catch this, plus a message we received that corresponds to this rule.
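
For illustration only, the condition described here can be approximated outside SpamAssassin. The Python sketch below is not the attached rule. The function name `claims_7bit_but_has_8bit` is hypothetical, and the sketch assumes the target is a non-multipart text/plain message that explicitly declares `Content-Transfer-Encoding: 7bit` yet carries bytes above 0x7F in its body.

```python
import email


def claims_7bit_but_has_8bit(raw: bytes) -> bool:
    """Return True if a non-multipart text/plain message declares a
    7bit transfer encoding but its body contains 8-bit bytes.

    Assumption: only an explicit "7bit" declaration counts; a missing
    Content-Transfer-Encoding header is not treated as a mismatch.
    """
    msg = email.message_from_bytes(raw)

    # The degenerate case discussed here is the non-multipart message.
    if msg.is_multipart() or msg.get_content_type() != "text/plain":
        return False

    cte = str(msg.get("Content-Transfer-Encoding", "")).strip().lower()
    if cte != "7bit":
        return False

    # For a 7bit CTE, decode=True returns the body bytes unchanged.
    body = msg.get_payload(decode=True) or b""
    return any(byte >= 0x80 for byte in body)
```

A message whose body contains an ISO-8859-1 byte such as 0xE9 under a 7bit declaration would be flagged, while the same body declared as 8bit would not.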
Comment 1 Philip Prindeville 2016-10-31 18:40:40 UTC
Created attachment 5418
Rule to match text/plain message w/ 7bit CTE and 8bit bodies
Comment 2 Dave Jones 2018-08-26 23:49:47 UTC
Testing this in my sandbox area of the ruleqa masscheckers.
Comment 3 Castro B 2019-06-18 12:07:16 UTC
Has this issue already been fixed? Any current update?
Thanks!

Castro B,
https://w.gratisdatingsite.nl/
Comment 4 Henrik Krohns 2019-10-02 12:01:18 UTC
Ruleqa looks ok for L_8BIT_MISMATCH, though my corpus is full of FPs for it - mostly from one specific automatic ordering system, but many other automatic messages too.

Is Dave alive? You should publish it.. maybe limit score to 2 or so..
Comment 5 Henrik Krohns 2020-01-14 15:11:23 UTC
Publishing as CTE_8BIT_MISMATCH

Sending        rulesrc/sandbox/davej/20_non_ascii.cf
Transmitting file data .done
Committing transaction...
Committed revision 1872783.
Comment 6 RW 2020-01-14 17:26:55 UTC
(In reply to Henrik Krohns from comment #5)
> Publishing as CTE_8BIT_MISMATCH

The scored rule was renamed but __L_CTE_7BIT, __L_CTE_8BIT and __L_BODY_8BITS are left.

I'm guessing that L stands for local, and I think it would be a good idea to make it policy that L_ is reserved for local rules.
Comment 7 Henrik Krohns 2020-01-14 19:13:32 UTC
(In reply to RW from comment #6)
>
> I'm guessing that L stands for local, and I think it would be a good idea to
> make it policy that L_ is reserved for local rules.

Documentation only specifies T_. I really couldn't be bothered, no one even sees those.
Comment 8 RW 2020-01-14 22:27:44 UTC
It's not about anyone seeing them, it's about avoiding name collisions between the core rules and local rules. At the moment there's no standard way of doing this. Some people use long prefixes, but these reduce readability and bloat meta-rules. And there's no guarantee that someone won't pick the same prefix and contribute their rules to core.

L_  seems like the ideal candidate for a reserved prefix.
Comment 9 John Hardin 2020-01-14 22:42:48 UTC
(In reply to RW from comment #8)
> It's not about anyone seeing them, it's about avoiding name collisions
> between the core rules and local rules. At the moment there's no standard
> way of doing this. Some people use long prefixes, but these reduce
> readability and bloat meta-rules. And there's no guarantee that someone won't
> pick the same prefix and contribute their rules to core.
> 
> L_  seems like the ideal candidate for a reserved prefix.

Along with __L_ for subrules.

+1 from me. We can even add build tooling to ensure none such ever get added to the base rules.
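
A rough sketch of what such a build-time check might look like, in Python. This is not existing SpamAssassin tooling. The function name `find_reserved_names` and the keyword list are assumptions for illustration, and the list of rule-definition keywords is not exhaustive.

```python
import re

# Assumption: a non-exhaustive set of keywords that start a rule definition.
RULE_KEYWORDS = ("header", "body", "rawbody", "full", "uri", "meta", "mimeheader")

# Match a rule definition whose name starts with L_ or __L_.
RESERVED = re.compile(
    r"^[ \t]*(?:" + "|".join(RULE_KEYWORDS) + r")[ \t]+((?:__)?L_\w+)",
    re.MULTILINE,
)


def find_reserved_names(cf_text: str) -> list[str]:
    """Return the sorted, de-duplicated rule names in cf_text that use
    the proposed reserved prefixes L_ (rules) or __L_ (subrules)."""
    return sorted(set(RESERVED.findall(cf_text)))
```

A build step could run this over each base rules file and fail if the returned list is non-empty. Commented-out lines are ignored because the keyword must be the first token on the line.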