7366 – Further tests for message claiming to be but isn't

Status:	RESOLVED FIXED

Alias:	None

Product:	Spamassassin
Classification:	Unclassified
Component:	Rules
Version:	3.4.1
Hardware:	PC Linux

Importance:	P2 normal
Target Milestone:	Undefined
Assignee:	SpamAssassin Developer Mailing List

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2016-10-31 18:39 UTC by Philip Prindeville
Modified:	2020-01-14 22:42 UTC
CC List:	6 users

Attachment	Type	Actions	Submitter/CLA Status
Supposedly 7bit message with ISO-8859-1 characters in it	message/rfc822	None	Philip Prindeville
Rule to match text/plain message w/ 7bit CTE and 8bit bodies	text/plain	None	Philip Prindeville

Description Philip Prindeville 2016-10-31 18:39:37 UTC

Created attachment 5417
Supposedly 7bit message with ISO-8859-1 characters in it

This is a follow-on to bug 7063, for the degenerate case of a non-multipart message.

Attaching a rule to catch this, plus a message we received that corresponds to this rule.

Comment 1 Philip Prindeville 2016-10-31 18:40:40 UTC

Created attachment 5418
Rule to match text/plain message w/ 7bit CTE and 8bit bodies
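
For readers without access to the attachment, the condition it targets can be illustrated outside SpamAssassin. The following is only a minimal Python sketch, not the attached rule: the function name `cte_8bit_mismatch` is hypothetical, and it assumes that "mismatch" means a non-multipart text/plain message that explicitly declares `Content-Transfer-Encoding: 7bit` while its body contains at least one byte with the high bit set.

```python
from email import message_from_bytes


def cte_8bit_mismatch(raw: bytes) -> bool:
    """Hypothetical helper: True if a non-multipart text/plain message
    declares a 7bit transfer encoding but carries 8-bit bytes in its body."""
    msg = message_from_bytes(raw)

    # Only the degenerate, non-multipart case is considered here.
    if msg.is_multipart() or msg.get_content_type() != "text/plain":
        return False

    # Assumption: only an explicit "7bit" declaration counts as a claim.
    cte = str(msg.get("Content-Transfer-Encoding", "")).strip().lower()
    if cte != "7bit":
        return False

    # For a 7bit CTE, decode=True returns the body bytes unchanged.
    body = msg.get_payload(decode=True) or b""

    # Any byte >= 0x80 contradicts the 7bit declaration.
    return any(b >= 0x80 for b in body)
```

A message such as attachment 5417, which declares 7bit but contains ISO-8859-1 characters, is the kind of input this check is meant to flag.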

Comment 2 Dave Jones 2018-08-26 23:49:47 UTC

Testing this in my sandbox area of the ruleqa masscheckers.

Comment 3 Castro B 2019-06-18 12:07:16 UTC

Has this issue been fixed yet? Is there any update?
Thanks!

Castro B,
https://w.gratisdatingsite.nl/

Comment 4 Henrik Krohns 2019-10-02 12:01:18 UTC

Ruleqa looks ok for L_8BIT_MISMATCH, though my corpus is full of FPs for it - mostly from one specific automatic ordering system, but many other automatic messages too.

Is Dave alive? You should publish it, maybe limiting the score to 2 or so.

Comment 5 Henrik Krohns 2020-01-14 15:11:23 UTC

Publishing as CTE_8BIT_MISMATCH

Sending        rulesrc/sandbox/davej/20_non_ascii.cf
Transmitting file data .done
Committing transaction...
Committed revision 1872783.

Comment 6 RW 2020-01-14 17:26:55 UTC

(In reply to Henrik Krohns from comment #5)
> Publishing as CTE_8BIT_MISMATCH

The scored rule was renamed but __L_CTE_7BIT, __L_CTE_8BIT and __L_BODY_8BITS are left.

I'm guessing that L stands for local, and I think it would be a good idea to make it policy that L_ is reserved for local rules.

Comment 7 Henrik Krohns 2020-01-14 19:13:32 UTC

(In reply to RW from comment #6)
>
> I'm guessing that L stands for local, and I think it would be a good idea to
> make it policy that L_ is reserved for local rules.

Documentation only specifies T_. I really couldn't be bothered; no one even sees those.

Comment 8 RW 2020-01-14 22:27:44 UTC

It's not about anyone seeing them, it's about avoiding name collisions between the core rules and local rules. At the moment there's no standard way of doing this. Some people use long prefixes, but these reduce readability and bloat meta-rules. And there's no guarantee that someone won't pick the same prefix and contribute their rules to core.

L_ seems like the ideal candidate for a reserved prefix.

Comment 9 John Hardin 2020-01-14 22:42:48 UTC

(In reply to RW from comment #8)
> It's not about anyone seeing them, it's about avoiding name collisions
> between the core rules and local rules. At the moment there's no standard
> way of doing this. Some people use long prefixes, but these reduce
> readability and bloat meta-rules. And there's no guarantee that someone won't
> pick the same prefix and contribute their rules to core.
> 
> L_ seems like the ideal candidate for a reserved prefix.

Along with __L_ for subrules.

+1 from me. We can even add build tooling to ensure none such ever get added to the base rules.
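
The build-time check suggested above could look roughly like the following. This is a sketch under stated assumptions, not existing SpamAssassin tooling: the function name `find_reserved_names` is hypothetical, only rule definitions introduced by the keywords `header`, `body`, `rawbody`, `uri`, `full`, `meta` and `mimeheader` are recognised, and the reserved prefixes are taken to be `L_` and `__L_` as proposed in this thread.

```python
import re

# Rule-definition lines: a rule-type keyword followed by the rule name.
RULE_DEF = re.compile(
    r"^\s*(?:header|body|rawbody|uri|full|meta|mimeheader)\s+(\S+)"
)

# Prefixes proposed in this thread as reserved for local rules.
RESERVED = re.compile(r"^(?:__)?L_")


def find_reserved_names(cf_text: str) -> list:
    """Hypothetical lint helper: return rule names in a .cf file's text
    that use a prefix reserved for local rules."""
    hits = []
    for line in cf_text.splitlines():
        match = RULE_DEF.match(line)
        if match and RESERVED.match(match.group(1)):
            hits.append(match.group(1))
    return hits
```

A build step could run this over every file in the base rules and fail if the returned list is non-empty. Sub-rules left over from the sandbox, such as __L_CTE_7BIT, would be reported by such a check.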