Bug 6807

Summary: TVD_RCVD_SINGLE regex line
Product: Spamassassin Reporter: Jake <stoked10>
Component: Rules Assignee: SpamAssassin Developer Mailing List <dev>
Status: NEW ---    
Severity: normal CC: kmcgrail
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   
Whiteboard:

Description Jake 2012-06-13 12:23:06 UTC
RFCs (1034 and 1035) 
http://tools.ietf.org/html/rfc1034
http://tools.ietf.org/html/rfc1035

state that a domain name can consist of lower- and upper-case characters and that it is to be treated as case-insensitive. However, TVD_RCVD_SINGLE in 72_active.cf only accounts for lower-case letters in domain names:

header TVD_RCVD_SINGLE Received =~ /^from\s+(?!localhost)[^\s.a-z0-9-]+\s/

I believe it should be:

header TVD_RCVD_SINGLE Received =~ /^from\s+(?!localhost)[^\s.a-zA-Z0-9-]+\s/
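
To make the difference concrete, here is a minimal sketch in Python (not SpamAssassin itself) that runs both patterns against a few made-up Received values. The sample strings and the names CURRENT, PROPOSED and fires are illustrative assumptions. For these ASCII samples Python's re module should treat the pattern the same way Perl does.

```python
import re

# Rule as quoted from 72_active.cf, and the variant proposed in this report.
CURRENT = re.compile(r"^from\s+(?!localhost)[^\s.a-z0-9-]+\s")
PROPOSED = re.compile(r"^from\s+(?!localhost)[^\s.a-zA-Z0-9-]+\s")


def fires(pattern, received):
    """Return True if the pattern matches the given Received value."""
    return pattern.search(received) is not None


# Hypothetical Received values, invented for illustration only.
samples = [
    "from MYPC (unknown [192.0.2.1]) by mx.example.com",
    "from mypc (unknown [192.0.2.1]) by mx.example.com",
    "from MyPC (unknown [192.0.2.1]) by mx.example.com",
]

for s in samples:
    print(fires(CURRENT, s), fires(PROPOSED, s), s)
```

Because the character class is negated, the current pattern fires only on the all-upper-case name "MYPC"; the lower-case and mixed-case names do not match. The proposed pattern fires on none of the three.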

There's probably more that this change should be expanded to; this is just the one that I see hitting a lot. Please advise if I missed the mark here or if this is done for a specific reason.

Thanks!

-Jake
Comment 1 Mark Martinec 2012-06-26 19:41:49 UTC
> state that a domain name can consist of lower and upper case characters and
> that it is to be case insensitive. However TVD_RCVD_SINGLE in 72_active.cf
> only checks for all lower case domain names:
> header TVD_RCVD_SINGLE Received =~ /^from\s+(?!localhost)[^\s.a-z0-9-]+\s/
> I believe it should be:
> header TVD_RCVD_SINGLE Received =~ /^from\s+(?!localhost)[^\s.a-zA-Z0-9-]+\s/

You are right about case insensitivity of domain names.
But RFC 5321 also states that the EHLO name should be an FQDN
or an address literal:

  The domain name given in the EHLO command MUST be either a primary
  host name (a domain name that resolves to an address RR) or, if
  the host has no name, an address literal

So even if a domain name is all capitals, it should contain
at least one dot, thus saving it from the TVD_RCVD_SINGLE rule.
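
The point about the dot can be checked with a small Python sketch (an approximation of the rule, not SpamAssassin itself). The Received values below are invented, and the helper name hits_single is an assumption made for this illustration.

```python
import re

# TVD_RCVD_SINGLE pattern as quoted above.
RULE = re.compile(r"^from\s+(?!localhost)[^\s.a-z0-9-]+\s")


def hits_single(received):
    """Return True if the rule pattern matches the Received value."""
    return RULE.search(received) is not None


# Hypothetical HELO names: an upper-case FQDN, an address literal,
# and an upper-case name with no dot at all.
fqdn = "from MAIL.EXAMPLE.COM (unknown [192.0.2.1]) by mx.example.com"
literal = "from [192.0.2.1] (unknown [192.0.2.1]) by mx.example.com"
single = "from MAILHOST (unknown [192.0.2.1]) by mx.example.com"

print(hits_single(fqdn), hits_single(literal), hits_single(single))
```

The dot (or a digit, in the address literal) stops the character class before any whitespace is reached, so only the dotless upper-case name matches.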

> There's probably more that this change should be expanded to, this is just
> the one that I see hitting a lot. Please advise if I missed the mark here or
> if this is done for a specific reason.

The rule is inexact one way or the other. I don't know how much good
it does in QA tests. It would be a problem if a perfectly valid Received
header field fired it, but apparently that is not the case.
Comment 2 Mark Martinec 2012-06-26 23:51:14 UTC
> header TVD_RCVD_SINGLE Received =~ /^from\s+(?!localhost)[^\s.a-zA-Z0-9-]+\s/

Btw, this rule only applies to the topmost Received header field
(because it uses the ^ anchor but is missing the /m flag), i.e. it only
applies to the Received field added by our own MTA in some setups.
I doubt that was the intention; I guess it should be processing
all Received header fields.
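
A minimal Python sketch of the anchor behaviour, assuming (as this comment implies) that the regex sees the values of all Received fields as one newline-separated string, topmost first. The header values are made up; re.MULTILINE is Python's counterpart of Perl's /m.

```python
import re

PATTERN = r"^from\s+(?!localhost)[^\s.a-z0-9-]+\s"

# Assumption: every Received value joined into one string, one per line.
received = (
    "from mail.example.com (mail.example.com [192.0.2.10]) by mx.example.net\n"
    "from MYPC (unknown [198.51.100.7]) by mail.example.com\n"
)

# Without the multiline flag, ^ matches only at the very start of the string,
# so only the topmost value is ever tested.
without_m = re.search(PATTERN, received)

# With the multiline flag, ^ also matches after each newline,
# so the second value is tested as well.
with_m = re.search(PATTERN, received, re.MULTILINE)

print(without_m is not None, with_m is not None)
```

In this sketch the second value is only reached when the multiline flag is set; without it the search returns no match.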

There are a couple of other similar rules using ^ or $ anchors
for a header field that can appear multiple times in a header,
but are missing the /m regexp modifier flag:

20_ratware.cf:

header RATWARE_RCVD_PF Received =~ / \(Postfix\) with ESMTP id [^;]+\; \S+ \d+ \S+ \d+ \d+:\d+:\d+ \S+$/s

72_active.cf:

header TVD_RCVD_IP Received =~ /^from\s+(?:\d+[^0-9a-zA-Z\s]){3}\d+[.\s]/
header TVD_RCVD_IP4 Received =~ /^from\s+(?:\d+\.){3}\d+\s/
header TVD_RCVD_SINGLE Received =~ /^from\s+(?!localhost)[^\s.a-z0-9-]+\s/
header __FSL_HELO_USER_2 Received =~ /from User(?:\s+by|\s*\(|$)/i

Should these be fixed?
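
The same applies to the $ anchor. A small Python sketch using the __FSL_HELO_USER_2 pattern, again assuming the Received values arrive as one newline-separated string. The first value is deliberately contrived so that it ends in "from User"; re.IGNORECASE stands in for /i.

```python
import re

PATTERN = r"from User(?:\s+by|\s*\(|$)"

# Assumption: two invented Received values, joined by newlines.
received = (
    "from User\n"
    "from mail.example.org (mail.example.org [192.0.2.20]) by mx.example.net\n"
)

# Without the multiline flag, $ matches only at the end of the whole string
# (or just before its final newline), not at the end of the first value.
no_m = re.search(PATTERN, received, re.IGNORECASE)

# With the multiline flag, $ matches before every newline.
m = re.search(PATTERN, received, re.IGNORECASE | re.MULTILINE)

print(no_m is not None, m is not None)
```

Here the end-of-line alternative can only succeed on the first value when the multiline flag is present.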
Comment 3 Kevin A. McGrail 2013-06-21 16:27:50 UTC
This is a discussion about a rule, which doesn't appear to require a target milestone.