Bug 7256 - URI Host rule duplicates
Summary: URI Host rule duplicates
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Rules
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-10-20 17:54 UTC by RW
Modified: 2020-07-23 19:38 UTC

Description RW 2015-10-20 17:54:29 UTC
There look to be some rule duplicates:

body   URI_HOST_IN_BLACKLIST    eval:check_uri_host_in_blacklist()
header HEADER_HOST_IN_BLACKLIST eval:check_uri_host_listed('BLACK')

body   URI_HOST_IN_WHITELIST    eval:check_uri_host_in_whitelist()
header HEADER_HOST_IN_WHITELIST eval:check_uri_host_listed('WHITE')

check_uri_host_in_blacklist() is equivalent to check_uri_host_listed('BLACK') and the whitelist versions are analogous.
Comment 1 Joe Quinn 2015-10-20 18:01:07 UTC
They are header and body alternatives. If the eval calls are identical, it's a bit silly to be using different forms for each rule but otherwise they aren't duplicates.

Can you construct an example of where they should be different but are not?
Comment 2 RW 2015-10-20 18:59:26 UTC
I don't see any evidence in the code for separate header tests, and the result cache that check_uri_host_listed() uses has a single boolean per list.
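
To illustrate the point about the cache, here is a minimal Python sketch of a result cache keyed only by list name. It is a hypothetical model, not SpamAssassin's actual Perl code: the class name `UriHostChecker`, its attributes, and the suffix-matching logic are assumptions made for illustration. Only the eval function names come from the rules quoted above.

```python
# Hypothetical model of a per-list result cache; not SpamAssassin's Perl code.
class UriHostChecker:
    def __init__(self, lists, message_hosts):
        self.lists = lists                  # e.g. {"BLACK": {"org"}}
        self.message_hosts = message_hosts  # every URI host found in the message
        self.cache = {}                     # list name -> bool, no header/body split

    def check_uri_host_listed(self, list_name):
        # Compute the answer once per list, then reuse it for every caller.
        if list_name not in self.cache:
            domains = self.lists.get(list_name, set())
            self.cache[list_name] = any(
                host == dom or host.endswith("." + dom)
                for host in self.message_hosts
                for dom in domains
            )
        return self.cache[list_name]

    def check_uri_host_in_blacklist(self):
        # Named shortcut for the general function.
        return self.check_uri_host_listed("BLACK")

    def check_uri_host_in_whitelist(self):
        return self.check_uri_host_listed("WHITE")


checker = UriHostChecker({"BLACK": {"org"}}, ["spamassassin.apache.org"])
body_rule_hit = checker.check_uri_host_in_blacklist()     # as in the body rule
header_rule_hit = checker.check_uri_host_listed("BLACK")  # as in the header rule
print(body_rule_hit, header_rule_hit, checker.cache)
```

With one boolean per list, a cache like this has no way to record whether the match came from a header or from the body, so both rule forms necessarily return the same answer.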
Comment 3 Amir Caspi 2015-10-20 20:48:29 UTC
(In reply to Joe Quinn from comment #1)
> Can you construct an example of where they should be different but are not?

In the SA user list, I posted two spamples where HEADER_HOST_IN_BLACKLIST pops but the specified BL host is absolutely _not_ in the header.  This is, in fact, how this issue was first discovered.

Spample #1: http://pastebin.com/vpXAVjaH
Spample #2: http://pastebin.com/B3kFg4Xn

In both cases, the BL host appears in the body, not in the headers.  Thus, the eval is apparently not distinguishing between header and body, causing the HEADER rules to hit erroneously.
Comment 4 Karsten Bräckelmann 2017-10-18 23:23:15 UTC
You are both correct, RW and Amir.

The eval() rule is the same in URI_HOST_IN_BLACKLIST and its HEADER_* counterpart (one variant is just a named shortcut to the general check_uri_host_listed() function).

With eval() rules, defining the rule as header or body does NOT limit the eval function's scope. It only determines which part of the message the rule's score is attributed to.
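
A rough Python sketch of that behaviour, under the assumption that the eval function sees every URI host in the message and is never told which rule type invoked it. The `Message` and `EvalRule` types and the `run_rules` helper are hypothetical names for illustration, not SpamAssassin internals.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Message:
    header_hosts: List[str]
    body_hosts: List[str]


@dataclass
class EvalRule:
    name: str
    rule_type: str  # "header" or "body"; only labels the hit, never filters input
    func: Callable[[Message], bool]


BLACKLIST = {"org"}  # models blacklist_uri_host org


def check_uri_host_listed(msg: Message) -> bool:
    # Scans all hosts in the message, regardless of the calling rule's type.
    hosts = msg.header_hosts + msg.body_hosts
    return any(h == d or h.endswith("." + d) for h in hosts for d in BLACKLIST)


def run_rules(rules: List[EvalRule], msg: Message) -> Dict[str, str]:
    # Map each rule that hit to the message part its score is reported under.
    return {r.name: r.rule_type for r in rules if r.func(msg)}


RULES = [
    EvalRule("HEADER_HOST_IN_BLACKLIST", "header", check_uri_host_listed),
    EvalRule("URI_HOST_IN_BLACKLIST", "body", check_uri_host_listed),
]

# No header hosts at all, one URI host in the body.
msg = Message(header_hosts=[], body_hosts=["spamassassin.apache.org"])
hits = run_rules(RULES, msg)
print(hits)
```

In this model both rules hit on a message whose only URI is in the body, because the rule type is consulted only when labelling the hit.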

Minimal test case: an ad-hoc message with no headers, just the header-body separator and a URI in the body, fed to spamassassin with a user URI blacklist option covering the entire TLD .org.

$ echo -e "\n\n spamassassin.apache.org" |
  ./spamassassin -L --cf="blacklist_uri_host org"

Relevant header excerpt showing the rule hits as rendered by the _REPORT_ template tag:

X-Spam-Report: 
  *  100 HEADER_HOST_IN_BLACKLIST Host or Domain in header is listed in the
  *      user's URI black-list
  *      [URI: spamassassin.apache.org (org)]
  *  100 URI_HOST_IN_BLACKLIST BODY: Host or Domain is listed in the user's
  *      URI black-list
  *      [URI: spamassassin.apache.org (org)]


Removed HEADER_HOST_IN_BLACKLIST and *_WHITELIST rules. Committed to trunk and stable 3.4 branch respectively.

Sending        rules/60_whitelist.cf
Committed revision 1812594.

Sending        rules/60_whitelist.cf
Committed revision 1812595.

Closing RESOLVED FIXED.
Comment 5 Kevin A. McGrail 2020-07-23 19:38:35 UTC
svn commit -m 'small rule artifact fix bz 7256'
Sending        50_scores.cf
Transmitting file data .
Committed revision 1880225.