SA Bugzilla – Bug 7365
URIs containing parts of TLD .net receive URI_OBFU_WWW score
Last modified: 2018-08-26 21:33:30 UTC
Whenever an email contains a link with a URI like http://www.sci-net.de (this is our actual domain, where the error occurs), spamassassin returns a URI_OBFU_WWW score of 3.099. With some testing we found that the "-net" part of our domain is the key to this behavior. For testing purposes we changed the link in our email templates to http://www.scinet.de (without the "-"). Now the email isn't flagged by spamassassin anymore. It seems that spamassassin is confused by the "-net" in our domain name.
What I'm seeing is that http://www.sci-net.de doesn't hit the rule but https://www.sci-net.de and www.sci-net.de do. I think a reasonable compromise would be to change the look-behind assertion from (?<!http:\/\/) to (?<!:\/\/).
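To illustrate the effect of the proposed look-behind change, here is a minimal Python sketch. The HOST pattern is a simplified, hypothetical stand-in for the host-matching part of the rule; the real URI_OBFU_WWW regex is not quoted in this report and is likely more involved. Only the two look-behind assertions are taken from the comment above.

```python
import re

# ASSUMPTION: hypothetical, simplified stand-in for the host-matching part
# of the rule. This is NOT the actual URI_OBFU_WWW pattern.
HOST = r"www\.[a-z0-9]+-(?:com|net|org)\b"

# Current assertion: only skips hosts directly preceded by "http://".
OLD = re.compile(r"(?<!http://)" + HOST, re.I)
# Proposed assertion: skips hosts preceded by any "scheme://".
NEW = re.compile(r"(?<!://)" + HOST, re.I)

SAMPLES = [
    "http://www.sci-net.de",
    "https://www.sci-net.de",
    "www.sci-net.de",
]

def hits(pattern, text):
    """Return the matched text, or None if the pattern does not hit."""
    m = pattern.search(text)
    return m.group(0) if m else None

for s in SAMPLES:
    print(f"{s!r}: old={hits(OLD, s)!r} new={hits(NEW, s)!r}")
```

With this stand-in, the old assertion lets the https:// form through because the seven characters before "www" are "ttps://" rather than "http://". The shorter assertion excludes both schemes, while a bare www.sci-net.de still hits under either version.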
How is that getting 3+ points? The score limit has been 2.000 since last March...
The masscheck S/O is abysmal, it's not respecting the score limit (possibly due to the abysmal S/O) and I can't repro this behavior in my test environment - it should not even hit .de at all, because that's not one of the TLDs it's looking for. Disabling.

$ svn commit 20_uri_obfu_ws.cf
Sending svn/trunk/rulesrc/sandbox/jhardin/20_uri_obfu_ws.cf
Transmitting file data .done
Committing transaction...
Committed revision 1766914.
(In reply to John Hardin from comment #3)
> The masscheck S/O is abysmal, it's not respecting the score limit (possibly
> due to the abysmal S/O) and I can't repro this behavior in my test
> environment - it should not even hit .de at all, because that's not one of
> the TLDs it's looking for.

It's finding www.sci-net, or would do if you used https://www.sci-net.de or just www.sci-net.de. It's possible that the poor S/O is caused by the general switch from http to https, so that the look-behind assertion isn't avoiding the FPs any more. Once it's generalized to include https, the rule makes sense because people commonly drop the www part outside of a proper URL and just write the domain name, so most FPs on the aggressive host name matching are avoided.
I couldn't reproduce it and I'm not reluctant to try to tune a rule I can't get a hit for. But, I will restore it with the broader exclusion and see how it does in masscheck.
(In reply to John Hardin from comment #5)
> I couldn't reproduce it and I'm not reluctant to try to tune a rule I can't
> get a hit for.

Oops. That should be: I *am* reluctant to try to tune a rule I can't get a hit for. It didn't hit on any form of that URI. I reenabled it and made your recommended change, we'll see what masscheck says.

$ svn commit 20_uri_obfu_ws.cf
Sending svn/trunk/rulesrc/sandbox/jhardin/20_uri_obfu_ws.cf
Transmitting file data .done
Committing transaction...
Committed revision 1767032.
$ printf "\n\nhttps://www.sci-net.de" |spamassassin -D 2>&1 | grep -Eo "ran body rule URI_OBFU_WWW.*"
ran body rule URI_OBFU_WWW ======> got hit: "www.sci-net"
$ printf "\n\nhttp://www.sci-net.de" |spamassassin -D 2>&1 | grep -Eo "ran body rule URI_OBFU_WWW.*"
$ printf "\n\nwww.sci-net.de" |spamassassin -D 2>&1 | grep -Eo "ran body rule URI_OBFU_WWW.*"
ran body rule URI_OBFU_WWW ======> got hit: "www.sci-net"
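For anyone who wants to collect these results across several test messages, a small Python sketch for pulling the rule name and matched text out of such a debug line might look like this. It assumes the line format is exactly as shown in the output above; other SpamAssassin versions may print it differently, and parse_hit is a hypothetical helper name.

```python
import re

# ASSUMPTION: line format copied from the debug output quoted in this report.
HIT_LINE = re.compile(
    r'ran body rule (?P<rule>\S+) =+> got hit: "(?P<hit>[^"]*)"'
)

def parse_hit(line):
    """Return (rule_name, matched_text), or None if the line is not a hit."""
    m = HIT_LINE.search(line)
    return (m.group("rule"), m.group("hit")) if m else None

sample = 'ran body rule URI_OBFU_WWW ======> got hit: "www.sci-net"'
print(parse_hit(sample))
```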
The rule is not being published. Closing.