Bug 7891 - spamassassin expands domain by wrong TLD
Summary: spamassassin expands domain by wrong TLD
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: spamassassin (show other bugs)
Version: 3.4.4
Hardware: PC Linux
: P2 normal
Target Milestone: 3.4.7
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-03-15 07:49 UTC by Tobi
Modified: 2021-04-18 15:13 UTC (History)
2 users (show)



Attachment Type Modified Status Actions Submitter/CLA Status

Note You need to log in before you can comment on or make changes to this bug.
Description Tobi 2021-03-15 07:49:06 UTC
spamassassin expands some domains by a .com TLD although the domain without .com is perfectly valid (uncommon but valid).

To reproduce: add a domain like www.ch into a mail and see spamassassin's debug output

```
spamassassin -D </tmp/redacted 2>&1|egrep 'URIHOSTS|URIDOMAINS'
Mar 15 08:43:09.824 [32657] dbg: check: tagrun - tag URIHOSTS is now ready, value: ARY:[www.ch,www.ch.com]
Mar 15 08:43:09.825 [32657] dbg: check: tagrun - tag URIDOMAINS is now ready, value: ARY:[ch.com,www.ch]
```

in the message only www.ch is present but SA expands it by .com and then checks domains completely not related to this message (ch.com and www.ch.com)

If a scheme is prepended to the uri link (ex http://www.ch) then SA does **mot** append the .com TLD
Comment 1 Henrik Krohns 2021-03-15 08:24:42 UTC
Yeah Util.pm / uri_list_canonicalize should probably only add .com to unknows TLDs.

      # http://foobar -> http://www.foobar.com as Firefox does (Bug 6596)
      # (do this here so we don't trip on those 0x123 IPs etc..)
      # https://hg.mozilla.org/mozilla-central/file/tip/docshell/base/nsDefaultURIFixup.cpp
      elsif ($proto eq 'http://' && $auth eq '' &&
             $nhost ne 'localhost' && $port eq '80' &&
             $nhost =~ /^(?:www\.)?([^.]+)$/) {
        push(@nuris, join('', $proto, 'www.', $1, '.com', $rest));

Schemeless uris are already known as valid, but this function has no knowledge if the uri was originally schemeless or not. This function also does not have any access to is_domain_valid function since Util.pm is standalone module.

I guess we could pass $self->{main}->{registryboundaries} as argument to uri_list_canonicalize, so we can use it for this purpose.

sub uri_list_canonicalize {
  my($redirector_patterns, @uris, $rb) = @_;
Comment 2 Henrik Krohns 2021-03-15 08:42:10 UTC
(In reply to Henrik Krohns from comment #1)
> sub uri_list_canonicalize {
>   my($redirector_patterns, @uris, $rb) = @_;

Uh we can't do this since @uris is not passed as reference *facepalm*.. kinda hard to fix this without breaking compatibility (if we care about it in this case).
Comment 3 Henrik Krohns 2021-03-15 09:12:23 UTC
uri_list_canonicalize now detects if it's called by new usage style and will use $rb to check for valid domain in this case.

  my @uris;
  my $rb;
  if (ref($_[0]) eq 'ARRAY') {
    # New call style:
    # - reference to array of redirector_patterns
    # - reference to array of URIs
    # - reference to $self->{main}->{registryboundaries}
    @uris = @{$_[0]};
    $rb = $_[1];
  } else {
    # Old call style:
    # - reference to array of redirector_patterns
    # - rest of the arguments is list of uris
    @uris = @_;
  }



Committed to trunk. 3.4 is in RTC mode, needs votes for possible commit.

Sending        trunk/lib/Mail/SpamAssassin/PerMsgStatus.pm
Sending        trunk/lib/Mail/SpamAssassin/Util.pm
Sending        trunk/t/uri.t
Transmitting file data ...done
Committing transaction...
Committed revision 1887669.
Comment 4 Sidney Markowitz 2021-04-08 09:31:12 UTC
(In reply to Henrik Krohns from comment #3)
> Committed to trunk. 3.4 is in RTC mode, needs votes for possible commit.

As per discussion on dev list, changing milestone from 3.4.6 to 4.0.0.

I'm leaving it open until after the 3.4.6 release, as then Henrik, you can decide if you want to commit it to the 3.4 branch and close it with a 3.4.7 milestone, or just close it as Fixed for 4.0.0.
Comment 5 Henrik Krohns 2021-04-18 15:13:43 UTC
Backported to 3.4 as I feel it's simple and safe.

Sending        spamassassin-3.4/lib/Mail/SpamAssassin/PerMsgStatus.pm
Sending        spamassassin-3.4/lib/Mail/SpamAssassin/Util.pm
Sending        spamassassin-3.4/t/uri.t
Transmitting file data ...done
Committing transaction...
Committed revision 1888907.