Bug 7651 - Invalid domains in uri parser
Summary: Invalid domains in uri parser
Status: RESOLVED DUPLICATE of bug 7736
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Libraries
Version: SVN Trunk (Latest Devel Version)
Hardware: All All
Importance: P2 major
Target Milestone: 4.0.0
Assignee: SpamAssassin Developer Mailing List
Duplicates: 5317
 
Reported: 2018-11-05 15:18 UTC by Henrik Krohns
Modified: 2019-07-12 08:39 UTC

Description Henrik Krohns 2018-11-05 15:18:12 UTC
As discussed on the mailing list. Opening this to investigate what kinds of crap end up in URI lists, especially with the schemeless URI parser.

[a-z\d][a-z\d._-]{0,251}\.${tldsRE}

Seems a bit simple since it can match anything like a "1-------------------------------------------------------------------------------------------------------------.com".

Perhaps check hostname validity more carefully: allowed characters, individual label length (< 64), etc.
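
To illustrate the kind of per-label check suggested above, here is a minimal sketch in Python rather than SpamAssassin's Perl. It is not the code in trunk. TLDS_RE is a tiny stand-in for ${tldsRE}, is_plausible_hostname is a hypothetical name, and allowing underscores inside labels is an assumption taken from the mailing list discussion quoted below.

```python
import re

# Assumption: a tiny stand-in for ${tldsRE}; the real TLD list is far larger.
TLDS_RE = r"(?:com|net|org)"

# The schemeless pattern from this report, transcribed to Python syntax.
SIMPLE_RE = re.compile(rf"[a-z\d][a-z\d._-]{{0,251}}\.{TLDS_RE}", re.IGNORECASE)

# One label: 1 to 63 characters, letters, digits, hyphen or underscore,
# with no hyphen at either end.
LABEL_RE = re.compile(r"(?!-)[a-z\d_-]{1,63}(?<!-)", re.IGNORECASE)


def is_plausible_hostname(name: str) -> bool:
    """Hypothetical helper: check total length and every label separately."""
    if name.endswith("."):
        name = name[:-1]  # tolerate a single trailing root dot
    if not name or len(name) > 253:
        return False
    return all(LABEL_RE.fullmatch(label) for label in name.split("."))


# A string in the spirit of the example above: one digit, many hyphens, ".com".
junk = "1" + "-" * 100 + ".com"
simple_accepts_junk = SIMPLE_RE.fullmatch(junk) is not None
strict_accepts_junk = is_plausible_hostname(junk)
```

The simple pattern accepts the hyphen run, while the per-label check rejects it twice over: the label is longer than 63 characters and ends in a hyphen.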


On Mon, Nov 05, 2018 at 02:44:29PM +0000, RW wrote:
> On Sun, 04 Nov 2018 19:28:02 -0500
> Bill Cole wrote:
>
> > On 4 Nov 2018, at 16:27, Henrik K wrote:
> >
> > > Can someone actually register and use a domain with underscore in
> > > it?
> >
> > No.
> >
> ...
> > I support the concept of not treating domain-name-like strings that
> > are not valid hostnames as if they are URI domain-parts. That would
> > mean anything with an underscore. It MIGHT be more prudent to exempt
> > leading-underscore labels, as those can be legal domain names that
> > could have CNAME or DNAME records mapping them to working hostnames.
>
> I created an A-record at Namecheap for a_b.mydomain.tld and
> neither firefox nor chromium had a problem with it.
>
> I think the ideal would be to allow underscores when parsing-out domain
> names and then discard anything with an underscore in the registered
> part.

I've applied this to trunk.  Since it's mainly a problem with unnecessary
URIBL queries, that's what I've patched for now.  Need to ponder whether it's
OK to filter these out completely inside get_uri_detail_list internals.
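
One way to read RW's suggestion (accept underscores while parsing, then drop hosts whose registered part contains one) is sketched below in Python. This is not the change committed to URIDNSBL.pm. registered_part and keep_for_uribl are hypothetical names, and the suffix_labels argument is a crude stand-in for a real public-suffix lookup.

```python
def registered_part(host: str, suffix_labels: int = 1) -> str:
    """Hypothetical helper: the label left of the public suffix, plus the suffix.

    Assumption: the caller knows how many labels the public suffix has
    (1 for "com", 2 for "co.uk"); real code would consult a suffix list.
    """
    if host.endswith("."):
        host = host[:-1]  # tolerate a single trailing root dot
    labels = host.lower().split(".")
    return ".".join(labels[-(suffix_labels + 1):])


def keep_for_uribl(host: str, suffix_labels: int = 1) -> bool:
    """Skip the URIBL query when the registered part contains an underscore."""
    return "_" not in registered_part(host, suffix_labels)


# Underscore only in a subdomain label: still worth querying.
subdomain_ok = keep_for_uribl("a_b.mydomain.tld")
# Underscore in the registered part: cannot be a registered domain, so skip it.
registered_bad = keep_for_uribl("www.my_domain.tld")
```

Filtering only at query time leaves the parsed URI list untouched, which sidesteps the open question of whether such hosts should be removed from the URI list altogether.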


Sending        lib/Mail/SpamAssassin/Plugin/URIDNSBL.pm
Transmitting file data .done
Committing transaction...
Committed revision 1845807.
Comment 1 Henrik Krohns 2019-06-24 15:38:34 UTC
*** Bug 5317 has been marked as a duplicate of this bug. ***
Comment 2 Henrik Krohns 2019-06-24 15:39:07 UTC
Should look at valid hostname parser.
Comment 3 Henrik Krohns 2019-07-12 08:39:29 UTC
Continue in Bug 7736

*** This bug has been marked as a duplicate of bug 7736 ***