Bug 7760

Summary: Document aux_tld / util_rb_2tld usage for Mail::SpamAssassin::Plugin::URIDNSBL
Product: Spamassassin Reporter: Bernhard Schmidt <berni>
Component: Documentation Assignee: SpamAssassin Developer Mailing List <dev>
Status: NEW ---    
Severity: normal CC: kmcgrail
Priority: P2    
Version: 3.4.2   
Target Milestone: Undefined   
Hardware: PC   
OS: Linux   
Whiteboard:

Description Bernhard Schmidt 2019-10-11 15:14:27 UTC
The documentation at https://spamassassin.apache.org/full/3.4.x/doc/Mail_SpamAssassin_Plugin_URIDNSBL.html states:

---
An RHSBL zone is one where the domain name is looked up, as a string; e.g. a URI using the domain foo.com will cause a lookup of foo.com.uriblzone.net. Note that hostnames are stripped from the domain used in the URIBL lookup, so the domain foo.bar.com will look up bar.com.uriblzone.net, and foo.bar.co.uk will look up bar.co.uk.uriblzone.net.
---

However, the algorithm is not described anywhere, and the behaviour seemed nondeterministic: we had weebly.com in the URIBL for hosting phishing sites and it did not match, and tcpdump showed that the full hostname was being queried. It took me a while to figure out that this is controlled by the RegistrarBoundaries in the 20_aux_tlds.cf ruleset.
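
To illustrate the behaviour described above, here is a rough Python sketch of how the lookup name might be derived. It is not the SpamAssassin implementation. The function names (registered_domain, rhsbl_query) are made up for this example, and the two-level table is a stand-in for whatever util_rb_2tld entries are loaded from 20_aux_tlds.cf. The sketch assumes weebly.com is among those entries, which the tcpdump observation suggests.

```python
# Hypothetical sketch, NOT the actual SpamAssassin code: it only models the
# documented behaviour plus the effect of 2TLD / 3TLD entries.

def registered_domain(hostname, two_level, three_level=frozenset()):
    """Strip host labels, keeping one label in front of the longest known suffix."""
    labels = hostname.lower().rstrip(".").split(".")
    # Try the longer (3-level) suffixes first, then the 2-level ones.
    for suffix_len, table in ((3, three_level), (2, two_level)):
        if len(labels) > suffix_len and ".".join(labels[-suffix_len:]) in table:
            return ".".join(labels[-(suffix_len + 1):])
    # Default: treat the last label as the TLD and keep one label before it.
    return ".".join(labels[-2:])


def rhsbl_query(hostname, zone, two_level, three_level=frozenset()):
    """Build the DNS name that would be looked up in an RHSBL zone."""
    return registered_domain(hostname, two_level, three_level) + "." + zone


# Assumed table contents, standing in for util_rb_2tld entries.
TWO_LEVEL = {"co.uk", "weebly.com"}

print(rhsbl_query("foo.bar.com", "uriblzone.net", TWO_LEVEL))
print(rhsbl_query("foo.bar.co.uk", "uriblzone.net", TWO_LEVEL))
print(rhsbl_query("phish.weebly.com", "uriblzone.net", TWO_LEVEL))
```

Under these assumptions the first two calls reproduce the examples from the documentation (bar.com.uriblzone.net and bar.co.uk.uriblzone.net), while the third queries phish.weebly.com.uriblzone.net. A URIBL entry for plain weebly.com would therefore never match a subdomain, which is consistent with what we saw.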
Comment 1 Kevin A. McGrail 2019-10-11 17:57:15 UTC
Agreed that our 2TLD and 3TLD algorithms might need better documentation.  Patches welcome.