Bug 5802 - Enhanced bare URI parsing: www(dot)prnceleb(dot)com
Status: RESOLVED WONTFIX
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Rules
Version: SVN Trunk (Latest Devel Version)
Hardware: Other other
Importance: P4 enhancement
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
Reported: 2008-01-29 20:10 UTC by Loren Wilton
Modified: 2019-08-09 14:53 UTC
Description Loren Wilton 2008-01-29 20:10:44 UTC
It seems spammers are now trying to work around bare URI parsing with a number
of tricks.  One of those seems to be the use of (dot), as in
www(dot)prnceleb(dot)com.

Perhaps it would be worthwhile modifying the bare URI parser to treat (dot) as 
a period and see what the result is.
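
To make the suggestion concrete, here is a minimal sketch in Python. It is an illustration only, not SpamAssassin's bare URI parser (which is written in Perl), and the helper name `undot` is hypothetical. It rewrites each "(dot)" as a period so that an ordinary bare URI parser could then be run over the result.

```python
import re

# Hypothetical helper, not part of SpamAssassin: rewrite "(dot)", optionally
# padded with whitespace, as "." so a normal bare-URI parser can see the host.
DOT_WORD = re.compile(r"\s*\(dot\)\s*", re.IGNORECASE)

def undot(text: str) -> str:
    """Return text with every "(dot)" replaced by a literal period."""
    return DOT_WORD.sub(".", text)

print(undot("www(dot)prnceleb(dot)com"))  # prints: www.prnceleb.com
```

Swallowing the surrounding whitespace is a design choice made for this sketch; it would also rewrite ordinary prose that happens to contain "(dot)", so any real change would need to be measured against ham as well as spam.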
Comment 1 Sidney Markowitz 2008-01-29 21:10:31 UTC
Having just totally rewritten the bare URI parser over the past week, I would
rather first see the results of a sandbox rule that looks for how many strings
like that are around to see if this is worth pursuing. I bet we don't even have
to use the full TLD regexp that is in PerMsgStatus for this, just something like
 /\bwww(\s{0,4}\(dot\)\s{0,4}[a-z\d-]{1,30}){1,4}\b/i

off the top of my head, and if there are a significant number of hits refine it
by checking for a good TLD at the end.
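
As a rough way to try this outside SpamAssassin, the pattern above can be transcribed to Python and run over sample text. This is only a sketch under stated assumptions: the function name `find_dot_uris` is hypothetical, and `SAMPLE_TLDS` is a tiny illustrative subset standing in for the full TLD regexp in PerMsgStatus.

```python
import re

# The pattern from the comment above, transcribed to Python syntax.
DOT_URI = re.compile(
    r"\bwww(\s{0,4}\(dot\)\s{0,4}[a-z\d-]{1,30}){1,4}\b",
    re.IGNORECASE,
)

# Illustrative subset only (an assumption); a real check would use the
# full TLD list.
SAMPLE_TLDS = {"com", "net", "org", "info", "biz"}

def find_dot_uris(text):
    """Return (matched_text, has_known_tld) for each hit in text."""
    hits = []
    for m in DOT_URI.finditer(text):
        # group(1) holds the last "(dot)label" repetition; take the label.
        last_label = m.group(1).split(")")[-1].strip().lower()
        hits.append((m.group(0), last_label in SAMPLE_TLDS))
    return hits

print(find_dot_uris("see www(dot)prnceleb(dot)com now"))
```

Counting how many hits carry a known TLD corresponds to the refinement step described above. The pattern only recognizes the parenthesized form, so variants such as [dot] are not matched.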
Comment 2 spamassassin 2009-07-21 05:21:59 UTC
Today I received a spam containing only the following text in its body:
Let Your Creativity Flow - Experiment With Bikilni Waxnig Styles.www[dot]med22[dot]org

It hit essentially none of SpamAssassin's own tests, even though www.med22.org is some kind of online drugstore:
Subject: spatialise
X-Spam-Status: No, score=2.9 required=4.4 tests=BAYES_60,BOTNET,
     GREYLIST_ISWHITE autolearn=no version=3.2.5
Comment 3 Karsten Bräckelmann 2009-07-26 15:50:58 UTC
Huh, there's a bug for this?

It's been a recurring issue, with months of silence in between, and each wave has usually died off quite quickly. Anyway, just catching static patterns, or things like (dot), doesn't cut it, as the recent wave of ever-changing obfuscation has shown.

Oh, yeah, and I actually have been working on this. Stay tuned! :)
Comment 4 Henrik Krohns 2019-08-09 14:53:44 UTC
Closing old stale bug. There is no point playing whack-a-mole like this with the schemeless parser; there are a million ways to write a non-linkified URL.