SA Bugzilla – Bug 5802
Enhanced bare URI parsing: www(dot)prnceleb(dot)com
Last modified: 2019-08-09 14:53:44 UTC
It seems spammers are now trying to work around bare URI parsing with a number of tricks. One of those is the use of (dot), as in www(dot)prnceleb(dot)com. Perhaps it would be worthwhile to modify the bare URI parser to treat (dot) as a period and see what the result is.
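To illustrate the suggestion above, here is a minimal Python sketch of treating (dot) as a period before URI extraction. It is not SpamAssassin's parser (which is written in Perl); the function name `deobfuscate_dots` is hypothetical, and swallowing whitespace around the token is an assumption made for this example.

```python
import re

# Hypothetical pre-processing step: rewrite "(dot)" as "." so that an
# ordinary bare-URI parser can see the hostname. Optional whitespace on
# either side of the token is swallowed as well (an assumption).
DOT_TOKEN = re.compile(r"\s*\(dot\)\s*", re.IGNORECASE)

def deobfuscate_dots(text: str) -> str:
    """Return text with every "(dot)" token replaced by a period."""
    return DOT_TOKEN.sub(".", text)
```

A step like this also rewrites innocent prose that happens to contain "(dot)", so the result would still need to be checked against a list of valid TLDs before being treated as a URI.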
Having just totally rewritten the bare URI parser over the past week, I would rather first see the results of a sandbox rule that counts how many strings like that are around, to see if this is worth pursuing. I bet we don't even have to use the full TLD regexp that is in PerMsgStatus for this. Off the top of my head, something like /\bwww(\s{0,4}\(dot\)\s{0,4}[a-z\d-]{1,30}){1,4}\b/i would do, and if there are a significant number of hits we can refine it by checking for a good TLD at the end.
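As a rough way to try the proposed pattern outside SpamAssassin, the sketch below compiles the same regexp in Python and counts how many message bodies it hits. The names `SANDBOX_RE` and `count_hits` are hypothetical, and this is only an approximation of what a sandbox rule plus a mass-check run would report.

```python
import re

# The regexp from the comment above, unchanged; Python's re module
# accepts the same syntax, with /i expressed as re.IGNORECASE.
SANDBOX_RE = re.compile(
    r"\bwww(\s{0,4}\(dot\)\s{0,4}[a-z\d-]{1,30}){1,4}\b",
    re.IGNORECASE,
)

def count_hits(bodies):
    """Count how many message bodies contain at least one match."""
    return sum(1 for body in bodies if SANDBOX_RE.search(body))
```

Note that the pattern only recognises the literal (dot) token, so a variant written with square brackets, such as www[dot]example[dot]org, would not be counted.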
Today I received a spam containing only the following text in its body:
Let Your Creativity Flow - Experiment With Bikilni Waxnig Styles.www[dot]med22[dot]org
It hit basically none of SpamAssassin's own tests, even though www.med22.org is some kind of online drugstore.
Subject: spatialise
X-Spam-Status: No, score=2.9 required=4.4 tests=BAYES_60,BOTNET,GREYLIST_ISWHITE autolearn=no version=3.2.5
Huh, there's a bug for this? It's been a recurring issue, with months of silence in between, and each wave usually died off quite quickly. Anyway, just catching static patterns, or things like (dot), doesn't cut it, as the recent wave of ever-changing obfuscation has shown. Oh, yeah, and I actually have been working on this. Stay tuned! :)
Closing old stale bug. There's no point playing whack-a-mole like this with the schemeless parser; there are a million ways you can write a non-linkified URL.