SA Bugzilla – Bug 4753
New encoding for listwashing addresses, not ROT-13
Last modified: 2006-12-05 11:59:32 UTC
Matt Kettler did an analysis of the current common encoding format for email addresses. This one is designed to bypass the simple ROT-13 check. They encode into a ROT-17 *reversed* alphabet. This isn't something that is trivially checkable with a standard rule, but should be reasonably trivial to catch in the eval code that does the current catch.

Matt sayeth:

Spammers have been embedding encoded versions of our email addresses in spam and web links for listwashing purposes for a long time. One of the early popular encodings was a variant of rot-13. More recently I've noticed a lot of the geocities exploit spams are using a new encoding.

Before:
mkettler@evi-inc.com

Encoded by rot-13 into:
zxrggyre^riv-vap(pbz

Now I'm seeing a lot using:
XZfQQYfS.fOb-bWh,hVX

Which I could tell was a simple character substitution, but not constant addition or XOR. So I did some digging and got more examples from NANAS posts:

http://groups.google.com/group/news.admin.net-abuse.sightings/msg/5724bf90fa6fae6e
http://groups.google.com/group/news.admin.net-abuse.sightings/msg/ba468b2d26bb3494
http://groups.google.com/group/news.admin.net-abuse.sightings/msg/570cfd7a517a598f

And built up an alphabet table. The results are amusing. The new version takes a backwards alphabet, and reverse-rotates it 17 characters.

Here's my table. I extrapolated the obvious for the items in ().

Plain -> encoded
------------
a    j
b    i
c    h
d    g
e    f
f    e
g    (d)
h    c
i    b
j    (a)
k    Z
l    Y
m    X
n    W
o    V
p    U
q    (T)
r    S
s    (R)
t    Q
u    (P)
v    O
w    (N)
x    M
y    L
z    K
------------

They're also using both upper and lower-case alphabets here, so you can continue the list at the top such that Z encodes to 'k'.

Cute eh?

I might suggest encoding up your own email domains into a body rule.
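
For anyone who wants to encode their own address or domain as suggested, here is a minimal Python sketch of the substitution as I read it from the table above. It is not taken from any spammer tool or from the SpamAssassin eval code. It assumes a 52-character alphabet (a-z followed by A-Z) in which the character at index i maps to index (9 - i) mod 52, which reproduces every row of the table and the "Z encodes to 'k'" remark. The punctuation mapping ('@' to '.', '.' to ',') comes only from the single example address, so treat it as a guess. The names ALPHABET, PUNCT and encode are made up for illustration.

```python
import string

# Assumed 52-character alphabet: lowercase first, then uppercase.
ALPHABET = string.ascii_lowercase + string.ascii_uppercase

# Punctuation substitutions seen in the one example address only (assumption).
PUNCT = {"@": ".", ".": ","}


def encode(address):
    """Apply the reversed, rotated substitution to each character."""
    out = []
    for ch in address:
        idx = ALPHABET.find(ch)
        if idx >= 0:
            # Reverse the alphabet and rotate: a -> j, j -> a, k -> Z, Z -> k.
            out.append(ALPHABET[(9 - idx) % len(ALPHABET)])
        else:
            # Non-letters: use the observed mapping, else leave unchanged.
            out.append(PUNCT.get(ch, ch))
    return "".join(out)


if __name__ == "__main__":
    print(encode("mkettler@evi-inc.com"))  # XZfQQYfS.fOb-bWh,hVX
```

Under this reading the letter mapping is its own inverse, so running encode() on an encoded string recovers the original letters (the punctuation does not round-trip). The output for a domain could then be escaped and dropped into a body rule.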
I don't think we're really going to be able to handle generic substitution cipher detection as a spam rule. :|