Bug 4753 - New encoding for listwashing addresses, not ROT-13
Summary: New encoding for listwashing addresses, not ROT-13
Status: RESOLVED WONTFIX
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Rules (Eval Tests)
Version: SVN Trunk (Latest Devel Version)
Hardware: Other other
Importance: P5 enhancement
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
 
Reported: 2006-01-05 06:53 UTC by Loren Wilton
Modified: 2006-12-05 11:59 UTC



Description Loren Wilton 2006-01-05 06:53:49 UTC
Matt Kettler did an analysis of the current common encoding format for email 
addresses.  This one is designed to bypass the simple ROT-13 check.  They 
encode into a ROT-17 *reversed* alphabet.

This isn't trivially checkable with a standard rule, but it should be 
reasonably easy to catch in the eval code that performs the current 
ROT-13 check.

Matt sayeth:

Spammers have been embedding encoded versions of our email addresses in 
spam and web links for listwashing purposes for a long time.

One of the early popular encodings was a variant of rot-13.

More recently I've noticed a lot of the geocities exploit spams are using a 
new encoding.

Before:
mkettler@evi-inc.com
Encoded by rot-13 into:
zxrggyre^riv-vap(pbz
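
For reference, the older encoding can be reproduced with a short Python sketch. This is only an illustration, not SpamAssassin code: the function name is hypothetical, and the punctuation mapping ('@' to '^', '.' to '(') is an assumption inferred from the single example above.

```python
import codecs

# Assumption: punctuation mapping inferred from the one example in this
# report ('@' -> '^', '.' -> '('); everything else passes through unchanged.
PUNCT = str.maketrans({"@": "^", ".": "("})

def rot13_variant(address: str) -> str:
    """Apply standard ROT-13 to the letters, then the assumed punctuation swap."""
    return codecs.encode(address, "rot_13").translate(PUNCT)

print(rot13_variant("mkettler@evi-inc.com"))  # zxrggyre^riv-vap(pbz
```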

Now I'm seeing a lot using:
XZfQQYfS.fOb-bWh,hVX

Which I could tell was a simple character substitution, but not constant 
addition or XOR.
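
That observation can be checked mechanically. The Python sketch below is one possible way to do it (the variable names are illustrative, and the comparison is done on raw character codes, which is an assumption about what "constant addition" means here): it confirms that every plaintext letter always maps to the same encoded letter, and that neither the difference nor the XOR between the pairs is a single constant.

```python
plain = "mkettler@evi-inc.com"
coded = "XZfQQYfS.fOb-bWh,hVX"

# Compare letters only; punctuation is handled separately by the encoder.
pairs = [(p, c) for p, c in zip(plain, coded) if p.isalpha()]

# A simple substitution maps each plaintext letter to one fixed output letter.
mapping = {}
consistent = all(mapping.setdefault(p, c) == c for p, c in pairs)

# A constant-addition or constant-XOR scheme would give exactly one value here.
offsets = {ord(c) - ord(p) for p, c in pairs}
xors = {ord(c) ^ ord(p) for p, c in pairs}

print("consistent substitution:", consistent)
print("distinct offsets:", len(offsets), "distinct XOR values:", len(xors))
```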

So I did some digging and got more examples from NANAS posts:
http://groups.google.com/group/news.admin.net-abuse.sightings/msg/5724bf90fa6fae6e
http://groups.google.com/group/news.admin.net-abuse.sightings/msg/ba468b2d26bb3494
http://groups.google.com/group/news.admin.net-abuse.sightings/msg/570cfd7a517a598f

And built up an alphabet table. The results are amusing.

The new version takes a backwards alphabet, and reverse-rotates it 17 
characters.

Here's my table. I extrapolated the obvious for the items in ().
Plain -> encoded
------------

a       j
b       i
c       h
d       g
e       f
f       e
g       (d)
h       c
i       b
j       (a)
k       Z
l       Y
m       X
n       W
o       V
p       U
q       (T)
r       S
s       (R)
t       Q
u       (P)
v       O
w       (N)
x       M
y       L
z       K
------------

They're also using both upper and lower-case alphabets here, so you can 
continue the list at the top such that Z encodes to 'k'
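
The table, including the upper-case continuation, is consistent with one formula over the 52-letter alphabet a-z followed by A-Z: encoded index = (9 - plain index) mod 52. The Python sketch below implements that reading. It is an illustration under stated assumptions, not the spammers' actual code: the function names are hypothetical, and the punctuation mapping ('@' to '.', '.' to ',') is inferred from the single example address in this report.

```python
import string

# 52-letter alphabet: a-z followed by A-Z (consistent with the table above).
ALPHABET = string.ascii_lowercase + string.ascii_uppercase

# Assumption: punctuation mapping inferred from the one example in this report.
PUNCT_ENC = {"@": ".", ".": ","}
PUNCT_DEC = {v: k for k, v in PUNCT_ENC.items()}

def _swap_letter(ch: str) -> str:
    # index -> (9 - index) mod 52; applying it twice returns the original letter.
    return ALPHABET[(9 - ALPHABET.index(ch)) % len(ALPHABET)]

def encode(address: str) -> str:
    return "".join(
        _swap_letter(ch) if ch in ALPHABET else PUNCT_ENC.get(ch, ch)
        for ch in address
    )

def decode(text: str) -> str:
    return "".join(
        _swap_letter(ch) if ch in ALPHABET else PUNCT_DEC.get(ch, ch)
        for ch in text
    )

print(encode("mkettler@evi-inc.com"))  # XZfQQYfS.fOb-bWh,hVX
print(decode("XZfQQYfS.fOb-bWh,hVX"))  # mkettler@evi-inc.com
```

Because the letter mapping is its own inverse, the same helper serves both directions; only the punctuation needs a separate reverse table.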

Cute eh?

I might suggest encoding up your own email domains into a body rule.
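
One way to act on that suggestion is to generate the rule text with a small script. The Python sketch below is only an example: the rule name LOCAL_LISTWASH_DOMAIN, the score, and the helper names are hypothetical, and the encoder uses the (9 - index) mod 52 reading of the table plus the '.' to ',' punctuation mapping inferred from the example address. The output uses the standard body / describe / score rule lines.

```python
import string

ALPHABET = string.ascii_lowercase + string.ascii_uppercase
PUNCT_ENC = {"@": ".", ".": ","}  # assumption: inferred from one example

def encode(text: str) -> str:
    """Encode with the reversed/rotated 52-letter alphabet from the table."""
    return "".join(
        ALPHABET[(9 - ALPHABET.index(ch)) % 52] if ch in ALPHABET
        else PUNCT_ENC.get(ch, ch)
        for ch in text
    )

def perl_quote(text: str) -> str:
    """Backslash-escape every non-alphanumeric character for use in a regex."""
    return "".join(ch if ch.isalnum() else "\\" + ch for ch in text)

def make_body_rule(domain: str, name: str = "LOCAL_LISTWASH_DOMAIN",
                   score: float = 1.0) -> str:
    # name and score are placeholder values; tune them for your own site.
    pattern = perl_quote(encode(domain))
    return "\n".join([
        f"body     {name}  /{pattern}/",
        f"describe {name}  Encoded form of {domain} found in body",
        f"score    {name}  {score}",
    ])

rule = make_body_rule("evi-inc.com")
print(rule)
```

The pattern is left case-sensitive on purpose, since the encoding distinguishes upper-case from lower-case letters.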
Comment 1 Theo Van Dinter 2006-12-05 11:59:32 UTC
I don't think we're really going to be able to handle generic substitution
cipher detection as a spam rule. :|