Bug 1727 - RFE: plugin for autowhitelisting of known PGP keys
Status: RESOLVED WORKSFORME
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: spamassassin
Version: SVN Trunk (Latest Devel Version)
Hardware: Other other
Importance: P5 enhancement
Target Milestone: 3.0.0
Assignee: Brett A. Thomas
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-04-01 11:56 UTC by Brett A. Thomas
Modified: 2004-03-11 06:04 UTC



Attachment | Type | Status | Submitter/CLA Status
Patch for CVS-current. Implementation dependent on gpg. | patch | None | oysteigi@stud.ntnu.no [NoCLA]

Description Brett A. Thomas 2003-04-01 11:56:38 UTC
Since bug 1680 (http://bugzilla.spamassassin.org/show_bug.cgi?id=1680) is
removing the nice rule of PGP_SIGNATURE as abusable, it occurred to me that a
neat config option for PGP users would be to whitelist any message which is
signed by a key on a specified keyring (or the user's default keyring if we're
running as spamassassin and not spamc).

I can look at this when I get some time, or someone else can grab it if they
like.
Comment 1 Theo Van Dinter 2003-05-24 12:53:49 UTC
this would be a potential DoS by spammers, they could simply fake sign their
mails, and cause recipients a bunch of cpu time trying to validate.  but I'll be
happy to take a look at whatever you come up with. :)
Comment 2 Brett A. Thomas 2003-07-05 10:43:51 UTC
Sorry for the lack of updates, a startup and knee surgery don't leave a lot of
time for SA. :)

I understand your DOS concern.  My thought on it initially is to have this check
off by default so that only people who actually use PGP will be running it at all.

Realistically, for unknown keys, I suspect concerns about performance may not
hold up in the real world.  On my PIII-800 box, for perspective, "time
spamassassin -L < message" on an arbitrary 4k message takes 3.8 seconds.  "time
gpg --verify < known_key.asc" takes  .151 seconds, and "time gpg --verify <
unknown_key.asc" takes .009 seconds.  Under my proposed system, verification of
an unknown key would provide no positive benefit.  The idea is, if you've taken
the trouble to get someone's key on your keyring and you receive signed mail
from them, it's probably a real message.  So, if you enable this feature, a
signature from an unknown key is going to degrade SA's performance by something
like .2% if these numbers are correct (of course, it's a small sample set).
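
As a quick check of the arithmetic above, here is a minimal sketch that uses only the three timings quoted in this comment. The function and constant names are made up for illustration, and the figures come from a single machine and a tiny sample, so the percentages are rough at best.

```python
# Timings quoted in the comment (seconds, one PIII-800 box, tiny sample).
SA_SECONDS = 3.8             # time spamassassin -L < message
GPG_KNOWN_SECONDS = 0.151    # time gpg --verify < known_key.asc
GPG_UNKNOWN_SECONDS = 0.009  # time gpg --verify < unknown_key.asc


def overhead_percent(extra_seconds: float, base_seconds: float = SA_SECONDS) -> float:
    """Extra time expressed as a percentage of one SpamAssassin run."""
    return 100.0 * extra_seconds / base_seconds


unknown_pct = overhead_percent(GPG_UNKNOWN_SECONDS)
known_pct = overhead_percent(GPG_KNOWN_SECONDS)
print(f"unknown key: {unknown_pct:.2f}% of an SA run")
print(f"known key:   {known_pct:.2f}% of an SA run")
```

On these numbers the unknown-key case costs roughly a quarter of a percent, in line with the ".2%" estimate, while a signature from a known key costs about 4% of a run.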

Also, it's worth noting, imagining a world in which enough people use this
feature that it becomes worth a spammer's trouble to try to DOS the system with
unknown keys -- generating a key is a LOT more CPU intensive than verifying a
signature.  This means they'd necessarily need to have a limited stable of keys.
 I'm not sure exactly how the pipeline in SA works, but it seems conceivable to
me that if we could detect a known spammer's signature in .15 seconds and drop
execution at that point, it'd be an overall performance win.

Well, I need to go look into actually implementing this, now...
Comment 3 Justin Mason 2003-07-23 10:44:47 UTC
BTW, some relevant comments from David Corbett:

Hi Justin,

Pre-script:
I understand & respect your concerns, but the problem is not nearly as big as
you think.  Also, in the long term, I think that a gpg interface tool would be
very precious.  In my response, below, I first outline how I would solve the
problem and then why I want this so much.  (Unfortunately, I regret that the
coding is beyond my technical ability.)

On Tuesday, Jul 22, 2003, at 16:23 Canada/Eastern, Justin Mason wrote:

D> Unfortunately, the task is over my head; or I'd offer to write a call to "gpg
--verify".  SO, I guess that I must be patient; but I look forward to that call.

J> That's the issue alright -- then there's key distribution :(

NO -- let gpg do all this work.

Set this up as a module dependent on gpg.  For those who enable that code ... 
SA can issue a shell call (gpg --verify) with the unknown message on stdin.  Gpg
will return the authentication status on stdout.  Encryption happens before
signature, so you never need access to the user's private key.  (This does not
work if PGP is installed instead of gpg.  The proprietary version cannot be
called from a shell.)

J> For now, the spammers are just pasting in a key at random and not using the
correct format for a PGP signed message.  But if we add the rules, I have no
doubt they'll start forging it more effectively.

In order to pass this test, the spammer would need to generate & upload a real
key to the public servers.
  (a) If I permit gpg to download unknown keys automatically then gpg would
return "valid signature from invalid key," since the spammer's key would not
have been signed by anyone in my keyring that I trust.
  (b) If I do not permit automatic downloads, then verification fails & gpg
returns "invalid signature from unknown key", which most people would wish to
call spam.

Regardless, an appropriate score can be assigned to the various results returned
by gpg.

J> At that point we'd have to figure out a way to verify the keys -- then
spammers can generate a key -- then we call out to the keyservers to ensure the
key is valid -- then they register one-use keys -- then we come up with a
blacklist of "spammer keys" -- then...etc.  you see the problem.

Nope.  Not your problem.  One-use keys are not overly useful in the world of pgp
signatures because they are not signed by anybody the recipient knows.  They
will always be invalid until a manual override by the recipient changes that.

J> Hence for now we're avoiding that ;)

The reason that I like PGP as a spam detection method is that it can, in some
cases, support positive identification of legitimate mail from an unknown party.
 Let us define two people  --  Bar is known to me & trusted by me to verify his
PGP keys carefully.  Foo is totally unknown to me, but known well by Bar.

  - Unknown to me, Foo sends me an email and signs it with PGP.
  - SA detects a possible signature & calls gpg to verify it
  - gpg detects an unknown key & downloads it
  - this key has been signed by Bar, who is trusted by me; so the new key is
immediately validated
  - gpg then returns "valid signature from *valid* key"  (emphasis mine)
  - the message from Foo should now be given -100 (or whatever).

My biggest fear regarding a gpg interface module, though, would be a denial of
service attack consisting of a mass of fake PGP messages.  Encryption/decryption
is resource expensive relative to sendmail.

# Brett Thomas partially addresses this in Comment #2 to bug 1727. (above)
# However, the lookup for an unknown key would still be very expensive relative
to SA.

J> I would say though that as a local whitelist technique, it should work fine.
 It's just if we put it in the default ruleset this would happen.

But whitelisting does not allow me to detect email from an unknown party who has
legitimate "credentials."

Comment 4 Bob Menschel 2003-07-27 09:20:04 UTC
I agree that this should be a local.cf / user_prefs option, default off, since 
there's no use considering this action for sites/users that don't have PGP/GPG. 
It should be something that can be turned on in user_prefs, since even on sites 
with PGP/GPG, only a few users are likely to have a PGP/GPG keyring set up. 

Perhaps we could make this a multiple-choice option: 
ENABLE_GPG 0 -- disable GPG activity (default)
ENABLE_GPG 1 -- enable GPG activity after all other checking, only if email is 
flagged as spam (score >= required_hits). The purpose would be to identify good 
signatures and add a negative score. 
ENABLE_GPG 2 -- enable GPG activity after all other checking, only if email is 
not flagged as spam (score < required_hits). The purpose would be to identify 
bogus signatures and add a positive score.
ENABLE_GPG 3 -- enable GPG activity regardless of score. 
This set of options would give the user maximal control over the resource hit 
on their system. 

Once enabled, I see four possible test results: 
* PGPSIG_KNOWN_KEY -- valid signature, known key -- negative score
* PGPSIG_UNKNOWN_KEY -- valid signature, unknown key -- slight positive or 
negative, according to our needs.
* PGPSIG_INVALID_KNOWN -- invalid signature, but known key -- likely an email 
client problem.
* PGPSIG_INVALID_UNKNOWN -- invalid signature and unknown key -- probably worth 
a positive score.
Again each person could set their own scores, with defaults generated through 
SA's normal methods. 
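
The gating behind the four ENABLE_GPG values could be sketched roughly as follows. This is only an illustration of the proposal in this comment: the enum member names and the should_run_gpg helper are hypothetical, and nothing here reflects actual SpamAssassin code.

```python
from enum import IntEnum


class EnableGpg(IntEnum):
    """The four proposed ENABLE_GPG values (member names are hypothetical)."""
    OFF = 0           # never call gpg (proposed default)
    ONLY_IF_SPAM = 1  # look for good signatures that could rescue a message
    ONLY_IF_HAM = 2   # look for bogus signatures that could condemn a message
    ALWAYS = 3        # run gpg regardless of the score so far


def should_run_gpg(mode: EnableGpg, score: float, required_hits: float) -> bool:
    """Decide whether to spend CPU on gpg after all other tests have run."""
    flagged_as_spam = score >= required_hits
    if mode == EnableGpg.OFF:
        return False
    if mode == EnableGpg.ONLY_IF_SPAM:
        return flagged_as_spam
    if mode == EnableGpg.ONLY_IF_HAM:
        return not flagged_as_spam
    return True  # EnableGpg.ALWAYS
```

With required_hits at 5.0, a message scoring 7.2 would trigger gpg under modes 1 and 3 only, and a message scoring 1.5 under modes 2 and 3 only.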

I expect we'd need volunteers to send in PGP-signed ham and PGP-"signed" spam 
for the public corpus so the defaults could be determined. If/when you're ready 
for this I can scan my personal corpus for samples for this. (To avoid the 
PGP/GPG overhead during default score calculations, maybe we should have some 
kind of header flag indicating what the results should be?)


 

Another possible option: If the PGP/GPG tests are enabled by the above option, 
then 
Comment 5 Bob Menschel 2003-07-27 09:24:10 UTC
Sorry for that cut-off.  Another option to consider, given the resources 
required for unknown key retrieval from PGP servers -- that option could also 
be turned on/off by the user (default off). For most purposes, "not on my 
keyring" is good enough, and "retrieve key from public server to see if it is 
signed by anyone I trust" will turn up a "yes" response rarely enough for most 
of us to not want to bother with it. 

Comment 6 oysteigi 2003-11-06 12:18:54 UTC
The outlined strategy could work for PGP signed messages, but not for encrypted
and signed messages. As far as I know, encryption is applied after signing, and
one will need to successfully decrypt the message before even attempting to
verify it. Decryption is an interactive task so I guess this is a limitation.

Support for different PGP verification clients is also an issue. I guess gpg is
priority one. Should the gpg-command be specified in a traditional SA-way with
hard-coded parameters like
    gpg_path /usr/bin/gpg
or mutt-style like
    gpg_command /usr/bin/gpg --no-verbose --quiet --batch --output
? The path itself can, of course, be dropped in both cases.
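
To make the difference between the two styles concrete, here is a rough sketch of how each setting might be turned into an argument list. Both function names are hypothetical. The "--verify" flag appended in the first function is an assumption about what a plugin would add, and "--output" is left out there because it needs an argument that the comment does not give.

```python
import shlex
from typing import List


def argv_from_gpg_path(gpg_path: str) -> List[str]:
    """'gpg_path' style: the plugin hard-codes the options itself."""
    # Options taken from the comment; "--verify" is an assumed addition.
    return [gpg_path, "--no-verbose", "--quiet", "--batch", "--verify"]


def argv_from_gpg_command(gpg_command: str) -> List[str]:
    """'gpg_command' style: the user supplies the whole command line."""
    # shlex.split honours quoting, so the result can be run without a shell.
    return shlex.split(gpg_command)


print(argv_from_gpg_path("/usr/bin/gpg"))
print(argv_from_gpg_command("/usr/bin/gpg --no-verbose --quiet --batch --output"))
```

The first style keeps the option list under the plugin's control. The second lets users adapt the call to their own setup, at the cost of having to validate whatever they type.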

The parsing of the message into client readable parts will be a major part of
the job here. Regarding this, is conformance with RFC2015 satisfactory?
Comment 7 oysteigi 2003-11-11 11:05:09 UTC
Created attachment 1549
Patch for CVS-current. Implementation dependent on gpg.

I made an attempt at a naive, testing-only implementation of this, just to
see how usable it is. These are my first few lines of Perl code ever, and I've
never used cvs diff or patch before, so please be patient with me if things
aren't working as they should.

The code is dependent on gpg and /dev/null, so I guess you'll have to run on
some kind of *NIX box. I've tested it on linux-i686 and sun-solaris.
To enable the test, insert the following into user_prefs:

body PGP_VERIFIED_SIG		eval:check_pgp_verified_sig()
describe PGP_VERIFIED_SIG	Message's PGP signature verified
score PGP_VERIFIED_SIG		1

I hoped to be able to distinguish between syntactically incorrect signatures,
unknown keys, non-matching signatures and correct signatures, but that seemed
difficult without scanning the output from gpg.
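
One way to scan gpg's output is its machine-readable status stream (for example via "--status-fd"), where each line starts with "[GNUPG:] " followed by a keyword such as GOODSIG, BADSIG or NO_PUBKEY. The sketch below is not what the attached patch does. It only classifies status lines that have already been captured; the function name and the four outcome labels are made up here, and the fallback bucket is deliberately coarse (it also absorbs cases such as expired keys).

```python
from typing import Iterable

STATUS_PREFIX = "[GNUPG:] "


def classify_gpg_status(status_lines: Iterable[str]) -> str:
    """Map captured gpg status lines to one of four coarse outcomes."""
    keywords = set()
    for line in status_lines:
        if line.startswith(STATUS_PREFIX):
            fields = line[len(STATUS_PREFIX):].split()
            if fields:
                keywords.add(fields[0])
    if "BADSIG" in keywords:
        return "non_matching_signature"
    if "NO_PUBKEY" in keywords:
        return "unknown_key"
    if "GOODSIG" in keywords:
        return "correct_signature"
    # No usable signature data, malformed input, or anything not handled above.
    return "no_usable_signature"
```

Each outcome could then back its own rule with its own score, rather than the single pass/fail result of PGP_VERIFIED_SIG.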

Using Crypt::OpenPGP (http://www.stupidfool.org/perl/openpgp/) is a more ideal
solution anyway, I think. As far as I can see, it only distinguishes between correct
and wrong signatures.
Comment 8 Justin Mason 2003-12-02 16:10:40 UTC
also possible -- using Crypt::OpenPGP:
http://www.sixapart.com/log/2002/12/verifying_pgp_s.shtml

However, installing that module is turning out to be hell on earth.
It has about 50 prerequisites!  So running "gpg" as a helper
app is probably a better idea.
Comment 9 Bob Menschel 2003-12-02 18:09:03 UTC
Data for testing if appropriate/useful -- I received spam today that appeared 
to be signed with GNUPG. Below are headers, then PGP evaluation of the 
signature, then body: 

Return-path: <xeperorg@server14.arteryserver14.net>
Envelope-to: rmenschel_sa@xeper.org
Delivery-date: Tue, 02 Dec 2003 14:35:21 +0000
Received: from xeperorg by server14.arteryserver14.net with local-bsmtp (Exim 
4.24)
        id 1ARBcP-00019e-5i
        for rmenschel_sa@xeper.org; Tue, 02 Dec 2003 14:35:21 +0000
Received: from xeperorg by server14.arteryserver14.net with local (Exim 4.24)
        id 1ARBcP-00019K-2t
        for rmenschel_sa@xeper.org; Tue, 02 Dec 2003 14:35:13 +0000
Received: from xeperorg by server14.arteryserver14.net with local-bsmtp (Exim 
4.24)
        id 1ARBZn-00011v-8g
        for rmenschel@xeper.org; Tue, 02 Dec 2003 14:32:32 +0000
Received: from [62.77.215.52] (helo=mailcity.com)
        by server14.arteryserver14.net with smtp (Exim 4.24)
        id 1ARBZi-00010h-6g
        for balfaq.maquino@xeper.org; Tue, 02 Dec 2003 14:32:29 +0000
Message-ID: <30cf01c3b8cd$08c5c0d0$42c38534@blkhm>
Reply-To: <tinderbox6@mailcity.com>
From: <tinderbox6@mailcity.com>
To: "Miles" <balfaq.maquino@xeper.org>
Subject: she is way too good for him, t t
Date: Tue, 02 Dec 2003 05:08:45 -0700
MIME-Version: 1.0
Content-Type: text/plain;
        charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
User-Agent: AOL for Macintosh sub 147
X-SAEU: passed anti-spam tests
Resent-To: rmenschel_sa@xeper.org
Resent-Message-Id: <E1ARBcP-00019K-2t@server14.arteryserver14.net>
Resent-From: xeperorg@server14.arteryserver14.net
Resent-Date: Tue, 02 Dec 2003 14:35:13 +0000
X-Spam-Checker-Version: SpamAssassin 2.60 (1.212-2003-09-23-exp) on 
        server14.arteryserver14.net
X-Spam-Status: No, hits=1.5 required=9.0 tests=NO_REAL_NAME,PRIORITY_NO_NAME,
        RCVD_IN_SORBS autolearn=no version=2.60
X-Spam-Level: *



*** PGP SIGNATURE VERIFICATION ***
*** Status:   Bad Signature from Invalid Key
*** Alert:    Signature did not verify. Message has been altered.
*** Alert:    Please verify signer's key before trusting signature.
*** Signer:   joker.com signer service (This key is used to sign outgoing 
messages only.) <info@joker.com> (0x1144E223)
*** Signed:   11/21/2003 8:45:55 PM
*** Verified: 12/2/2003 6:04:55 PM
*** BEGIN PGP VERIFIED MESSAGE ***


Hi,
your request for notification for the ZONE (sites i need to see)of
  "l3"
has been scheduled.

You will get another notification shortly after your request has been
processed.
If you do not get any further notification during the next 24 hours,
don't hesitate to contact our support team.


She is way too fine for this ugly dude:
http://gnome30.route.antipuff.nom.br/?f=5666/dm_ff.htm


In case of any questions regarding your request please visit our 
support area
on http://hdhj3.com



Best regards,

your ca5-Team


--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
h0.com is a division of
The cheese factory
48 N Peko Lane
Arka, TX
65002

https://gbof7.net
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

*** END PGP VERIFIED MESSAGE ***



Comment 10 oysteigi 2003-12-03 03:13:19 UTC
I implemented signature verification with Benjamin Trott's Crypt::OpenPGP a
couple of weeks ago using medium-level access methods (classes KeyRing, Message,
Certificate++).

It seems to work fine except for a couple of issues. When running inside SA, it
has speed problems (~40s for one signature on my system), while it runs smoothly
in a separate script. I'll look over my code when I find time for it.

I like the idea of using an existing Perl package instead of gpg, but the large
number of dependencies is a problem. Since we never sign or encrypt anything, we
only need a few of them in theory. Is it an option to implement part of it by
ourselves so that we can reduce the number of dependencies?
Comment 11 Martin Pool 2004-01-28 20:19:34 UTC
I'd very much like to see this.

I don't think there is any need to automatically fetch GPG keys, and this is
where there is the greatest potential for a DoS.  Automatically fetching keys,
and giving them a negative score would strongly encourage spammers to inject
false keys into the keyservers.  I think it would be irresponsible of SA
developers to do that.

Generating valid signatures will be quite computationally expensive for
spammers, especially if they don't want to send identical messages to many
people and fall foul of things like Razor.

I would like to refine the levels in David's message a little bit:

- Not a valid signature.  This is what most fake-GPG spam looks like; it's not
gpg at all but just includes the keyphrase.  Strong positive score.

 - Valid signature from unknown key.  Don't try to download the key.  Zero score
change.  Perhaps you could make it slightly negative, up until spammers start
trying this.

 - Valid signature from *valid but untrusted key*.  Typical of someone fairly
far from me in the web of trust; I have their key but can't be sure it's really
them.

 - Valid signature from trusted key.  I strongly know their identity.  I can be
absolutely sure this isn't a spammer.  -100.

The last of these has the most discriminative power and might be the most useful
to add first.  Essentially it is a non-forgeable semi-automatic whitelist.  If I
can get friends to sign their mail, it would make sure their mail is never dropped.

It might be possible to check for signatures from trusted keys quite quickly,
just by determining the signing key and seeing if it's on our list before doing
any actual verification.

The cause of long delays might be (attempted) trustdb updates.  It might be good
to set the --no-auto-check-trustdb option.

Here is a possible attack scenario: spammers prepend their message to a copy of a
valid GPG-signed message (from CERT, say).  If SpamAssassin naively checked
"yes, there's a valid signature here" then those messages would pass.  So you
would need to either require the whole message to be signed, or check the From
address.

Sometimes when GPG messages go through a mailing list, extra non-signed bits get
tacked on, such as the unsubscribe footer from Mailman.  So I think SA would
need to allow for messages that are not completely signed.

But then what happens if the spammers forge their from address to be CERT,
include a valid gpg-signed message from CERT,  but put viagra ads at the top? 
How do you distinguish this from a valid gpg-signed message that went through a
Mailman list?
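
A first step towards reasoning about partially signed mail is simply finding the text that the signature does not cover. The sketch below assumes inline clearsigned messages only (not PGP/MIME) and uses a hypothetical helper name. It does not answer the question above, since a prepended advert and an appended list footer both show up as unsigned lines, but it gives a rule something to inspect, such as where the unsigned text sits and how much of it there is.

```python
from typing import List

BEGIN_SIGNED = "-----BEGIN PGP SIGNED MESSAGE-----"
END_SIGNATURE = "-----END PGP SIGNATURE-----"


def unsigned_lines(body: str) -> List[str]:
    """Return the non-blank lines that fall outside any clearsigned block."""
    outside = []
    in_signed_block = False
    for line in body.splitlines():
        stripped = line.strip()
        if not in_signed_block and stripped == BEGIN_SIGNED:
            in_signed_block = True   # signed text and signature armor follow
        elif in_signed_block and stripped == END_SIGNATURE:
            in_signed_block = False  # back to text the signature does not cover
        elif not in_signed_block and stripped:
            outside.append(line)
    return outside
```

Finding such lines says nothing about whether the signature itself verifies; that still needs a call to gpg.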
Comment 12 Justin Mason 2004-02-08 22:01:40 UTC
BTW, everything should be in place at this stage in the 2.70 tree to implement
this as a plugin. ;)
Comment 13 Justin Mason 2004-02-18 16:40:02 UTC
changing title
Comment 14 Theo Van Dinter 2004-03-11 15:04:56 UTC
ok, since we're not going to have standard code for this, I'm going to close the ticket as wfm.  if 
someone makes a plugin for it, we'll be happy to point to it from the wiki plugin repository. :)