SA Bugzilla – Bug 796
add hashcash support
Last modified: 2003-12-14 15:23:05 UTC
Tracking ticket for adding support for hashcash to SpamAssassin. For more information: http://www.camram.org/ and www.cypherspace.org/~adam/hashcash/

I'm not sure, but we might want to avoid using external software if possible (aside from Digest::SHA1). It also looks like rule/score ranges are desirable, since cash value varies (age, number of recipients, length of bit-sequence, etc.)
Tony L. Svanstrom is working on this ticket.
good luck Tony! this would be cool. refiling as enhancement ;)
Subject: Re: [SAdev] add hashcash support

On Wed, 4 Sep 2002 the voices made bugzilla-daemon@hughes-family.org write:

> ------- Additional Comments From jm@jmason.org 2002-09-04 06:29 -------
> good luck Tony! this would be cool. refiling as enhancement ;)

I'm aiming at the next major release, mostly due to not having as much time as I'd like to work on this. The major problem isn't getting the code to work though, it's checking the format of the "tags" and current implementations IRL.

/Tony
How is this progressing? I'd like to add support for this now unless someone is already working on it.
Subject: Re: [SAdev] add hashcash support

On Sun, 29 Sep 2002 the voices made bugzilla-daemon@hughes-family.org write:

> http://www.hughes-family.org/bugzilla/show_bug.cgi?id=796
> ------- Additional Comments From jas@extundo.com 2002-09-29 07:07 -------
> How is this progressing? I'd like to add support for this now unless someone is already working on it.

If you were to add it right now then it'd just be "hashcash a la SA", because there's no accepted standard, which more or less makes it useless as a part of SA. shird@dstc.edu.au is working on it though, and seems to be part of all the standards talk that's going on...

/Tony
For further information on getting HashCash into SA, take a look at the CamRam mailing list (http://www.camram.org, or at news.gmane.org : gmane.mail.spam.camram).

What is needed from the SA side of things is a way to pass the intended recipient on to the test. I've already got a rough implementation working, but as Tony said, it doesn't conform to any set standard, because there isn't really one.
Subject: Re: add hashcash support

bugzilla-daemon@hughes-family.org writes:

> For further information on getting HashCash into SA, take a look at the CamRam
> mailing list (http://www.camram.org, or at news.gmane.org :
> gmane.mail.spam.camram).

Yup. Gnus supports these headers, and I have been sending them for a while.

> What is needed from the SA side of things, is a way to pass the
> intended recipient onto the test.

You mean the SMTP envelope destination? I'm not sure this will work. From what I hear, the way this should work on mailing lists is that you mint a coin for the mailing list itself. This has some problems, but I think they are lesser than the problems in having the mailing list server mint coins for each mailing list member (which is infeasible even for a small number of members). Using the contents of the To: header seems like a better solution to me.

> Ive already got a rough implementation working, but as Tony said, it
> doesnt conform to any set standard, because there isnt really any.

Adam Back has a tool that generates X-Hashcash: headers; is there a problem in using that format? Most of what SpamAssassin does is not formally standardized, so I don't understand why adding pragmatic support for existing headers (X-Hashcash:) that prevents spam would hurt.
The problem with using the 'To:' header is that it is easily forged. I could send out a million messages to different people, all with the same 'To:' header, and use the same token for each person.

It is ideal to use the 'To:' header, but you also need a way to verify that it is acceptable. I have done up a quick implementation attached to bug 1041, in which you are able to make a list of 'valid_to' addresses/resources. I haven't tested it, but I imagine you could have something like:

    valid_to john*@spamgourmet.com

and it would match on 'john.3.random@spamgourmet.com' (if you understand how spamgourmet works).

Another approach is to use 'automatic To: learning' (or something) as noted by Daniel in bug 1041. This seems like the ideal approach.

The idea of using the 'X-Envelope-To' header doesn't appeal to me much, because MTAs all use different headers, or none at all, which would require you to tailor SA to your MTA/MDA, and make sure it strips out any forged ones it finds. It also doesn't allow you to make use of a forwarding service such as spamgourmet, because the envelope-to would differ from the 'To:' header, which would be used for the token and may be acceptable.
BTW: The implementation I have included with bug 1041 is compatible with Adam Back's hashcash tool (albeit not fully implemented). I also used 'X-HashCash:' instead of 'X-Hashcash' (uppercase 'C'). But it needs to be mostly re-written anyway.

Also, concerning mailing lists, this also allows you to use something like:

    valid_to bugtraq_list

And bugtraq could then send out mail using tokens which are generated against this resource ('bugtraq_list'). You may also want to include the value required for that resource. A better name may be 'valid_resource'.

What you could then do is use an 'automatic valid token resource learning' technique, instead of learning valid 'To:' fields. This could only be done once hashcash becomes commonplace though.

Is anyone working on an 'automatic To: learning' type system? Is it already implemented and I don't even know about it? :P
Subject: Re: add hashcash support

shird@dstc.edu.au writes:

> The idea of using the 'X-Envelope-To' header doesn't appeal to me
> much, because MTAs all use different headers, or none at all, which
> would require you to tailor SA to your MTA/MDA, and make sure it
> strips out any forged ones it finds. It also doesn't allow you to
> make use of a forwarding service such as spamgourmet, because the
> envelope-to would differ from the 'To:' header, which would be used
> for the token and may be acceptable.

Envelope headers (such as exim's "Envelope-to:") are the best way to handle this (well, part of the best way -- see below). Everything else is a non-optimal solution entailing configuration maintenance or heuristics.

It's trivial to support the envelope header for the most commonly used MTAs (sendmail, exim, postfix, qmail, etc.). Forgery is impossible. The MTA strips out any present envelope header and adds its own. In addition, the final (top-most) Received header is always added by the last receiver and will indicate the MTA in use.

In addition (this is the other part), I neglected to mention that the final Received header also often includes the envelope header address:

    Received: from relay07.indigo.ie ([194.125.133.231])
        by proton.pathname.com with smtp (Exim 3.35 #1 (Debian))
        id 17yexF-0005Sk-00
        for <quinlan@pathname.com>; Mon, 07 Oct 2002 13:58:17 -0700

I'd probably try for that first since it's most common and easiest. Then, fall back on MTA-specific headers.

Yes, if you use a forwarding service, then you will have to rely on a heuristic or configuration setting, but that is not an argument against using the envelope header when it is present.
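The "try the final Received header first" idea can be illustrated with a small sketch. This is not SpamAssassin code: the function name `envelope_recipient` and the regular expression are assumptions made for illustration, and real Received headers vary far more than this pattern allows.

```python
import re

# Matches the optional "for <address>;" clause that many MTAs add to Received.
# The pattern is an illustrative assumption, not an exhaustive parser.
_FOR_CLAUSE = re.compile(r"\bfor\s+<?([^\s<>;]+@[^\s<>;]+)>?\s*;", re.IGNORECASE)


def envelope_recipient(received_headers):
    """Return the address in the 'for' clause of the top-most Received header.

    received_headers is a list of header values, top-most (most recent) first.
    Returns None when there is no header or no 'for' clause.
    """
    if not received_headers:
        return None
    # Collapse folded continuation lines into single spaces before matching.
    top = " ".join(received_headers[0].split())
    match = _FOR_CLAUSE.search(top)
    return match.group(1).lower() if match else None


received = [
    "from relay07.indigo.ie ([194.125.133.231]) by proton.pathname.com "
    "with smtp (Exim 3.35 #1 (Debian)) id 17yexF-0005Sk-00 "
    "for <quinlan@pathname.com>; Mon, 07 Oct 2002 13:58:17 -0700",
]
print(envelope_recipient(received))
```

Only the top-most header is inspected, since that is the one added by the final receiving MTA; anything lower down could have been supplied by the sender.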
Subject: Re: add hashcash support

bugzilla-daemon@hughes-family.org writes:

> The problem with using the 'To:' header is that it is easily forged. I
> could send out a million messages to different people, all with the
> same 'To:' header, and use the same token for each person.
>
> It is ideal to use the 'To:' header, but you also need a way to
> verify that it is acceptable. I have done up a quick implementation
> attached to bug 1041, in which you are able to make a list of
> 'valid_to' addresses/resources. I haven't tested it, but I imagine
> you could have something like:
>
> valid_to john*@spamgourmet.com

This seems like a good approach. I'll see if I can get your implementation to work...

> The idea of using the 'X-Envelope-To' header doesn't appeal to me
> much, because MTAs all use different headers, or none at all, which
> would require you to tailor SA to your MTA/MDA, and make sure it
> strips out any forged ones it finds. It also doesn't allow you to
> make use of a forwarding service such as spamgourmet, because the
> envelope-to would differ from the 'To:' header, which would be used
> for the token and may be acceptable.

In general, using the envelope address has similar problems to using the To: header -- I receive mail for many SMTP envelope destinations, but I only consider one or two my "real" mail address, which is what I would put in "valid_to" or "valid_resources" or whatever it is called.

But using X-Envelope-To or parsing Received: lines doesn't seem all that ugly, considering the amount of guessing SA already does. Of course, a real solution could be to add a command line parameter or a spamd protocol part that tells SA what the real envelope address is.

If the envelope address parsing works, I think SA should default to scoring up mail with hashcash cookies minted for the SMTP envelope address.
3 points:

'If the envelope address parsing works, I think SA should default to scoring up mail with hashcash cookies minted for the SMTP envelope address.'

Sounds good -- and if valid_to is set, then use that instead.

BTW I have comments in EvalTests (iirc) detailing which -To headers are added by which MTAs. I nicked it from a procmail page somewhere ;) So that should provide guidelines for other headers to look at as well as Envelope-To.

In addition, I strongly support extending the spamc and spamassassin switch-set to allow the MTA to indicate to Mail::SpamAssassin what the user's envelope address was, via a command-line switch. We have something like this now, "-u", but that only acts for the *username*; in a virtual-hosting env, jm@jmason.org could be a totally different user from jm@whatever.com. (I think there's another bug in the zilla somewhere, where someone's asked for this to be supported for virtual-configuration stuff in spamd as well.)
"and if valid_to is set, then use that instead"

valid_to, at least the way I have done it, allows you to specify several acceptable addresses, so you may want to use it additionally rather than exclusively. Although if I'm wrong below, I could see how some people may want to use it that way.

"In general, using the envelope address has similar problems as using the To: header -- I receive mail for many SMTP envelope destinations"

I think it is quite rare/impossible that you would share an envelope address with someone else (correct me if I'm wrong) - so it should be unique to you, and therefore should be ok to use in checking the token. The way I've done it checks a list of acceptable resources, so if one doesn't match, it tries the others in the list. So there shouldn't be any problem with using the envelope dest if it is available. Keep in mind that what people use for generating the token will typically be the address they put in the 'To:' (or CC: etc) header.

"We have something like this now, "-u", but that only acts for the *username*; in a virtual-hosting env, jm@jmason.org could be a totally different user from jm@whatever.com."

Because you have the -u switch, and/or spamc/d runs as the dest user, you could also look that user up in a DB (rather than using user_prefs) to check for valid addresses. (Doesn't the AWL work like this when it doesn't have access to a user's $HOME?) I haven't played with the SQL features of SA, but I will try to take a look and see if it can be extended to support this.
Subject: Re: [SAdev] add hashcash support

> ------- Additional Comments From shird@dstc.edu.au 2002-10-09 18:16 -------
> I think it is quite rare/impossible that you would share an envelope address
> with someone else (correct me if Im wrong)

Some use a "domainwide" mailbox, and then they use a fetchmail(ish) solution to get the mail and distribute it locally; I've seen all kinds of weird solutions like that, with all kinds of local headers used/not used.

/Tony
"Some use a "domainwide" mailbox, and then they use a fetchmail(ish) solution"

That sounds pretty kludgey, but given that, I assume it would be difficult to predict the envelope-to address to be able to abuse/use it - and it would be rare enough that spammers wouldn't rely on it. So it should still be ok to use. Legit users also wouldn't be able to use this address, so it highlights the need to have an additional list of valid To addresses - or some other way of determining valid addresses.

Justin wrote: "BTW I have comments in EvalTests (iirc) detailing which -To headers are added by which MTAs."

If you are going to extract the env address from the headers, you would also need a way to tell SA which headers to use - you couldn't just check for all of them because they could be forged.
"We have something like this now, "-u", but that only acts for the *username*; in a virtual-hosting env, jm@jmason.org could be a totally different user from jm@whatever.com."

I see what you're getting at here. I guess you could have a global list of acceptable domains, then try the combinations of jm@$domain to see if one matches. Theoretically this would allow you to spam all jm's at different virtual domains - but that's not much of an issue (except for maybe 'webmaster'). Also, my username may be 'shane', but my envelope addresses are different - for a lot of people it is the same though.
Is anyone progressing with this?
it sounds like there's not a lot of work going on about this. so I'm setting the target milestone to 3.0.
Subject: Re: add hashcash support

bugzilla-daemon@hughes-family.org writes:

> it sounds like there's not a lot of work going on about this. so I'm setting
> the target milestone to 3.0.

Shane Hird did some work [1] with this; wouldn't it be possible to add that to SpamAssassin? Even if it is not an Internet Standard (which I think it never will be) it will prevent spam.

[1] http://www.camram.org/mhonarc/spam/msg00550.html, http://bugzilla.spamassassin.org/show_bug.cgi?id=1041
Subject: Re: [SAdev] add hashcash support

On Fri, 3 Jan 2003 the voices made bugzilla-daemon@hughes-family.org write:

> ------- Additional Comments From jas@extundo.com 2003-01-03 15:50 -------
> Subject: Re: add hashcash support
>
> bugzilla-daemon@hughes-family.org writes:
>
> > it sounds like there's not a lot of work going on about this. so I'm setting
> > the target milestone to 3.0.
>
> Shane Hird did some work [1] with this; wouldn't it be possible to add
> that to SpamAssassin? Even if it is not an Internet Standard (which I
> think it never will be) it will prevent spam.

It's not worth it, there's no standard and not many are using it, so it's just a waste of resources...

/t
Tony -- our position (well me and Theo at least) is that if we can encourage use of an auth system (like hashcash), we should. At least it would mean our mails would be whitelisted ;)

I think we could get this in once 2.60 starts up. What I propose is this.

- we incorporate "valid_to" from bug 1041. Rename it to "hashcash_resource_accept", since it's impl-specific and I don't want to require valid To addresses for any other tests. We *do* need this for hashcash, AFAICS, since (a) some versions of sendmail do NOT leave the envelope To addr in the mail message anywhere :( and (b) consider the "spamgourmet" case, where the user wants to accept tokens for resources on other machines with (possibly) totally different names, since they know those addrs forward to the current addr.

- we also figure out the envelope To addr using code we have in EvalTests already. This just needs to be abstracted a little. Then we can use the envelope-To as the default setting for "hashcash_resource_accept".

- Shane -- what's the current situation with Hashcash usage and "de-facto standardization"? ;) Is there a "safe" protocol that will probably work?

- Also, I'll need a patch, none of this "copies of files" rubbish ;)
Note the Shane Hird code has one problem: it is using 111111111...111b as the challenge, and the hashcash library code uses 0000000...000b as the challenge. The 0000000...000b is the correct string. Adam
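To make the all-zeros challenge concrete, here is a minimal sketch of counting the leading zero bits of a SHA-1 digest. It assumes that the whole stamp string is what gets hashed, which is my reading of the version-0 format discussed in this bug rather than something confirmed here; `leading_zero_bits` and `stamp_value` are hypothetical helper names.

```python
import hashlib


def leading_zero_bits(digest: bytes) -> int:
    """Count how many of the most significant bits of digest are zero."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # A non-zero byte has bit_length() in 1..8; the remaining high bits are zero.
        bits += 8 - byte.bit_length()
        break
    return bits


def stamp_value(stamp: str) -> int:
    """Leading zero bits of SHA-1(stamp). Assumption: the full stamp string is hashed."""
    return leading_zero_bits(hashlib.sha1(stamp.encode("ascii")).digest())


# Example: an 8-bit zero byte followed by 0x0f gives 8 + 4 = 12 leading zero bits.
print(leading_zero_bits(b"\x00\x0f\xff"))
```

A checker comparing against 1111...111b would count leading one bits instead, which is why tokens minted by the hashcash library would not validate under that variant.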
moving a bunch of bugs to 2.70 milestone
I was at a conference this week (talking about Spam) and ended up talking to David Harris, who is the author of the popular Windows mail client Pegasus Mail ( http://www.pmail.com/ ). I told him about hashcash and he was interested in the idea (with various reservations), but I'm a bit dubious about recommending he use it unless it's going to be supported elsewhere.

It would appear to me that the basic system as defined by Adam Back is pretty good and has the virtue of being simple to implement and use. The big problem would seem to be at the double spend level. I'm hoping that double spends can be blocked by a combination of a local database and a hook into DCC or similar.

It would also appear to be straightforward for me (as an ISP admin) to add a little program to append this header on outgoing email from customers (up to their first 20 emails per day). In that case it might be generated with postmaster@ihug.co.nz as the "resource" rather than the From address however.

From spamassassin we could initially do something like the below to start with:

    header HASHCASH_20 eval:check_hashcash('20')
    describe HASHCASH_20 Hashcash match 20 bits or more
    score HASHCASH_20 -3.0

    header HASHCASH_LOCAL eval:hashcash_db_lookup()
    describe HASHCASH_LOCAL Hashcash already in local database
    score HASHCASH_LOCAL 6.0

What do people think? Hashcash appears to be a good idea but needs an initial push; Pegasus Mail and a simple script to generate the headers by MTAs, along with spamassassin looking for them, would seem to be a good push. An initial local collision database and a later hook into DCC or Razor will ensure that spammers won't be able to forge it wholesale without being detected at the vast majority of sites.

(a) Is it safe to recommend the:

    X-Hashcash:0:030713:simon@darkmere.gen.nz:911d15251bc1a8a5

format? (perhaps with a space at the start of the header).

(b) Do people feel a one week expire time for hashcash headers is good? This would enable MUAs/MTAs to create them in advance but ensure that databases like DCC don't have to remember them forever.
simon.lyall@ihug.co.nz wrote:

> [...]
> It would also appear to be straightforward for me (as
> an ISP admin) to add a little program to append this
> header on outgoing email from customers (up to their
> first 20 emails per day). In that case it might be
> generated with postmaster@ihug.co.nz as the
> "resource" rather than the From address however.

Minor nit: the resource should be the recipient's address. So postmaster at the sending site would in that context not be accepted by the recipient, as he's expecting an address that he's willing to accept mail for in the resource string. (He can only accept at his own addresses, or people can re-use tokens by sending the same token to multiple people.)

> Is it safe to recommend the:
>
> X-Hashcash:0:030713:simon@darkmere.gen.nz:911d15251bc1a8a5
>
> format? (perhaps with a space at the start of the header).

The missing space was a bug introduced in hashcash-0.26; it was fixed in 0.27, so if you get that or 0.28 it should be back to as before.

> (b) Do people feel a one week expire time for hashcash
> headers is good? This would enable MUAs/MTAs to create
> them in advance but ensure that databases
> like DCC don't have to remember them forever.

It's a tradeoff between storage and reliability. If mail happens to get delayed for longer than 1 week, it loses its hashcash scoring, as the hashcash will be considered invalid as "expired". I had suggested 28 days (that was introduced as a default in one of the more recent versions). But it's a matter of taste and how you interpret mail delivery semantics, plus a fudge factor.

I would think the storage costs of a hashcash token per received mail ought to be fairly low compared to the other things an MTA is storing, but I don't have the stats to back that up.

I'm not sure about the software framework so don't know what DCC is, but couldn't one just plug a Berkeley DB in under the hashcash tool's DB layer, or if using the perl hashcash module, use Perl's tied hashes to store in a Berkeley DB?
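The expiry tradeoff can be sketched as a date check on the stamp's second field. This assumes the `version:YYMMDD:resource:random` layout seen in the example header above, and it assumes two-digit years mean 20YY; `parse_stamp` and `is_expired` are hypothetical names, and the 28-day default simply mirrors the figure suggested in this comment.

```python
from datetime import date, timedelta


def parse_stamp(stamp):
    """Split a 'version:YYMMDD:resource:random' stamp into its fields."""
    version, yymmdd, rest = stamp.strip().split(":", 2)
    # rpartition keeps any ':' inside the resource; the random part is the last field.
    resource, _, random_part = rest.rpartition(":")
    # Assumption: two-digit years are in the 2000s.
    minted = date(2000 + int(yymmdd[0:2]), int(yymmdd[2:4]), int(yymmdd[4:6]))
    return {"version": version, "minted": minted,
            "resource": resource, "random": random_part}


def is_expired(stamp, today, max_age_days=28):
    """True when the stamp was minted more than max_age_days before today."""
    return today - parse_stamp(stamp)["minted"] > timedelta(days=max_age_days)


stamp = "0:030713:simon@darkmere.gen.nz:911d15251bc1a8a5"
print(parse_stamp(stamp)["resource"])
print(is_expired(stamp, date(2003, 8, 20)))
```

Once a stamp is past the expiry window it can also be dropped from the local double-spend store, which is what keeps that store bounded. The sketch does not handle stamps dated in the future.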
BTW, I think it might be worth getting this in, using:

- "hashcash_resource_accept" = resource to accept hash-cash tokens for. For now, that should be just an email address or list of addresses, same kind of semantics as "whitelist_to" or similar (including the globbing! I want to receive all addrs @jmason.org, for example.)

- X-Hashcash: header should always have a space after the first colon

- one-week expiry time, at the most, and no DCC server; just a local .db file, in "~/.spamassassin/hashcash_seen". The DCC server is not necessary, if we're using the hashcash_resource_accept idea above, since an incoming token will not be accepted by another server unless the resource matches.
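The globbing semantics proposed for the accept list could look roughly like this. Python's `fnmatch` is used as a stand-in for file-style globbing and may not match SpamAssassin's own pattern rules exactly; `resource_accepted` is a hypothetical name, and treating addresses case-insensitively is an assumption.

```python
from fnmatch import fnmatchcase


def resource_accepted(resource, accept_patterns):
    """True when the stamp's resource matches any glob in the accept list.

    Both sides are lower-cased first, so matching is case-insensitive.
    """
    resource = resource.lower()
    return any(fnmatchcase(resource, pattern.lower()) for pattern in accept_patterns)


print(resource_accepted("jm@jmason.org", ["*@jmason.org"]))
print(resource_accepted("jm@example.com", ["*@jmason.org"]))
```

Because a stamp is only honoured when its resource matches a local pattern, a token replayed to a different site is simply ignored there, which is the argument for not needing a shared DCC-style server.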
Justin Mason wrote:

> "hashcash_resource_accept" = resource to accept hash-cash tokens for.
> For now, that should be just an email address or list of addresses,
> same kind of semantics as "whitelist_to" or similar (including the globbing!
> I want to receive all addrs @jmason.org, for example.)

Sounds good. I'd probably configure my servers to accept any token, since with thousands of users I can't determine what domain/address the customer might be receiving email addressed to (after multiple forwards, BCCs, email lists etc, some outside my control).

> one-week expiry time, at the most, and no DCC server; just a local .db file,
> in "~/.spamassassin/hashcash_seen". The DCC server is not necessary,
> if we're using the hashcash_resource_accept idea above, since an incoming
> token will not be accepted by another server unless the resource matches.

Local db file sounds good. It'll do 90% of the job a distributed one would with 10% of the work. If people start swapping hashes later then support can be added.
update: new CPAN module:

    use Digest::Hashcash;

    $prefix = $cipher->verify($token [, param => value...])

    Checks the given token and returns true if the token has the minimum
    number of prefix bits, or false otherwise. The value returned is
    actually the number of collisions, so to find the number of collisions
    bits specify "collisions => 0".

    Any additional parameters are interpreted the same way as arguments
    to "new".

This looks very doable for 2.70 ;)
Here's a potential problem with the idea of penalizing duplicate messages using the same stamp; consider a message

    From: foo@example.com
    To: jm@example.com, spamassassin-talk@Example.com
    X-Hashcash: [valid token for jm@example.com]

where "spamassassin-talk" is a list of which "jm" is a member. "jm" will get two copies -- one via SMTP from foo's server to jm's server, and later, one via SMTP from the sa-talk server to jm's server, possibly even with modifications to the body.

If we were to simply record hashcash tokens to block replays, we'd wind up penalizing the list mail since it bears the same token.

Suggestions? (preferably suggestions that don't involve taking a hash of the mail headers and parts of the body ;)
Yes suggestion: both of the addresses should have a X-Hashcash header. So spend only one of them if there are two that you are willing to accept for. (There is one header per recipient, not one per mail; see for example the emacs client behavior). (ie if you are on the above mentioned mailing list your hashcash_resource_accept string will include that address also.) Adam
> (ie if you are on the above mentioned mailing list your > hashcash_resource_accept string will include that address also.) I don't think that'll work, unfortunately. Prior experience has shown that requiring users to manually list their mailing list subscriptions, as would be required here, will not fly -- it's just too much work to do. Also, SA's approach is to require minimal hand-configuration... We can work around by not penalizing double-spending, instead just ignoring the double-spent token. At least that way the spammer gets no benefit.
Sure, you could do that as a default for people who do not put their list subscriptions in. If I understand, you're saying: if the stamp is not for an address I receive mail as, pretend it's not there for scoring purposes. People who do fill in the full set of what they receive mail as will get better accuracy and lower risk of false positives for their mailing list traffic.

But I'm not sure why the token would be double spent. Let's say (your example, except I put in the missing token which a hashcash client would put in, one per recipient):

    From: foo@example.com
    To: jm@example.com, spamassassin-talk@Example.com
    X-Hashcash: [valid token for jm@example.com]
    X-Hashcash: [valid token for spamassassin-talk@example.com]

And jm is lazy and doesn't bother with adding his list subscriptions. That means spamassassin infers his address as jm@example.com, so

    hashcash_resource_accept = jm@example.com

spamassassin takes the first stamp and marks it as double-spent. The 2nd token is ignored.

The mail will get delivered twice. Whichever arrives first will result in the hashcash stamp for jm@example.com being considered spent. Whichever arrives 2nd will get no +ve (or -ve) scoring from the token.

If the user sets hashcash_resource_accept to:

    hashcash_resource_accept = jm@example.com,spamassassin-talk@example.com

both copies of the mail will get +ve scoring from the corresponding hashcash stamp.

If the user sets hashcash_resource_accept to:

    hashcash_resource_accept = *@example.com

again they'll both be considered valid. (Take only the 1st valid stamp and consider it double spent. Rule: 1 stamp per received copy of a mail.)

btw you might want something more mnemonic than resource -- eg hashcash_my_addresses or hashcash_valid_addresses_for_me. Whatever you think. The "resource" terminology from hashcash is generalised beyond email addresses to web pages, web servers, IP addresses etc to protect any "resource", and I've found people find it confusing as applied to email.

Adam
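The "one stamp per received copy" rule from this walkthrough can be sketched as follows. The sketch only covers stamp selection and spending; it deliberately skips bit-count and expiry checks. `stamp_resource` and `spend_one` are hypothetical names, the `fnmatch` globbing is a stand-in, and the stamp strings are made-up placeholders rather than valid tokens.

```python
from fnmatch import fnmatchcase


def stamp_resource(stamp):
    """Resource field of a 'version:date:resource:random' stamp (illustrative parser)."""
    return stamp.split(":", 2)[2].rpartition(":")[0].lower()


def spend_one(stamps, accept_patterns, spent):
    """Spend the first acceptable, unspent stamp and return it.

    Returns None when no stamp qualifies, meaning the copy gets no score
    either way. 'spent' is a set that is updated in place.
    """
    for stamp in stamps:
        resource = stamp_resource(stamp)
        if not any(fnmatchcase(resource, p.lower()) for p in accept_patterns):
            continue  # not minted for an address we accept: ignore it
        if stamp in spent:
            continue  # already spent: ignore rather than penalize
        spent.add(stamp)
        return stamp
    return None


# Placeholder stamps, one per recipient, as a hashcash-aware client would add.
stamps = [
    "0:030713:jm@example.com:aaaa",
    "0:030713:spamassassin-talk@example.com:bbbb",
]
spent = set()
print(spend_one(stamps, ["jm@example.com"], spent))  # direct copy: first stamp spent
print(spend_one(stamps, ["jm@example.com"], spent))  # list copy: nothing left to spend
```

With the accept list widened to `*@example.com`, the second copy would spend the list stamp instead, so both deliveries would be credited.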
(add me to Cc list.)
> The mail will get delivered twice. Whichever arrives first will result in the
> hashcash stamp for jm@example.com being considered spent. Whichever arrives
> 2nd will get no +ve (or -ve) scoring from the token.

Yes, that sounds good. The reason I raised it is because there was talk of considering a double-spend event something that should be penalized (+ve SA score). This example is a situation where a double-spend could occur without any spammer trickery involved, which would indicate that penalizing double-spends would be a bad idea.

> btw you might want something more mnemonic than resource -- eg
> hashcash_my_addresses or hashcash_valid_addresses_for_me. Whatever you think.
> The "resource" terminology from hashcash is generalised beyond email addresses
> to web pages, web servers, IP addresses etc to protect any "resource", and
> I've found people find confusing as applied to email.

Good idea -- it does seem clear that in the hashcash scheme, "resource" does map to "To address that may deliver to me". That's good news, as it's pretty simple (once the "resource" term is avoided), and will possibly be useful for other systems as well as hashcash. I think the proposed naming (from earlier in this bug's history) of this setting as "valid_to" is probably the easiest for people to understand.
>> The mail will get delivered twice. Whichever arrives first will result in the
>> hashcash stamp for jm@example.com being considered spent. Whichever arrives
>> 2nd will get no +ve (or -ve) scoring from the token.

> Yes, that sounds good. The reason I raised it, is because there was talk of
> considering a double-spend event something that should be penalized (+ve SA
> score). This example is a situation where a double-spend could occur without
> any spammer trickery involved, which would indicate that penalizing
> double-spends would be a bad idea.

It still makes sitewide hard. If a site has 100 people all subscribed to one mailing list then that single token is going to be very overspent. However, as long as the overspend penalty isn't too much greater than the positive value for the hashcash then it's as if it never existed. For mailing lists hashcash won't provide much benefit but hopefully it won't do much harm.

Suggested tests with negative scores:

    describe HASHCASH_MATCHES_VALID hashcash token matches value in valid_to
    describe HASHCASH_NOT_MATCH_VALID hashcash token doesn't match any valid_to
    describe HASHCASH_20 Hashcash match 20 bits or more
    describe HASHCASH_21 Hashcash match 21 bits or more
    describe HASHCASH_22 Hashcash match 22 bits or more
    describe HASHCASH_23 Hashcash match 23 bits or more
    describe HASHCASH_24 Hashcash match 24 bits or more
    describe HASHCASH_LOCAL_0 Hashcash not in local database
    describe HASHCASH_LOCAL_1 Hashcash token 1 match in local database
    describe HASHCASH_LOCAL_2_5 Hashcash token 2-5 matches in local database

Suggested tests with positive scores:

    describe HASHCASH_EXPIRED hashcash token has expired
    describe HASHCASH_19 Hashcash match 19 bits or less
    describe HASHCASH_LOCAL_6_10 Hashcash token 6-10 matches in local db
    describe HASHCASH_LOCAL_10_30 Hashcash token 10-30 matches in local db
    describe HASHCASH_LOCAL_30_100 Hashcash token 30-100 matches in local db
    describe HASHCASH_LOCAL_100_PLUS Hashcash token 100+ matches in local db

which should just about cover everything. As long as spammers can't use it to get net-negative scores, the odd case where a legit user has his scores cancel to zero shouldn't be a problem.
> It still makes sitewide hard. If a site has 100
> people all subscribed to one mailing list then
> that single token is going to be very overspent.

I thought there was a way to do sitewide; I had worked this out in the past as it was one of the deployment approaches discussed on various lists. If I remember, here's how it goes:

You keep a separate database for each recipient. ie so if two people joe@isp.com and fred@isp.com are both subscribed to some list foo-list@lists.com then you'll key by envelope recipient and store the one for joe in joe.db and the one for fred in fred.db, and everything is happy once more. (Note it doesn't have to be a separate database, but it has to be logically separated in this way so that it won't be considered a double spent token if it's addressed to someone else.)

Now of course this is still subject to the limitations that Justin gave, which is that neither joe nor fred is going to get any +ve (or -ve) score from seeing a mail with a stamp for foo-list unless spamassassin knows they are willing to receive mail as foo-list.

I'm not familiar with spamassassin config, so an additional question is: Is it possible with spamassassin in a multi-user environment for a user to have local config adding to the default of $USER@isp.com? That would be pretty neat because then users who care can get positive scoring from their list subscription traffic, or forwarded mail traffic (say joe also is joe@pobox.com and that is forwarded to joe@isp.com). Users who don't understand or care get the default.

Say fred also has a local hashcash aware MUA in addition, his could then also add the missed +ve scoring from foo-list back in to exempt from spamassassin -ve spamminess score or adjust the spamassassin score.
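The per-recipient separation described here can be modelled with a store keyed by envelope recipient. This is an in-memory sketch under that assumption only; a real deployment would persist each recipient's set (for example in a per-user db file), and the class name `SpentStamps` is hypothetical.

```python
class SpentStamps:
    """Spent-stamp store that is logically separated per envelope recipient."""

    def __init__(self):
        # Maps a lower-cased recipient address to the set of stamps it has seen.
        self._by_recipient = {}

    def spend(self, recipient, stamp):
        """Record stamp for recipient; return True if it was fresh for that recipient."""
        seen = self._by_recipient.setdefault(recipient.lower(), set())
        if stamp in seen:
            return False
        seen.add(stamp)
        return True


store = SpentStamps()
list_stamp = "0:030713:foo-list@lists.com:cccc"  # placeholder, not a valid token
print(store.spend("joe@isp.com", list_stamp))   # fresh for joe
print(store.spend("fred@isp.com", list_stamp))  # still fresh for fred
print(store.spend("joe@isp.com", list_stamp))   # a replay to joe is detected
```

Keying by recipient is what stops one list message delivered to 100 subscribers at the same site from looking like 99 double spends.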
OK, it's now in current CVS, and seems to be working; I didn't use Shane's code in the end as I eventually realised how simple the hashcash verification step is -- just take a SHA1 hash and count the bits! Great!

'You keep a separate database for each recipient. ie so if two people joe@isp.com and fred@isp.com are both subscribed to some list foo-list@lists.com then you'll key by envelope recipient and store the one for joe in joe.db and the one for fred in fred.db, and everything is happy once more.'

yep, that's what I've done. Each user has their own db, and I've noted in the Conf man page that sharing a db sitewide is a Bad Idea.

'Is it possible with spamassassin in a multi-user environment for a user to have local config adding to the default of $USER@isp.com?'

Yes, that'll work -- sysadmin sets up local.cf with

    hashcash_accept %u@hostname.com

and user can then have their own lines in ~/.spamassassin/user_prefs like

    hashcash_accept *@jmason.org

File-style globbing is permitted, and %u expands to the current username (where applicable).

BTW currently I have scores for values from 20 to 25 bits; anything over 25 just gets the same score, and anything under 20 gets no bonus. Do these ranges make sense, or should we be using different ranges?
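The two behaviours described in this comment, %u expansion and the 20 to 25 bit scoring range, can be sketched as below. This is not the committed SpamAssassin code: `expand_accept` and `score_bucket` are hypothetical names, and no actual score values are shown because none are given in this bug.

```python
def expand_accept(pattern, username):
    """Expand %u in a hashcash_accept pattern to the current username."""
    return pattern.replace("%u", username)


def score_bucket(bits, low=20, high=25):
    """Return the bit count used to pick a score, or None when there is no bonus.

    Values under 'low' get no bonus; values over 'high' are treated as 'high'.
    """
    if bits < low:
        return None
    return min(bits, high)


print(expand_accept("%u@hostname.com", "jm"))
print(score_bucket(19), score_bucket(22), score_bucket(40))
```

Capping at the top of the range means minting a much more expensive stamp buys nothing extra, which keeps the largest possible bonus bounded.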