Bug 4912 - RFE: Maintain a SpamAssassin corpus of messages
Status: RESOLVED WONTFIX
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: spamc/spamd
Version: SVN Trunk (Latest Devel Version)
Hardware: Other other
Importance: P5 enhancement
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 4560
Reported: 2006-05-26 10:14 UTC by Justin Mason
Modified: 2019-07-08 10:41 UTC



Description Justin Mason 2006-05-26 10:14:33 UTC
This was a suggested idea for the Google Summer of Code 2006;
I'm adding it to the bugzilla for future use, and in case anyone feels
like implementing it.

Subject ID: spamassassin-corpus
Keywords: corpora, mail, collection, perl, community
Description: Theo said: 'I'd almost rather we shift this around and make a
"SpamAssassin Corpora", have all of us focus on making that the best it can be,
and use that for mass-checks, etc.'  This could be a good possibility. 
Contributors can upload their own mail corpora to a central web app where the
mass-check occurs. The mail collections could be quickly checked for validity,
and tagged based on how much privacy the user wants for their mails (therefore
controlling further redistribution of those mails).  Related to
'spamassassin-easy-mass-check' above.
Possible Mentors: Justin Mason (jm at jmason.org), Theo Van Dinter (felicity-at-apache.org)
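
A rough sketch of how the validity check and privacy tagging described above might look. This is only an illustration under stated assumptions: the privacy levels, the `CorpusMessage` record, and the `looks_valid` and `redistributable` helpers are all hypothetical names, not part of SpamAssassin or its mass-check tooling, and Python is used here purely for brevity.

```python
from dataclasses import dataclass
from email import message_from_bytes
from enum import Enum


class Privacy(Enum):
    """Hypothetical privacy levels a contributor could choose per upload."""
    PUBLIC = "public"          # mail may be redistributed freely
    DEVELOPERS = "developers"  # mail visible to project developers only
    RESULTS_ONLY = "results"   # only mass-check results leave the server


@dataclass
class CorpusMessage:
    """One uploaded message plus the contributor's labels (assumed layout)."""
    raw: bytes
    is_spam: bool
    privacy: Privacy


def looks_valid(raw: bytes) -> bool:
    """Quick sanity check: the upload parses and carries From and Date headers."""
    msg = message_from_bytes(raw)
    return msg.get("From") is not None and msg.get("Date") is not None


def redistributable(messages):
    """Keep only the messages whose privacy tag allows further redistribution."""
    return [m for m in messages if m.privacy is Privacy.PUBLIC]
```

A real service would need a much stricter validity check (duplicates, truncated bodies, mislabelled ham or spam); the point here is only that the privacy tag travels with each message and gates what can be passed on.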
Comment 1 Justin Mason 2006-10-24 10:14:47 UTC
Note btw that the zone nowadays includes a large corpus of messages from 3-4
contributors -- it's not public per se, but it is uploaded regularly to the zone.
Mass-check results are visible as "bb-foo" at http://ruleqa.spamassassin.org/.
Comment 2 Henrik Krohns 2019-07-08 10:41:10 UTC
Closing old stale bug. I think there is no point in a centralized corpus / mass-check server these days; it's also a can of worms for privacy. Masscheckers do just fine locally.