Bug 5357

Summary: RFE: split scan process into two parts for efficient multi-recipient handling
Product: Spamassassin
Reporter: Justin Mason <jm>
Component: Libraries
Assignee: SpamAssassin Developer Mailing List <dev>
Status: NEW ---    
Severity: enhancement    
Priority: P5    
Version: SVN Trunk (Latest Devel Version)   
Target Milestone: Future   
Hardware: Other   
OS: other   
Whiteboard:

Description Justin Mason 2007-02-27 15:26:15 UTC
another from Mark Martinec:

>Split the process into two parts:
>
>- parsing and munging of mail & rules, resulting in a set of
>  findings (e.g. a list of rules being hit, perhaps somehow
>  generalized). This section can be done once per message,
>  regardless of the number of recipients to the message
>  (assuming all users use the same rules);
>
>- based on the above, score the findings, possibly
>  applying per-recipient scoring to each rule being hit;
>  This (rather inexpensive) step can be applied for each
>  recipient individually, without having to re-process
>  an entire message in multiple-recipient mail.
>
>...and adjust the API to Mail::SpamAssassin accordingly, so that
>MTA-based content filtering (e.g. amavisd-new) could take advantage
>of it, while still allowing full per-recipient customization of
>individual rule scores (including disabling some by a score of 0).
>
>Benefits depend on the site, but our stats show 1.46 recipients
>per message on average. The above change (when calling SA
>at MTA level) would bring a 46% increase in throughput for free,
>while still providing individualized rule scoring.

I like the idea -- needs more thought but the general idea
sounds workable.
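
To make the proposed split concrete, here is a rough sketch in Python. It is not the Mail::SpamAssassin API (which is Perl), and every name in it (scan_once, score_for, throughput_gain, the rule names and scores) is hypothetical, chosen only for illustration. It assumes all recipients share the same rule set and differ only in per-rule scores, as the proposal states. The last function shows where the 46% figure comes from: it holds only if the per-recipient scoring step costs close to nothing compared with the scan.

```python
from typing import Callable, FrozenSet, Iterable, Mapping

# Hypothetical rule type: a predicate over the raw message text.
Rule = Callable[[str], bool]


def scan_once(message: str, rules: Mapping[str, Rule]) -> FrozenSet[str]:
    """Expensive phase: run every rule once, return the names of rules that hit."""
    return frozenset(name for name, test in rules.items() if test(message))


def score_for(hits: Iterable[str],
              default_scores: Mapping[str, float],
              overrides: Mapping[str, float]) -> float:
    """Cheap phase: sum the scores of the hits for one recipient.

    A recipient override replaces the default score for that rule;
    an override of 0 effectively disables the rule for that recipient.
    """
    return sum(overrides.get(name, default_scores.get(name, 0.0))
               for name in hits)


def throughput_gain(recipients_per_msg: float,
                    scan_cost: float,
                    score_cost: float) -> float:
    """Cost of a full scan per recipient divided by the cost of the split scan."""
    per_recipient = recipients_per_msg * (scan_cost + score_cost)
    split = scan_cost + recipients_per_msg * score_cost
    return per_recipient / split


if __name__ == "__main__":
    # Illustrative rules and scores only; none of these are real SA rules.
    rules = {
        "DRUG_WORD": lambda m: "viagra" in m.lower(),
        "URGENT_WORD": lambda m: "act now" in m.lower(),
        "HAS_HTML": lambda m: "<html" in m.lower(),
    }
    defaults = {"DRUG_WORD": 2.5, "URGENT_WORD": 1.0, "HAS_HTML": 0.5}
    recipients = {"alice": {}, "bob": {"DRUG_WORD": 0.0}}

    hits = scan_once("Act now: cheap VIAGRA", rules)  # done once per message
    for user, overrides in recipients.items():        # done once per recipient
        print(user, score_for(hits, defaults, overrides))

    # With a scoring cost of zero the gain equals the recipient count.
    print(throughput_gain(1.46, scan_cost=1.0, score_cost=0.0))
```

A caller such as an MTA-level filter would call scan_once once per message and score_for once per recipient, comparing each total against that recipient's own threshold. If scoring has a non-negligible cost, throughput_gain returns less than the recipient count, so the real benefit would sit somewhat below 46%.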