Bug 5357

Summary:	RFE: split scan process into two parts for efficient multi-recipient handling
Product:	Spamassassin
Reporter:	Justin Mason <jm>
Component:	Libraries
Assignee:	SpamAssassin Developer Mailing List <dev>
Status:	NEW ---
Severity:	enhancement
Priority:	P5
Version:	SVN Trunk (Latest Devel Version)
Target Milestone:	Future
Hardware:	Other
OS:	other
Whiteboard:

Description Justin Mason 2007-02-27 15:26:15 UTC

Another suggestion from Mark Martinec:

>Split the process into two parts:
>
>- parsing and munging of mail & rules, resulting in a set of
>  findings (e.g. a list of rules being hit, perhaps somehow
>  generalized). This section can be done once per message,
>  regardless of the number of recipients to the message
>  (assuming all users use the same rules);
>
>- based on the above, score the findings, possibly
>  applying per-recipient scoring to each rule being hit;
>  This (rather inexpensive) step can be applied for each
>  recipient individually, without having to re-process
>  an entire message in multiple-recipient mail.
>
>...and adjust the API to Mail::SpamAssassin accordingly, so that
>MTA-based content filtering (e.g. amavisd-new) could take advantage
>of it, while still allowing full per-recipient customization of
>individual rule scores (including disabling some with a score of 0).
>
>Benefits depend on the site, but our stats show 1.46 recipients
>per message on average. The above change (when calling SA
>at MTA level) would bring a 46% increase in throughput for free,
>while still providing individualized rule scoring.
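
The 46% figure above follows if the message is currently scanned once per recipient and the per-recipient scoring step costs nothing. A rough sketch of that arithmetic, with hypothetical cost units (scan_cost and score_cost are illustrative assumptions, not measurements from any site):

```python
def throughput_gain(recipients_per_msg, scan_cost, score_cost):
    """Ratio of work per message: scan-per-recipient vs. split scan/score.

    scan_cost and score_cost are hypothetical per-call costs in arbitrary
    units; only their ratio matters.
    """
    # Today (assumed): full scan plus scoring repeated for every recipient.
    per_recipient_total = recipients_per_msg * (scan_cost + score_cost)
    # Proposed: one shared scan, then one cheap scoring pass per recipient.
    split_total = scan_cost + recipients_per_msg * score_cost
    return per_recipient_total / split_total


# Free scoring step: the gain equals the mean recipient count (1.46x).
ideal = throughput_gain(1.46, scan_cost=1.0, score_cost=0.0)

# Scoring at 5% of a scan: the gain drops somewhat below 1.46x.
realistic = throughput_gain(1.46, scan_cost=1.0, score_cost=0.05)
```

So 46% is an upper bound; the real gain shrinks as the scoring step gets more expensive relative to the scan.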

I like the idea -- needs more thought but the general idea
sounds workable.
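
For discussion, a minimal sketch of the proposed split. This is not the Mail::SpamAssassin API; every name below (Findings, RULES, DEFAULT_SCORES, collect_findings, score_for_recipient) is hypothetical, and the two toy rules only stand in for real rule evaluation:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Findings:
    """Per-message result of phase 1: names of the rules that hit."""
    hits: tuple[str, ...]


# Hypothetical shared rule set: rule name -> test on the parsed message.
RULES = {
    "SUBJ_ALL_CAPS": lambda msg: msg["subject"].isupper(),
    "BODY_HAS_VIAGRA": lambda msg: "viagra" in msg["body"].lower(),
}

# Hypothetical site-wide default scores.
DEFAULT_SCORES = {"SUBJ_ALL_CAPS": 1.5, "BODY_HAS_VIAGRA": 3.0}


def collect_findings(msg) -> Findings:
    """Phase 1 (expensive): run all rules once per message."""
    return Findings(tuple(sorted(n for n, test in RULES.items() if test(msg))))


def score_for_recipient(findings, overrides, threshold=5.0):
    """Phase 2 (cheap): apply one recipient's scores to the shared findings.

    A score of 0 in overrides effectively disables that rule for the user.
    """
    total = sum(overrides.get(n, DEFAULT_SCORES[n]) for n in findings.hits)
    return total, total >= threshold


msg = {"subject": "BUY NOW", "body": "Cheap Viagra here"}
findings = collect_findings(msg)  # done once, whatever the recipient count

recipients = {
    "alice": {"BODY_HAS_VIAGRA": 4.0},  # stricter on this rule
    "bob": {"SUBJ_ALL_CAPS": 0},        # rule disabled via score 0
}
verdicts = {who: score_for_recipient(findings, ov)
            for who, ov in recipients.items()}
```

The sketch only works under the assumption stated in the proposal, namely that all recipients share the same rules; per-user rules would have to be evaluated in the second phase or the findings could not be shared.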