SA Bugzilla – Full Text Bug Listing |
Summary: | Bayes-SQL improvements | ||
---|---|---|---|
Product: | Spamassassin | Reporter: | Thorsten Meinl <Thorsten> |
Component: | Learner | Assignee: | SpamAssassin Developer Mailing List <dev> |
Status: | NEW --- | ||
Severity: | enhancement | ||
Priority: | P5 | ||
Version: | 3.2.4 | ||
Target Milestone: | Undefined | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | |||
Attachments: | Patch for splitting the bayes_token table |
This is something best done in a new BayesStore module instead of patching the existing modules.

> This is something best done in a new BayesStore module instead of patching the
> existing modules.

Agreed; ideally it should be possible to subclass the existing Bayes plugin, or hook into it in some similar way, to reuse as much of that code as possible.
Created attachment 4410
Patch for splitting the bayes_token table

When Bayes data is stored in an SQL database, all Bayes tokens for all users currently live in one huge table. With several thousand users this becomes a bottleneck, especially for bayes_expire.

The attached patch makes it possible to split the token table into several tables. Which table holds a given user's tokens is looked up in bayes_vars, which gains an additional column, "token_table". New users are automatically assigned to a table based on the CRC32 checksum of their name. Any other hash would have worked, but CRC32 was the easiest because it yields an integer from which a table number can be derived directly.

The patch considerably lowers the load on our machine: with 10 tables instead of 1, bayes_expire now takes only about 5 hours, down from about 20.
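
To illustrate the assignment step described in this report, here is a minimal sketch in Python of mapping a user name to a token table via its CRC32 checksum. It is not the code from the attached patch: the function name `token_table_for`, the `bayes_token_` table-name prefix, and the use of a plain modulo are assumptions made for the example. The report only states that the CRC32 of the user name is used to derive a table number.

```python
import zlib


def token_table_for(username: str, num_tables: int = 10,
                    prefix: str = "bayes_token_") -> str:
    """Pick a token table for a new user from the CRC32 of the user name.

    The table-name prefix and the modulo mapping are hypothetical; the
    attached patch may derive the table number differently.
    """
    if num_tables < 1:
        raise ValueError("num_tables must be at least 1")
    # zlib.crc32 returns an unsigned 32-bit integer in Python 3.
    checksum = zlib.crc32(username.encode("utf-8"))
    return f"{prefix}{checksum % num_tables}"


if __name__ == "__main__":
    # The same name always maps to the same table, so the result can be
    # stored once (e.g. in the token_table column of bayes_vars) and
    # looked up afterwards.
    for name in ("alice@example.com", "bob@example.com"):
        print(name, "->", token_table_for(name))
```

Because the mapping depends on the number of tables, changing that number later would move existing users to different tables. Storing the assigned table per user in bayes_vars, as the report describes, avoids recomputing it and keeps existing users where their tokens already are.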