Bug 4642 - concurrency problem in the PostgreSQL specific Bayes-module
Status: NEW
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Learner
Version: 3.1.0
Hardware: Other other
Importance: P5 normal
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-10-20 19:42 UTC by Stefan Kaltenbrunner
Modified: 2016-01-26 17:50 UTC
Description Stefan Kaltenbrunner 2005-10-20 19:42:55 UTC
The PostgreSQL-specific Bayes module uses the plpgsql function put_tokens(),
which takes an array of bytea values and loops through the array, doing either
an update or an insert for each token.
However, this approach is subject to a race condition when multiple clients
(multiple spamds or sa-learns) learn mails with similar tokens: a new token
can be inserted (and committed) by another client after the function was
called but before it attempts its own insert, resulting in the following
error message in the PostgreSQL log:

ERROR:  duplicate key violates unique constraint "bayes_token_pkey"
CONTEXT:  SQL statement "INSERT INTO bayes_token (id, token, spam_count,
ham_count, atime) VALUES ( $1 ,  $2 ,  $3 ,  $4 ,  $5 )"
        PL/pgSQL function "put_tokens" line 18 at SQL statement

Beginning with PostgreSQL 8.0, PL/pgSQL supports subtransactions and exception
trapping, which could be used to solve this problem. An example is
available at:
http://developer.postgresql.org/docs/postgres/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING
(example 36-1)
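
A minimal sketch of how that error-trapping pattern might be applied to a
single token follows. It is not the actual put_tokens() code: the function
name put_token_safe and its parameter names are hypothetical, the column list
is taken from the INSERT in the log above, and it assumes that (id, token) is
the key behind "bayes_token_pkey" and that the counts passed in are deltas to
add to an existing row.

```sql
-- Hypothetical helper, for illustration only (requires PostgreSQL >= 8.0).
CREATE OR REPLACE FUNCTION put_token_safe(
    in_id         integer,
    in_token      bytea,
    in_spam_count integer,
    in_ham_count  integer,
    in_atime      integer
) RETURNS void AS $$
BEGIN
    LOOP
        -- First try to update an existing row.
        -- Assumption: (id, token) identifies a row uniquely.
        UPDATE bayes_token
           SET spam_count = spam_count + in_spam_count,
               ham_count  = ham_count  + in_ham_count,
               atime      = in_atime
         WHERE id = in_id
           AND token = in_token;
        IF FOUND THEN
            RETURN;
        END IF;

        -- No row yet: try to insert it. The inner BEGIN ... EXCEPTION block
        -- runs as a subtransaction, so a failed INSERT does not abort the
        -- caller's transaction.
        BEGIN
            INSERT INTO bayes_token (id, token, spam_count, ham_count, atime)
            VALUES (in_id, in_token, in_spam_count, in_ham_count, in_atime);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- Another session inserted the same token concurrently;
            -- fall through and retry the UPDATE on the next iteration.
            NULL;
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

The loop is needed because the row could in principle be deleted again
between the failed INSERT and the retried UPDATE. Note that each exception
block costs a subtransaction, which may matter when learning many tokens.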
Comment 1 Mark Martinec 2009-10-15 11:15:58 UTC
> However, this approach is subject to a race condition when multiple clients
> (multiple spamds or sa-learns) learn mails with similar tokens: a new token
> can be inserted (and committed) by another client after the function was
> called but before it attempts its own insert, resulting in the following
> error message in the PostgreSQL log:

Btw, the same problem occurs in the SQL-based AWL backend: process A does a
SELECT and does not find any entries; meanwhile process B does the same and
INSERTs its record. When process A later tries to do its own INSERT (instead
of an UPDATE), the SQL operation fails due to a key constraint.
Comment 2 Daniel J. Luke 2016-01-26 17:50:12 UTC
PostgreSQL 9.5 supports upsert (INSERT ... ON CONFLICT); see also bug 7218,
which includes a patch for the PostgreSQL AWL backend.
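
For the Bayes token table, an upsert could look roughly like the statement
below. This is only a sketch and not the patch from bug 7218: the literal
values are made up, the constraint name is taken from the error message in
the description, and adding the counts and keeping the newer atime are
assumptions about the desired merge behaviour.

```sql
-- Requires PostgreSQL >= 9.5. Values are illustrative only.
INSERT INTO bayes_token (id, token, spam_count, ham_count, atime)
VALUES (1, '\x0123456789'::bytea, 1, 0, 1453830612)
ON CONFLICT ON CONSTRAINT bayes_token_pkey DO UPDATE
   SET spam_count = bayes_token.spam_count + EXCLUDED.spam_count,
       ham_count  = bayes_token.ham_count  + EXCLUDED.ham_count,
       -- Assumption: keep the most recent access time.
       atime      = GREATEST(bayes_token.atime, EXCLUDED.atime);
```

Because the insert-or-update decision is made atomically by the server, two
clients learning the same new token should no longer hit the duplicate key
error, and no retry loop or subtransaction is needed.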