Bug 7096 - Turn off synchronous_commit for PostgreSQL bayes and AWL/TxRep stores.
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Learner
Version: SVN Trunk (Latest Devel Version)
Hardware: PC Linux
Importance: P2 enhancement
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
Reported: 2014-10-28 10:38 UTC by Tomasz Ostrowski
Modified: 2014-11-17 20:11 UTC



Description Tomasz Ostrowski 2014-10-28 10:38:34 UTC
Since version 8.3, PostgreSQL has had a feature that would make it much faster as a Bayes or AWL store in SpamAssassin: it can turn off synchronous commits for non-critical transactions.

This feature makes saving data to the database much faster, perhaps two orders of magnitude faster on ordinary hard drives, because a commit no longer has to wait for fsync. That would make PostgreSQL comparable in speed to NoSQL databases while remaining much safer.

This is a safe setting: there is no risk of database corruption. After a power failure, some transactions from the last 0.6 s (3 * the default wal_writer_delay of 200 ms) can be lost, but that does not matter at all for a Bayes or AWL database.

Please add something like this to sub _connect_db in lib/Mail/SpamAssassin/BayesStore/PgSQL.pm:

# synchronous_commit is only available in PostgreSQL 8.3 and later
if ( $dbh->{pg_server_version} >= 80300 ) {
  $dbh->do('SET synchronous_commit=off');
}


As a workaround, one can set "synchronous_commit=off" in postgresql.conf, but that would affect every database on the server, and it is a somewhat advanced change.
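
A narrower form of this workaround may be possible without touching postgresql.conf, since synchronous_commit can also be set per database or per role. The following is only a sketch: the database name "spamassassin" and the role name "sa_user" are hypothetical placeholders, not names taken from SpamAssassin.

```sql
-- Hypothetical names: "spamassassin" (database) and "sa_user" (role).
-- Scope the setting to the one database that holds the Bayes/AWL tables.
ALTER DATABASE spamassassin SET synchronous_commit = off;

-- Alternatively, scope it to the role SpamAssassin connects as.
ALTER ROLE sa_user SET synchronous_commit = off;
```

Either statement takes effect for new sessions only; connections that are already open keep their previous value until they reconnect.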

Is there a standard benchmark for bayes stores to run?
Comment 1 Joe Quinn 2014-11-17 20:11:41 UTC
I agree with your code and analysis of data loss potential/impact. I don't know of any benchmarks specific to SA you can run, but you could probably set up a temporary mail server and flood it with garbage as an ad-hoc test, or run any other SQL benchmark.

I don't see anything particularly dangerous about adding it, so I am committing your code.

Committed revision 1640221.