Bug 5733 - Bayes DBM tie_db_readonly doesn't clear is_locked flag
Summary: Bayes DBM tie_db_readonly doesn't clear is_locked flag
Status: NEW
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Learner (show other bugs)
Version: 3.2.1
Hardware: Other All
: P5 normal
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
Depends on:
Reported: 2007-11-28 14:52 UTC by Rob Mueller
Modified: 2007-11-28 14:52 UTC (History)
0 users

Attachment Type Modified Status Actions Submitter/CLA Status

Note You need to log in before you can comment on or make changes to this bug.
Description Rob Mueller 2007-11-28 14:52:21 UTC
We're using some custom code to call the spamassassin bayes learning 
routines rather than sa-learn, and have come across a slight bug that meant 
the first messages being learnt were always being lost.

Going through the code here's what I have & what's happening...


    $SpamAssassin = Mail::SpamAssassin->new({
      dont_copy_prefs => 1,
      config_text => <<'EOF',
use_bayes 0
bayes_file_mode 0777
bayes_learn_to_journal 1
bayes_auto_expire 1
bayes_journal_max_size 0
bayes_use_hapaxes 0
bayes_auto_learn 0



... later on when I need to learn some messages for a user ...


  # Nasty poke straight into conf object...
  $SpamAssassin->{conf}->{use_bayes} = 1;
  $SpamAssassin->{conf}->{bayes_path} = $LearnDir . "/bayes";
  $SpamAssassin->signal_user_changed({ username => $UnqName });
    caller_will_untie => 1,


... get the messages and loop over ...


    for-all-messages ... {
      my $Message = $SpamAssassin->parse(\@Lines);
      my $Learner = $SpamAssassin->learn($Message, undef, $IsSpam);



The problem occurs because Mail::SpamAssassin::Bayes::learn() has the code:

    if ($self->{main}->{learn_to_journal}) {
      # If we're going to learn to journal, we'll try going r/o first...
      # If that fails for some reason, let's try going r/w.  This happens
      # if the DB doesn't exist yet.
      $ok = $self->{store}->tie_db_readonly() ||
    } else {
      $ok = $self->{store}->tie_db_writable();

If no bayes db exists, then
Mail::SpamAssassin::BayesStore::DBM::tie_db_readonly fails, so it does
Mail::SpamAssassin::BayesStore::DBM::tie_db_writeable() which succeeds.

However when it tries to learn the second message, it calls 
Mail::SpamAssassin::BayesStore::DBM::tie_db_readonly() again. This closes 
the db for writable access, and succeeds in opening it for readonly access, 
however it doesn't clear the $self->{is_locked} flag.

This means that it thinks the db is still in writable mode, even though it's
now been tied in read_only mode.

This means that every message learnt attempts to re-tie the db (slow). It
also means that when we rebuild_learner_caches(), it thinks that the db is
in writeable mode, so when it calls
Mail::SpamAssassin::BayesStore::DBM::tie_db_writeable(), it immediately
returns because it thinks the db is writeable, but then all attempts to
write to the db are just lost (no error messages, the write to the hash just

It seems the fix is as easy as adding $self->{is_locked} = 0 somewhere in

I worked around this by forcibly opening the db in write mode on immediately 
after calling init_learner.

     caller_will_untie => 1,

+  # Need to force creation of db
+  my $Scanner = $SpamAssassin->{bayes_scanner};
+  if ($Scanner && $Scanner->{store}) {
+    $Scanner->{store}->tie_db_writable();
+    $Scanner->{store}->untie_db();
+  }
   return 1;


Hmmm, there was another problem with locks being left around as well, but I 
can't remember the exact details again now. I just know that I found a work 
around by changing my code to:


+  # Call this first to untie db to release lock
+  $SpamAssassin->finish_learner();


I'll see if I can track it down again and remind myself what it was...