SA Bugzilla – Bug 4635
Bayes fails to initialize MySQL and PgSQL database
Last modified: 2022-03-06 16:31:48 UTC
Running Bayes with a MySQL or PgSQL database seems to cause a failure where a non-existing user nevers get their database table row in bayes_vars initialized ("bayes: unable to initialize database [...]" in SpamAssassin/BayesStore/SQL.pm). This seems to be originating from SpamAssassin/Bayes.pm line 1122, which tries to establish a read-only connection, which does fail if it can't find the user from the database table bayes_vars. (This same error occurs also at least in spamd and sa-learn and supposedly everywhere else where SpamAssassin/Bayes.pm function is_scan_available() is used.) The following is a quick fix for the problem. I must stress that I do not know what other effects it has. It creates the user, if they do not exist in the database table yet. Change the line on SpamAssassin/Bayes.pm:1122 (in is_scan_available) as follows: out: return 0 unless $self->{store}->tie_db_readonly(); in: return 0 unless ( $self->{store}->tie_db_readonly() || $self->{store}- >tie_db_writable() );
I looked into this for another problem (as it turned out, completely unrelated, and my own fault), and I don't think this is the proper change. Basically, (as I understand it) is_scan_available is meant to return 0 at that point to indicate that no bayes scan is available - because there is no stored bayes data for that user. This is The Right Thing - with no bayes data, no bayes scan is possible. Even *with* your change, the subroutine will still return 0 once it gets down to the min_spam/min_ham checks further in the subroutine. When learning is enabled and SpamAssassin decides to learn from the message, it doesn't check is_scan_available. Instead, it will call tie_db_writable(), which creates the user if needed, then inserts the data. That said, I believe there *is* a problem here - if $self->_initialize_db(0) (at Mail/SpamAssassin/BayesStore/SQL.pm, line 140) fails, ->untie_db() will be called, which proceeds to call $self->{_dbh}->disconnect (the comment right above this function is entirely misleading, if not completely wrong) - but that means that if learning happens, BayesStore/SQL.pm has now connected to the database twice instead of simply reusing the same connection (or potentially much more than twice, if BayesStore is used multiple times without an ->untie_db()). That problem can, I believe, be fixed by changing the return value here from 0 to -1 (BayesStore/SQL.pm, lines 1772-1773): # Do not create an entry for this user unless we were specifically asked to return 0 unless ($create_entry_p); then, in tie_db_readonly, changing (lines 140-144) from: unless ($self->_initialize_db(0)) { dbg("bayes: unable to initialize database for ".$self->{_username}." user, aborting!"); $self->untie_db(); return 0; } to: unless (my $init = $self->_initialize_db(0)) { dbg("bayes: unable to initialize database for ".$self->{_username}." user, aborting!"); $self->untie_db(); return 0; } elsif ($init == -1) { dbg("bayes: $self->{_username} has no data in database, aborting!"); return 0; } I'm not intimately familiar with all the innards of spam assassin, so hopefully someone more experienced with it than I can tell me whether my understanding of how this code works is correct or not.
Created attachment 3195 [details] Patch to not disconnect unnecessarily Attached a patch to Mail/SpamAssassin/BayesStore/SQL.pm containing the changes in comment 1.
Subject: Re: Bayes fails to initialize MySQL and PgSQL database >------- Additional Comments From jagerman@jagerman.com 2005-10-18 19:26 ------- > >That said, I believe there *is* a problem here - if $self->_initialize_db(0) (at >Mail/SpamAssassin/BayesStore/SQL.pm, line 140) fails, ->untie_db() will be >called, which proceeds to call $self->{_dbh}->disconnect (the comment right >above this function is entirely misleading, if not completely wrong) - but that >means that if learning happens, BayesStore/SQL.pm has now connected to the >database twice instead of simply reusing the same connection (or potentially >much more than twice, if BayesStore is used multiple times without an ->untie_db()). > > > I believe you are mistaken in how the code works. That said, it's been awhile so it's not coming to me immediately. I'll have a look and see just to make sure. Michael
Created attachment 3205 [details] Updated patch to avoid double-connect (In reply to comment #3) > > I believe you are mistaken in how the code works. That said, it's been > awhile so it's not coming to me immediately. I'll have a look and see > just to make sure. > I've verified that if tie_db_readonly() is called, followed by tie_db_writable(), and the user has no bayes information yet, two seperate database connections are performed: once for _initialize_db(0), which disconnects as soon as _initialize_db(0) "fails", then again when tie_db_writeable() is called - this is easy enough to test by setting $ENV{DBI_TRACE} = 1 before SA is loaded, and looking for ->connect's in the resulting trace debugging. My previous patch was broken, however, in that it needed to call _initialize_db(1) if $self->{db_writable_p} is false just before it sets it to true in tie_db_writable().
Created attachment 3206 [details] Previous patch, minus debugging line
The latest patch works for me on 3.1.3. The first few messages getting through logged an object unassigned error (sorry, didn't scribble it down), but after the transient all is fine. Angelo.
Created attachment 3651 [details] Unofficial patch to Bayes _initialize_db+_tie_db_... As I can see from a previous patch, the problem is not resolved. This patch is my suggestion on that issue. It trys to follow a *Description* of patched methods. I tested it with 3.1.1/perl 5.8.8
Created attachment 3652 [details] Unofficial patch to Bayes _initialize_db+MySQL.pm It's the same as a SQL.pm patch, but for MySQL.pm module.
-1 on the patches. There is a reason for not creating the entries.
(In reply to comment #9) > -1 on the patches. There is a reason for not creating the entries. IMHO the only place where we arise in a problem with create an user entry is the 'clear_database()', which trash dbase if user just inserted. I think it can be resolved with a $self->{some_var} private to SQL class and set/test it when we need db_readonly from some place.
Closing ancient stale bug.