SA Bugzilla – Bug 6624
BayesStore/MySQL.pm fails to update tokens due to MySQL server bug (wrong count of rows affected)
Last modified: 2011-09-21 00:28:03 UTC
Dave Wreski reported on the users ML on 2011-06-21:

> I have an existing v3.3.2 on fedora14 (perl v5.12.3) that I'm trying to
> convert bayes to use mysql. The restore process fails after a few
> minutes due to too many errors:
> dbg: bayes: error inserting token for line: t 1 0 1308114254 4fd2b3f2f0
> dbg: bayes: _put_token: Updated an unexpected number of rows.
> bayes: encountered too many errors (20) while parsing token line,
> reverting to empty database and exiting
> mysql Ver 14.14 Distrib 5.1.56, for redhat-linux-gnu (x86_64)

Further discussion and debugging improvements revealed:

dbg: bayes: _put_token: Updated an unexpected number of rows: 3, id: 3, token: ....

As it turns out this is a MySQL server bug, or at least an undocumented change. Googling shows that others have already stumbled across it: INSERT ... ON DUPLICATE KEY UPDATE returns 3 instead of 2 as a rows-changed count. Apparently the bug has not yet been resolved (tried it with MySQL 5.5.13) and seems to be forgotten:

http://bugs.mysql.com/bug.php?id=46675
http://dev.mysql.com/doc/refman/5.5/en/mysql-affected-rows.html

The relevant piece of information from 46675 is:

> [25 Aug 2009 17:57] Paul DuBois
>
> Aside from the issue noted by Mark that part of the Connector/J doc info
> isn't getting into the manual, I think this is actually a server bug.
> Left unexplained by any of the preceding discussion is why there should
> be a server version difference (5.0 returns 2, 5.1 returns 3). I created
> a similar test program using Perl DBI, which has a mysql_client_found_rows
> flag that can be enabled or disabled at connect time, and here is what
> I find when executing the INSERT ... ON DUPLICATE KEY UPDATE statement
> and checking the rows return count.
>
> mysql_client_found_rows = 0: The second INSERT returns a row count of 2
> in all MySQL versions.
>
> mysql_client_found_rows = 1: The second INSERT returns this row count:
>
>   Before MySQL 5.1.20: 2
>   MySQL 5.1.20: undef on Mac OS X, 139775481 on Linux
>     (initialized value? garbage?)
>   MySQL 5.1.21 and up: 3
>
> Looking in the 5.1.20 changelog, I see Bug#28505 which concerns
> mysql_affected_rows() and CLIENT_FOUND_ROWS. However, this change was
> supposed to have been made in both 5.0.44 and 5.1.20, and the change
> in row count to return 3 occurs only in 5.1. (I checked 5.0.43,
> 5.0.44, 5.0.45 and all of them return 2 rows, expected.)
>
> It looks to me like something went wrong with the 5.1 fix. I don't know
> why there was a change from returning undef/139775481 to returning 3
> between 5.1.20 and 5.1.21. I don't see anything that looks like it's
> relevant in the 5.1.21 changelog.

The effect of the bug on SpamAssassin is that a token can be inserted once, but its counts can never increase afterwards, leading to terrible bayes results if the bug goes unnoticed. The conversion from db also fails, as reported by Dave.

Attached is a patch for lib/Mail/SpamAssassin/BayesStore/MySQL.pm to provide a workaround for the MySQL server bug, and improved debug logging.
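For illustration only, and not the attached patch itself (the real code is Perl in BayesStore/MySQL.pm): going by the row counts quoted above, a check on the result of INSERT ... ON DUPLICATE KEY UPDATE can be made tolerant of the 5.1.21+ behaviour by accepting 3 as well as 2 for the update case. A minimal Python sketch, where the function name `classify_upsert_rowcount` and its return strings are hypothetical:

```python
def classify_upsert_rowcount(rows_affected):
    """Interpret the row count of one INSERT ... ON DUPLICATE KEY UPDATE.

    Hypothetical helper for illustration; only the counts discussed in
    this report are handled.
    """
    if rows_affected == 1:
        # A new row was inserted.
        return "inserted"
    if rows_affected in (2, 3):
        # 2 is the expected count for an update of an existing row;
        # 3 is what MySQL 5.1.21 and up reports when
        # mysql_client_found_rows is enabled (MySQL bug 46675).
        return "updated"
    # Anything else is not explained by the cases in this report.
    raise ValueError("unexpected number of rows: %r" % (rows_affected,))
```

A caller would log and count an error only when the function raises, rather than whenever the count differs from 1 or 2.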
trunk:
  Bug 6624: BayesStore/MySQL.pm fails to update tokens due to MySQL server bug (wrong count of rows affected)
Sending lib/Mail/SpamAssassin/BayesStore/MySQL.pm
Committed revision 1138991.
Created attachment 4924
A workaround for MySQL bug, improved debugging

The patch is applicable to SA 3.3 as well as the trunk (3.4).
Btw, there is another workaround for the 'affected rows' bug, in the form of a connection flag. Changing the bayes_sql_dsn (in local.cf) from:

  bayes_sql_dsn DBI:mysql:sa3;host=127.0.0.1;port=3306

to:

  bayes_sql_dsn DBI:mysql:sa3;host=127.0.0.1;port=3306;mysql_client_found_rows=0

avoids this particular count problem (but may cause others, as mentioned in the MySQL bug report referenced above and indicated in the DBD::mysql documentation).
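For anyone scripting this change across several hosts, the DSN edit can be sketched as a small string rewrite. This is a rough Python sketch under the assumption that DSN options are separated by ';' as in the example above; the helper name `with_found_rows_disabled` is made up for illustration and is not part of SpamAssassin or DBD::mysql:

```python
def with_found_rows_disabled(dsn):
    """Return dsn with mysql_client_found_rows=0 as its last option.

    Hypothetical helper: any existing mysql_client_found_rows setting is
    dropped first so the option appears exactly once.
    """
    flag = "mysql_client_found_rows"
    # Keep every non-empty option except a previous setting of the flag.
    parts = [p for p in dsn.split(";") if p and not p.startswith(flag + "=")]
    parts.append(flag + "=0")
    return ";".join(parts)
```

Applied to the DSN value shown above (without the leading bayes_sql_dsn keyword), it yields the second form; applying it again leaves the result unchanged.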
closing, fixed for 3.4