Bug 6624 - BayesStore/MySQL.pm fails to update tokens due to MySQL server bug (wrong count of rows affected)
Status: RESOLVED FIXED
Product: Spamassassin
Classification: Unclassified
Component: Libraries
Version: 3.3.2
Hardware: All All
Importance: P2 major
Target Milestone: 3.4.0
Assigned To: SpamAssassin Developer Mailing List
Depends on:
Blocks:
Reported: 2011-06-23 17:00 UTC by Mark Martinec
Modified: 2011-09-21 00:28 UTC (History)
Attachment: A workaround for MySQL bug, improved debugging (patch), submitted by Mark Martinec [HasCLA]

Description Mark Martinec 2011-06-23 17:00:17 UTC
Dave Wreski reported on the users ML on 2011-06-21:

> I have an existing v3.3.2 on fedora14 (perl v5.12.3) that I'm trying to 
> convert bayes to use mysql. The restore process fails after a few 
> minutes due to too many errors:
>   dbg: bayes: error inserting token for line: t 1 0 1308114254 4fd2b3f2f0
>   dbg: bayes: _put_token: Updated an unexpected number of rows.
>   bayes: encountered too many errors (20) while parsing token line, 
>     reverting to empty database and exiting
> mysql  Ver 14.14 Distrib 5.1.56, for redhat-linux-gnu (x86_64)

Further discussion and debugging improvements revealed:
  dbg: bayes: _put_token: Updated an unexpected number of rows: 3,
    id: 3, token: ....

As it turns out, this is a MySQL server bug, or at least an
undocumented change: INSERT ... ON DUPLICATE KEY UPDATE returns 3
instead of 2 as the rows-changed count. Googling shows that others
have already stumbled across it. Apparently the bug has not yet been
resolved (I tried it with MySQL 5.5.13) and seems to be forgotten:

  http://bugs.mysql.com/bug.php?id=46675
  http://dev.mysql.com/doc/refman/5.5/en/mysql-affected-rows.html

The relevant piece of information from 46675 is:

> [25 Aug 2009 17:57] Paul DuBois
> 
> Aside from the issue noted by Mark that part of the Connector/J doc info
> isn't getting into the manual, I think this is actually a server bug.
> Left unexplained by any of the preceding discussion is why there should
> be a server version difference (5.0 returns 2, 5.1 returns 3). I created
> a similar test program using Perl DBI, which has a mysql_client_found_rows
> flag that can be enabled or disabled at connect time, and here is what
> I find when executing the INSERT ... ON DUPLICATE KEY UPDATE statement
> and checking the rows return count.
> 
> mysql_client_found_rows = 0: The second INSERT returns a row count of 2
> in all MySQL versions.
> 
> mysql_client_found_rows = 1: The second INSERT returns this row count:
> 
> Before MySQL 5.1.20: 2
> MySQL 5.1.20: undef on Mac OS X, 139775481 on Linux
>   (initialized value? garbage?)
> MySQL 5.1.21 and up: 3
> 
> Looking in the 5.1.20 changelog, I see Bug#28505 which concerns
> mysql_affected_rows() and CLIENT_FOUND_ROWS. However, this change was
> supposed to have been made in both 5.0.44 and 5.1.20, and the change
> in row count to return 3 occurs only in 5.1. (I checked 5.0.43,
> 5.0.44, 5.0.45 and all of them return 2 rows, expected.)
> 
> It looks to me like something went wrong with the 5.1 fix. I don't know
> why there was a change from returning undef/139775481 to returning 3
> between 5.1.20 and 5.1.21. I don't see anything that looks like it's
> relevant in the 5.1.21 changelog.


The effect of the bug on SpamAssassin is that a token can be inserted
once, but its counts can never increase afterwards, leading to
terrible bayes results if the bug goes unnoticed. The conversion
from db also fails, as reported by Dave.

Attached is a patch for lib/Mail/SpamAssassin/BayesStore/MySQL.pm to
provide a workaround for the MySQL server bug, and improved debug logging.
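
For illustration only, the kind of tolerance such a workaround needs
can be sketched as below. This is Python rather than the Perl of
MySQL.pm, and it is not the attached patch; the names
ACCEPTED_ROW_COUNTS and classify_upsert_rows are hypothetical. It
assumes the row counts discussed above: 1 for a freshly inserted row,
2 for an update of an existing row, and 3 for that same update as
reported by MySQL 5.1.21 and up when mysql_client_found_rows is enabled.

```python
# Hypothetical sketch: interpret the affected-rows count returned by
# INSERT ... ON DUPLICATE KEY UPDATE without treating the value 3
# (seen on MySQL 5.1.21 and up) as an error.
ACCEPTED_ROW_COUNTS = {
    1: "inserted",  # a new row was created
    2: "updated",   # an existing row was updated (documented value)
    3: "updated",   # the same update, as miscounted by affected servers
}


def classify_upsert_rows(rows):
    """Return 'inserted' or 'updated'; raise ValueError for any other count."""
    try:
        return ACCEPTED_ROW_COUNTS[rows]
    except KeyError:
        raise ValueError("Updated an unexpected number of rows: %r" % (rows,))
```

A caller would log the ValueError message together with the token id
and keep counting errors as before, so that only genuinely unexpected
counts (for example 0) are treated as failures.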
Comment 1 Mark Martinec 2011-06-23 17:04:03 UTC
trunk:

  Bug 6624: BayesStore/MySQL.pm fails to update tokens due to
    MySQL server bug (wrong count of rows affected)'
  Sending lib/Mail/SpamAssassin/BayesStore/MySQL.pm
Committed revision 1138991.
Comment 2 Mark Martinec 2011-06-23 17:04:26 UTC
Created attachment 4924 [details]
A workaround for MySQL bug, improved debugging

The patch is applicable to SA 3.3 as well as the trunk (3.4).
Comment 3 Mark Martinec 2011-06-23 18:08:31 UTC
Btw, there is a workaround for the 'affected rows bug' in the form of
a flag in the connection string:

Changing the bayes_sql_dsn (in local.cf) from:

bayes_sql_dsn DBI:mysql:sa3;host=127.0.0.1;port=3306

into:

bayes_sql_dsn DBI:mysql:sa3;host=127.0.0.1;port=3306;mysql_client_found_rows=0

which avoids this particular row-count bug (but may cause
other problems, as mentioned in the MySQL bug report referenced above
and indicated in the DBD::mysql documentation).
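
If the DSN is generated by a script rather than edited by hand, the
same change can be sketched as a small string helper. This is a
hypothetical Python illustration (the function name
with_found_rows_disabled is made up here), and it only assumes that
DSN attributes are separated by semicolons, as in the example above.

```python
def with_found_rows_disabled(dsn):
    """Return dsn with mysql_client_found_rows=0 set exactly once."""
    flag = "mysql_client_found_rows="
    # Drop empty segments and any existing setting of the flag,
    # then append the value that avoids the miscount.
    parts = [p for p in dsn.split(";") if p and not p.startswith(flag)]
    parts.append(flag + "0")
    return ";".join(parts)
```

Applying the helper to a DSN that already carries the flag leaves a
single mysql_client_found_rows=0 entry at the end.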
Comment 4 Mark Martinec 2011-09-21 00:28:03 UTC
closing, fixed for 3.4