SA Bugzilla – Bug 3079
BayesStoreSQL.pm loses connection while processing second message
Last modified: 2004-02-26 05:21:36 UTC
When using Postgres to store bayes data, it works correctly processing the first submitted message. However, all subsequent messages give the following error: debug: ---- MIME PARSER END ---- DBD::Pg::st execute failed: server closed the connection unexpectedly at /usr/local/lib/perl5/site_perl/5.6.1/Mail/SpamAssassin/BayesStoreSQL.pm line 996, <GEN421> line 73. debug: bayes: _get_db_version: SQL Error 2: server closed the connection unexpectedly debug: bayes: found bayes db version 0 This happens whether or not SQL is used for userprefs/awls. Tested on FreeBSD 5.1-RELEASE, Postgres 7.3.2, perl 5.6.1, DBD::Pg 1.2.1, with latest latest CVS version of 2.70 My perl is too sketchy to track down where the connection is being lost, but will happily try any suggestions.
This appears to be a problem with DBD, or how we use DBD. I tested this on a few different setups, and it happens with mysql also. It works fine with the following setup: FreeBSD 5.1-RELEASE, perl 5.6.1, MySQL 4.1.0-alpha, DBD::mysql 2.1026, DBI-1.37 Same machine using postgres does not work however: FreeBSD 5.1-RELEASE, perl 5.6.1, Postgres 7.3.2, DBD::Pg 1.2.1, DBI 1.37 Tested on another machine, and it does not work with either setup: FreeBSD 4.8-STABLE, perl 5.8.0, Postgres 7.3.3, DBD::Pg 1.22, DBI 1.37 FreeBSD 4.8-STABLE, perl 5.8.0, Mysql 4.1.0-alpha, DBD::mysql 2.9003, DBI-1.37 I downgraded DBD-mysql on this machine to 2.1026 and it started working, so if somebody can figure out what is done different between the two, that should highlight the cause. I tested the latest version of DBD::Pg (1.31) but the problem still occurs.
Subject: Re: BayesStoreSQL.pm looses connection while processing second message Can you please attach the output from perl -V Thanks Michael
Okay, I found a workaround for mysql, but still no solution for postgres. I did a little further testing with different versions of DBD::mysql. It works with a number of older versions of DBD::mysql, up to version 2.1026. It stops working with the next released version, 2.9002. According to the DBD::mysql CHANGELOG, the following relevant changes were made: 2003-06-22 Rudy Lippan <rlippan@remotelinux.com> (2.9002) * Added attribute 'mysql_auto_reconnect' that allows the auto reconnect behaviour to be toggled. : ** NOTE** The behaviour of auto reconnect has changed. If either the MOD_PERL or the GATEWAY_INTERFACE environment variable is set, auto_reconnect will default to ON; otherwise auto_reconnect will default to off. Earlier versions of this driver would always try to reconnect to the database on error; however, this is dangerous because table locks could be lost without the application knowing. * Fixed a segfault with failed reconnects that were trapped in an eval. The next tine DBD::mysql tried to reconnect, the process would segfault. * Added statistics attribute, 'mysql_dbd_stats' which returns a hash ref that contains 2 keys 'auto_reconnects_ok' and 'auto_reconnects_failed'. Sure enough, I set the env variable GATEWAY_INTERFACE=1, and it works with all versions of DBD::mysql. It does however not fix the problem with DBD::Pg, a brief look at the documentation suggests that DBD::Pg does not have an auto_reconnect capability. I guess spamd is loosing the connection after processing the first message, and is not explicitly opening it again. This then works with DBD::mysql as it auto-reconnects, but fails with DBD::Pg. I had a bash at fixing BayesStoreSQL.pm but with little success - My perl skills are quite limited. However I did compare it with ConfSourceSQl.pm and SQLBasedAddrList.pm to see what is done differently. First, there is no disconnect() call made. I tried adding it, but I couldn't figure out where it should/shouldn't be called from. sth->finish() is called in a number of places, which I imagine is meant to do the job, but I couldn't figure out exactly where that is defined. I expected to find it declared in BayesStoreSQL.pm itself, but it isn't, so my understanding of perl methods must be incomplete. I did try adding such a sub, but it had no impact. Secondly, prepare_cached() is used in place of prepare(), again, my knowledge is too limited to know what impact that has. Just for fun, I replaced them all with prepare(), but saw no difference. At this point, I'm out of steam, hopefully there's enough information here for somebody with knowledge of DBD to fix it.
Output of perl -V from first testing machine (FreeBSD 5.1-RELEASE): ttyp1> perl -V Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration: Platform: osname=freebsd, osvers=5.1-beta, archname=i386-freebsd uname='freebsd freebsd.org 5.1-beta freebsd 5.1-beta #0: fri may 23 07:41:04 pdt 2003 kris@freebsd.org:usrsrcsysmagickernelpath i386 ' config_args='-sde -Dprefix=/usr/local -Darchlib=/usr/local/lib/perl5/5.6.1/mach -Dprivlib=/usr/local/lib/perl5/5.6.1 -Dman3dir=/usr/local/lib/perl5/5.6.1/man/man3 -Dsitearch=/usr/local/lib/perl5/site_perl/5.6.1/mach -Dsitelib=/usr/local/lib/perl5/site_perl/5.6.1 -Dscriptdir=/usr/local/bin -Dcc=cc -Ui_malloc -Ui_iconv -Dccflags=-DAPPLLIB_EXP="/usr/local/lib/perl5/5.6.1/BSDPAN" -Ui_gdbm -Dusemymalloc=n' hint=recommended, useposix=true, d_sigaction=define usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef use64bitint=undef use64bitall=undef uselongdouble=undef Compiler: cc='cc', ccflags ='-DAPPLLIB_EXP="/usr/local/lib/perl5/5.6.1/BSDPAN" -fno-strict-aliasing -I/usr/local/include', optimize='-O -pipe -mcpu=pentiumpro', cppflags='-DAPPLLIB_EXP="/usr/local/lib/perl5/5.6.1/BSDPAN" -fno-strict-aliasing -I/usr/local/include' ccversion='', gccversion='3.2.2 [FreeBSD] 20030205 (release)', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=4, usemymalloc=n, prototype=define Linker and Libraries: ld='cc', ldflags ='-Wl,-E -L/usr/local/lib' libpth=/usr/lib /usr/local/lib libs=-lm -lc -lcrypt -lutil perllibs=-lm -lc -lcrypt -lutil libc=, so=so, useshrplib=false, libperl=libperl.a Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' ' cccdlflags='-DPIC -fPIC', lddlflags='-shared -L/usr/local/lib' Characteristics of this binary (from libperl): Compile-time options: USE_LARGE_FILES Built under freebsd Compiled at May 23 2003 14:53:39 @INC: /usr/local/lib/perl5/site_perl/5.6.1/mach /usr/local/lib/perl5/site_perl/5.6.1 /usr/local/lib/perl5/site_perl /usr/local/lib/perl5/5.6.1/BSDPAN /usr/local/lib/perl5/5.6.1/mach /usr/local/lib/perl5/5.6.1 . And from the second machine (FreeBSD 4.8-STABLE) ttyp2> perl -V Summary of my perl5 (revision 5.0 version 8 subversion 0) configuration: Platform: osname=freebsd, osvers=4.8-stable, archname=i386-freebsd uname='freebsd saruman.lorien.fbi.ie 4.8-stable freebsd 4.8-stable #2: tue aug 12 10:44:14 ist 2003 root@saruman.lorien.fbi.ie:usrobjusrsrcsyssaruman i386 ' config_args='-sde -Dprefix=/usr/local -Darchlib=/usr/local/lib/perl5/5.8.0/mach -Dprivlib=/usr/local/lib/perl5/5.8.0 -Dman3dir=/usr/local/lib/perl5/5.8.0/man/man3 -Dsitearch=/usr/local/lib/perl5/site_perl/5.8.0/mach -Dsitelib=/usr/local/lib/perl5/site_perl/5.8.0 -Dscriptdir=/usr/local/bin -Ui_malloc -Ui_iconv -Uinstallusrbinperl -Dcc=cc -Dccflags=-DAPPLLIB_EXP="/usr/local/lib/perl5/5.8.0/BSDPAN" -Ui_gdbm -Dusemymalloc=n -Dusethreads=n' hint=recommended, useposix=true, d_sigaction=define usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=undef use64bitall=undef uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cc', ccflags ='-DAPPLLIB_EXP="/usr/local/lib/perl5/5.8.0/BSDPAN" -DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -fno-strict-aliasing -I/usr/local/include', optimize='-O -pipe ', cppflags='-DAPPLLIB_EXP="/usr/local/lib/perl5/5.8.0/BSDPAN" -DHAS_FPSETMASK -DHAS_FLOATINGPOINT_H -fno-strict-aliasing -I/usr/local/include' ccversion='', gccversion='2.95.4 20020320 [FreeBSD]', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='cc', ldflags ='-Wl,-E -L/usr/local/lib' libpth=/usr/lib /usr/local/lib libs=-lm -lc -lcrypt -lutil perllibs=-lm -lc -lcrypt -lutil libc=, so=so, useshrplib=false, libperl=libperl.a gnulibc_version='' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' ' cccdlflags='-DPIC -fPIC', lddlflags='-shared -L/usr/local/lib' Characteristics of this binary (from libperl): Compile-time options: USE_LARGE_FILES Built under freebsd Compiled at Aug 13 2003 23:37:30 @INC: /usr/local/lib/perl5/site_perl/5.8.0/mach /usr/local/lib/perl5/site_perl/5.8.0 /usr/local/lib/perl5/site_perl /usr/local/lib/perl5/5.8.0/BSDPAN /usr/local/lib/perl5/5.8.0/mach /usr/local/lib/perl5/5.8.0 .
Subject: Re: BayesStoreSQL.pm looses connection while processing second message On Wed, Feb 25, 2004 at 08:50:16AM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote: > > At this point, I'm out of steam, hopefully there's enough information here for somebody with knowledge of DBD to fix it. > Just so you know, you're not the only person to encounter this error. I've never been able to re-produce it and we tried lots of things (ie disconnect/prepare etc etc) and the only thing that made it go away was re-install of perl, I think without threads. Possibly something in perl 5.6.1 is triggering it since I've never seen it in perl 5.8.x. I'm gonna build a box with 5.6.1 and see if I can't re-create, once I do I'm confident I can find a fix. Michael
Changed summary to fix a spelling error that is like fingernails on a chalkboard to my brain :-)
I've been able to recreate this now. It has something to do with either -x or -q or a combination of both.
Turns out there are cases where signal_user_changed is not called when spamd is run with -x. When that happens the database connection that is being used is from before the fork, which is BAD.
Should be fixed in r6887.