Bug 7191 - TxRep uses username from another user
Summary: TxRep uses username from another user
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Plugins (show other bugs)
Version: 3.4.1
Hardware: Sun Solaris
: P2 normal
Target Milestone: 3.4.3
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-05-06 11:33 UTC by James Lee
Modified: 2018-08-24 01:47 UTC (History)
10 users (show)



Attachment Type Modified Status Actions Submitter/CLA Status
patch to restore performance per comment 18 patch None gessel@blackrosetech.com [NoCLA]
Patch to avoid lock contention patch None Giovanni Bechis [HasCLA]

Note You need to log in before you can comment on or make changes to this bug.
Description James Lee 2015-05-06 11:33:25 UTC
spamd is called with the username set.  The username recorded in the txrep data is sometimes the username of the previous email checked.

The username in the bayes data is correct showing that spamd has been given the correct username.

I inspect the with:

SELECT DISTINCT username, atime FROM bayes_token JOIN bayes_vars ON bayes_token.id = bayes_vars.id WHERE atime > (SELECT MAX(atime) - 3600 FROM bayes_token) ORDER BY atime;

SELECT DISTINCT username, CAST (extract (epoch from last_hit) AS INTEGER) - 3600 AS last_hit FROM txrep WHERE last_hit > (now() - INTERVAL '1 hour') ORDER BY last_hit;

These should give the same usernames for bayes and txrep but sometimes the user is that of the email before.  The dates are slightly offset as the bayes and txrep actions are not coincident (the extra 3600 is I guess a timezone offset).  Not all checks result in bayes learn so some lines are in txrep only.


Alternatively I can check the receipt mail log and compare with the txrep data. This assumes the username given to spamd is correct, (the bayes user name recorded says it is).

SELECT DISTINCT username, last_hit FROM txrep WHERE last_hit > (now() - INTERVAL '1 hour') ORDER BY last_hit;


The emails in question have single recipients and are from different senders so no ambiguity nor overlap of usernames.
Comment 1 James Lee 2015-05-07 10:42:55 UTC
Update: I've analysed some more data and TxRep is getting the username wrong so often I have decided to override it with "user_awl_sql_override_username".

The only thing that makes this bug non critical is that the reputation is of the sender not the recipient so sharing is probably safe, at least sharing all reputations is no worse than randomly sharing.  In the main what is considered spam is based on the sender not the recipient, consider for example that all public DNS lists are shared.  Only if the rules are different per user, (mine are not), or with very different Bayes data will the sender reputation be very different per user.
Comment 2 Stef Simoens 2015-06-07 20:55:10 UTC
I think the wrong assumption is that a connection is only needed "on first run". The object created by SQLBasedAddrList->new_checker() uses the username at connection time.

Because TxRep caches this connection, a next e-mail (via amavis or via spamd) still uses this connection, and a new user is not used.

An option is to uncomment the "return 1 unless (!defined $self->{default_storage});" in TxRep.pm line 1569 (v3.4.1).
Comment 3 Joe Quinn 2015-06-08 13:43:05 UTC
It looks like the line you mention is already uncommented. Did you perhaps mean that the line should be commented?
Comment 4 Andrew 2015-12-02 01:36:58 UTC
Really serious problem, and I approve that this bug is present. 6 months passed after report and no reaction ? is the TxRep project dead ?

I have also found another important bug and found it was reported on Github/TxRep but the same result there - NOTHING. TxRep plugin has serious problems, it is almost UNUSABLE especcially with default features settings. I do not understand Why it was included to the spamassassin package and described in the official docs..
Comment 5 Kevin A. McGrail 2015-12-02 17:08:06 UTC
(In reply to Andrew from comment #4)
> Really serious problem, and I approve that this bug is present. 6 months
> passed after report and no reaction ? is the TxRep project dead ?
> 
> I have also found another important bug and found it was reported on
> Github/TxRep but the same result there - NOTHING. TxRep plugin has serious
> problems, it is almost UNUSABLE especcially with default features settings.
> I do not understand Why it was included to the spamassassin package and
> described in the official docs..

I would also like to get this fixed but please remember we are volunteers and SA is plugin based.  If you find TxRep unsuitable, the source code is there to modify & submit patches.

Plus you can safely not use the plugin.  This is not a sassy response, this is an answer to why we might include something since YMMV.

But looking at the issue, Comment 2 & 3 are important. Can you test commenting this line on your system?

Regards,
KAM
Comment 6 Andrew 2015-12-02 21:43:56 UTC
Thank you for the response, yea sure, I can try it, jsut patched by commenting this line, truncated the txrep db. Some time will be needed until the db is filled with some data to test.

By the way, I have also reported here another problem wit TxRep (https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7269) maybe you can take a look at it or know something about it..

And a little thing about correct user processing to SA: when incoming mail arrives we detect the local user by the RCPT address when calling spamc from MTA. 
And what correct to do when this is outgoing message (that is sent to MTA thourgh SASL AUTH session from our users) ?
Default scripts that described in the official manual do not process this and the user for outgoing mail will not be "OUR" user, it is still RCPT user, that is some foreign address.. so the record of outgoing mail have username and email fields identical. This is correct or not?

I have modified my script a little by providing the SASL AUTH name to it and:
postfix SA calling arg:
argv=/usr/local/bin/mail-scanner.sh ${sasl_username} ${sender} ${recipient}

and the script part:

SASL_USERNAME=$1
shift

SENDER=$1
shift

SPAMC_USER="vmail"

if [ "${SASL_USERNAME}" != "" ]; then
    SPAMC_USER=${SASL_USERNAME}
else
    SPAMC_USER=$1
fi

#... bla bla bla AV part...
#...and

${SPAMC} --u ${SPAMC_USER} -E <$INPUT >$OUTPUT
if [ "$return" = 1 ]; then
    logger -p mail.info "SpamAssassin found spam, but we going on"
elif [ "$return" != 0 ]; then
    logger -s -p mail.warning -t scanner "Temporary SpamAssassin failure $return, delivering too, cause we do not want to loose mail"
    OUTPUT=$INPUT
fi

${SENDMAIL} -oi -f $SENDER "$@" <$OUTPUT

as you see this modified script checks if that was sasl'ed session and if yes - uses the spamc user from sasl login. so after that we have outgoing mail processed by userprefs of local sender.
I think this is correct to make txrep_whitelist_out function working when the database has separate users.. I am right or... ? If yes, I think that it should be described in the manual that for correct processing of outgoing mail there should be a special case to detect the local user from sasl or sender (sasl is most accurate I think).. You can use my script as an example then.. ? and of course This is related not only for TxRep but for any part of SA where the local username is used (userprefs, bayes, etc)
Comment 7 Kevin A. McGrail 2015-12-03 00:34:49 UTC
(In reply to Andrew from comment #6)
> Thank you for the response, yea sure, I can try it, jsut patched by
> commenting this line, truncated the txrep db. Some time will be needed until
> the db is filled with some data to test.

Agreed. We are also looking to see if we can test as well.  

> By the way, I have also reported here another problem wit TxRep
> (https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7269) maybe you can take
> a look at it or know something about it..

Sorry, I don't use postfix/mailscanner nor the txrep_whitelist_out function so I am of very little help.  Better to discuss this on users.
Comment 8 Andrew 2015-12-04 00:51:15 UTC
> Sorry, I don't use postfix/mailscanner nor the txrep_whitelist_out function
> so I am of very little help.  Better to discuss this on users.
In my case postfix is an MTA that sends mail directly to spamc, mailscanner - is just a simple script that, I have rif of amavis or any other layers between mta and spamasssassin.. I thought this is the simplest and most robust way.. how are you using spamassassin ?
Comment 9 Andrew 2015-12-04 19:42:11 UTC
Well, I have examined the db and logs... and it looks like commenting that line fixes the bug with mixing usernames..
Comment 10 Joe Quinn 2015-12-15 18:27:31 UTC
It passes make testing, plus it works in your tests as well so it should be good enough to commit.

Committed revision 1720216.
Committed revision 1720217.
Comment 11 Andrew 2015-12-15 18:51:07 UTC
Thats good, thanks, what bout bugs 7269 and 7270 in TxRep that I reported previously ? I see no reaction about them.
Comment 12 Kevin A. McGrail 2015-12-15 18:54:29 UTC
(In reply to Andrew from comment #11)
> Thats good, thanks, what bout bugs 7269 and 7270 in TxRep that I reported
> previously ? I see no reaction about them.

It's a project of volunteers.  Please keep the comments on the individual bugs.
Comment 13 Andrew 2015-12-15 19:02:23 UTC
(In reply to Kevin A. McGrail from comment #12)
> (In reply to Andrew from comment #11)
> > Thats good, thanks, what bout bugs 7269 and 7270 in TxRep that I reported
> > previously ? I see no reaction about them.
> 
> It's a project of volunteers.  Please keep the comments on the individual
> bugs.

Where is it better to report those bugs related to TxRep ?
Comment 14 Kevin A. McGrail 2015-12-15 19:07:04 UTC
(In reply to Andrew from comment #13)
> (In reply to Kevin A. McGrail from comment #12)
> > (In reply to Andrew from comment #11)
> > > Thats good, thanks, what bout bugs 7269 and 7270 in TxRep that I reported
> > > previously ? I see no reaction about them.
> > 
> > It's a project of volunteers.  Please keep the comments on the individual
> > bugs.
> 
> Where is it better to report those bugs related to TxRep ?

Sorry, I'm not aware of any.  It's a good successor for AWL but needs someone to carry the flag as I haven't been able to reach Ivo in sometime.  He's in Prague.
Comment 15 Andrew 2015-12-15 20:01:38 UTC
Unfortunately just noticed the incorrect username was taken by TxRep again. So the bug still present in some cases.. I have cut the part of the logs about this:

Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based connected to DBI:mysql:sa:localhost:3306
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based using username: andrew@mydomain.com
-- this is correct username

Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based get_addr_entry: no entries found for 33e7a82a41a8d2256eddfdcfb7ebecb438e7943c@sa_generated|none
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based  3e7a82a41a8d2256eddfdcfb7ebecb438e7943c@sa_generated|none scores 0, count 0

---- please take a look:
Tue Dec 15 20:27:39 2015 [24139] warn: Use of uninitialized value $msgscore in addition (+) at /usr/lib64/perl5/vendor_perl/5.20.2/Mail/SpamAssassin/Pl
ugin/TxRep.pm line 1415, <GEN363> line 63.
Tue Dec 15 20:27:39 2015 [24139] warn: Use of uninitialized value $msgscore in subtraction (-) at /usr/lib64/perl5/vendor_perl/5.20.2/Mail/SpamAssassin
/Plugin/TxRep.pm line 1415, <GEN363> line 63.
--- Maybe these errors are the cause ?

Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_MSG_ID_U is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_MSG_ID_U_COUNT is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_MSG_ID_U_PRESCORE is now ready, value: -5.0
Tue Dec 15 20:27:39 2015 [24139] dbg: TxRep: reputation: none, count: 0, weight: 1.0, delta: 0.000, MSG_ID_U: 33e7a82a41a8d2256eddfdcfb7ebecb438e7943c@
sa_generated
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based get_addr_entry: no entries found for 33e7a82a41a8d2256eddfdcfb7ebecb438e7943c@sa_genera
ted|none
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based 33e7a82a41a8d2256eddfdcfb7ebecb438e7943c@sa_generated|none scores 0, count 0
Tue Dec 15 20:27:39 2015 [24139] warn: Use of uninitialized value $msgscore in addition (+) at /usr/lib64/perl5/vendor_perl/5.20.2/Mail/SpamAssassin/Pl
ugin/TxRep.pm line 1415, <GEN363> line 63.
Tue Dec 15 20:27:39 2015 [24139] warn: Use of uninitialized value $msgscore in subtraction (-) at /usr/lib64/perl5/vendor_perl/5.20.2/Mail/SpamAssassin
/Plugin/TxRep.pm line 1415, <GEN363> line 63.
-- these errors again.


Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_MSG_ID_G is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_MSG_ID_G_COUNT is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_MSG_ID_G_PRESCORE is now ready, value: -5.0
Tue Dec 15 20:27:39 2015 [24139] dbg: TxRep: reputation: none, count: 0, weight: 1.0, delta: 0.000, MSG_ID_G: 33e7a82a41a8d2256eddfdcfb7ebecb438e7943c@
sa_generated
Tue Dec 15 20:27:39 2015 [24139] dbg: TxRep: active, 33e7a82a41a8d2256eddfdcfb7ebecb438e7943c@sa_generated pre-score: -5.021, autolearn score: -5.02, I
P: 23.253.172.122, address: bugzilla-daemon@remotedomain.com (unsigned)
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based connected to DBI:mysql:sa:localhost:3306
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based using username: andrew@mydomain.com
-- correct username again

Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based get_addr_entry: no entries found for bugzilla-daemon@remotedomain.com|23.253
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based bugzilla-daemon@remotedomain.com|23.253 scores 0, count 0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_EMAIL_IP_U is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_EMAIL_IP_U_COUNT is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_EMAIL_IP_U_PRESCORE is now ready, value: -5.0
Tue Dec 15 20:27:39 2015 [24139] dbg: TxRep: reputation: none, count: 0, weight: 10.0, delta: 0.000, EMAIL_IP_U: bugzilla-daemon@remotedomain.com
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based add_score/insert score -5.02: anatoly@mydomain.com|bugzilla-daemon@remotedomain.com|23.253|1|-5.02
-- and here is an INCORRECT username "anatoly@" - it is another account, not related to this message..

Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based get_addr_entry: no entries found for bugzilla-daemon@remotedomain.com|23.253
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based bugzilla-daemon@remotedomain.com|23.253 scores 0, count 0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_EMAIL_IP_G is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_EMAIL_IP_G_COUNT is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_EMAIL_IP_G_PRESCORE is now ready, value: -5.0
Tue Dec 15 20:27:39 2015 [24139] dbg: TxRep: reputation: none, count: 0, weight: 10.0, delta: 0.000, EMAIL_IP_G: bugzilla-daemon@remotedomain.com
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based add_score/insert score -5.02: GLOBAL|bugzilla-daemon@remotedomain.com|23.253|1
|-5.02
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based connected to DBI:mysql:sa:localhost:3306
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based using username: andrew@mydomain.com
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based get_addr_entry: no entries found for remotedomain.com|23.253
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based remotedomain.com|23.253 scores 0, count 0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_DOMAIN_U is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_DOMAIN_U_COUNT is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_DOMAIN_U_PRESCORE is now ready, value: -5.0
Tue Dec 15 20:27:39 2015 [24139] dbg: TxRep: reputation: none, count: 0, weight: 2.0, delta: 0.000, DOMAIN_U: remotedomain.com
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based add_score/insert score -5.02: anatoly@mydomain.com|remotedomain.com|23.253|1|-5.
02
-- incorrect username again.


Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based get_addr_entry: no entries found for remotedomain.com|23.253
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based remotedomain.com|23.253 scores 0, count 0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_DOMAIN_G is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_DOMAIN_G_COUNT is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_DOMAIN_G_PRESCORE is now ready, value: -5.0
Tue Dec 15 20:27:39 2015 [24139] dbg: TxRep: reputation: none, count: 0, weight: 2.0, delta: 0.000, DOMAIN_G: remotedomain.com
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based add_score/insert score -5.02: GLOBAL|remotedomain.com|23.253|1|-5.02
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based connected to DBI:mysql:sa:localhost:3306
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based using username: andrew@mydomain.com
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based get_addr_entry: no entries found for asf-bz1-us-mid.priv.apache.org|none
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based asf-bz1-us-mid.priv.apache.org|none scores 0, count 0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_HELO_U is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_HELO_U_COUNT is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_HELO_U_PRESCORE is now ready, value: -5.0
Tue Dec 15 20:27:39 2015 [24139] dbg: TxRep: reputation: none, count: 0, weight: 0.5, delta: 0.000, HELO_U: asf-bz1-us-mid.priv.apache.org
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based add_score/insert score -5.02: anatoly@mydomain.com|asf-bz1-us-mid.priv.apache.org|none|1|
-5.02
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based get_addr_entry: no entries found for asf-bz1-us-mid.priv.apache.org|none
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based asf-bz1-us-mid.priv.apache.org|none scores 0, count 0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_HELO_G is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_HELO_G_COUNT is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_HELO_G_PRESCORE is now ready, value: -5.0
Tue Dec 15 20:27:39 2015 [24139] dbg: TxRep: reputation: none, count: 0, weight: 0.5, delta: 0.000, HELO_G: asf-bz1-us-mid.priv.apache.org
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based add_score/insert score -5.02: GLOBAL|asf-bz1-us-mid.priv.apache.org|none|1|-5.02
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based connected to DBI:mysql:sa:localhost:3306
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based using username: andrew@mydomain.com
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based get_addr_entry: no entries found for bugzilla-daemon@remotedomain.com|none
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based bugzilla-daemon@remotedomain.com|none scores 0, count 0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_EMAIL_U is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_EMAIL_U_COUNT is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_EMAIL_U_PRESCORE is now ready, value: -5.0
Tue Dec 15 20:27:39 2015 [24139] dbg: TxRep: reputation: none, count: 0, weight: 3.0, delta: 0.000, EMAIL_U: bugzilla-daemon@remotedomain.com
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based add_score/insert score -5.02: anatoly@mydomain.com|bugzilla-daemon@remotedomain.com|none|1|-5.02
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based get_addr_entry: no entries found for bugzilla-daemon@remotedomain.com|none
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based bugzilla-daemon@remotedomain.com|none scores 0, count 0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_EMAIL_G is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_EMAIL_G_COUNT is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_EMAIL_G_PRESCORE is now ready, value: -5.0
Tue Dec 15 20:27:39 2015 [24139] dbg: TxRep: reputation: none, count: 0, weight: 3.0, delta: 0.000, EMAIL_G: bugzilla-daemon@remotedomain.com
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based add_score/insert score -5.02: GLOBAL|bugzilla-daemon@remotedomain.com|none|1|-
5.02
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based connected to DBI:mysql:sa:localhost:3306
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based using username: andrew@mydomain.com
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based get_addr_entry: no entries found for 23.253.172.122|none
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based 23.253.172.122|none scores 0, count 0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_IP_U is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_IP_U_COUNT is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_IP_U_PRESCORE is now ready, value: -5.0
Tue Dec 15 20:27:39 2015 [24139] dbg: TxRep: reputation: none, count: 0, weight: 4.0, delta: 0.000, IP_U: 23.253.172.122
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based add_score/insert score -5.02: anatoly@mydomain.com|23.253.172.122|none|1|-5.02
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based get_addr_entry: no entries found for 23.253.172.122|none
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based 23.253.172.122|none scores 0, count 0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_IP_G is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_IP_G_COUNT is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_IP_G_PRESCORE is now ready, value: -5.0
Tue Dec 15 20:27:39 2015 [24139] dbg: TxRep: reputation: none, count: 0, weight: 4.0, delta: 0.000, IP_G: 23.253.172.122
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based add_score/insert score -5.02: GLOBAL|23.253.172.122|none|1|-5.02
Tue Dec 15 20:27:39 2015 [24139] dbg: TxRep: post-TxRep score: -5.021
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based connected to DBI:mysql:sa:localhost:3306
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based using username: andrew@mydomain.com
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based get_addr_entry: no entries found for 33e7a82a41a8d2256eddfdcfb7ebecb438e7943c@sa_genera
ted|none
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based 33e7a82a41a8d2256eddfdcfb7ebecb438e7943c@sa_generated|none scores 0, count 0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_MSG_ID_U is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_MSG_ID_U_COUNT is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_MSG_ID_U_PRESCORE is now ready, value: -5.0
Tue Dec 15 20:27:39 2015 [24139] dbg: TxRep: reputation: none, count: 0, weight: 1.0, delta: 0.000, MSG_ID_U: 33e7a82a41a8d2256eddfdcfb7ebecb438e7943c@
sa_generated
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based add_score/insert score -5.02: anatoly@mydomain.com|33e7a82a41a8d2256eddfdcfb7ebecb438e794
3c@sa_generated|none|1|-5.02
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based get_addr_entry: no entries found for 33e7a82a41a8d2256eddfdcfb7ebecb438e7943c@sa_genera
ted|none
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based 33e7a82a41a8d2256eddfdcfb7ebecb438e7943c@sa_generated|none scores 0, count 0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_MSG_ID_G is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_MSG_ID_G_COUNT is now ready, value: 0.0
Tue Dec 15 20:27:39 2015 [24139] dbg: check: tagrun - tag TXREP_MSG_ID_G_PRESCORE is now ready, value: -5.0
Tue Dec 15 20:27:39 2015 [24139] dbg: TxRep: reputation: none, count: 0, weight: 1.0, delta: 0.000, MSG_ID_G: 33e7a82a41a8d2256eddfdcfb7ebecb438e7943c@
sa_generated
Tue Dec 15 20:27:39 2015 [24139] dbg: auto-whitelist: sql-based add_score/insert score -5.02: GLOBAL|33e7a82a41a8d2256eddfdcfb7ebecb438e7943c@sa_genera
ted|none|1|-5.02
Comment 16 Kevin A. McGrail 2015-12-15 20:15:07 UTC
Reopening.
Comment 17 Max Kostikov 2016-08-28 21:13:26 UTC
It seems this bug is still in latest Spamassassin distributions.
> root@beta:~ # spamassassin -V
> SpamAssassin version 3.4.1
>  running on Perl version 5.24.1

See my latest cut from SQL tables dump.

> ('doctor@tiksi.ru', 'zhemchuzhina@24mos-seminar.ru', 'none', 1, 4.559, '24mos-seminar.ru'),
> ('doctor@tiksi.ru', '24mos-seminar.ru', 'none', 1, 4.559, '24mos-seminar.ru'),
> ('doctor@tiksi.ru', '89.111.180.199', 'none', 1, 4.559, ''),
> ('doctor@tiksi.ru', 'b863b74433e9249320aa354d2ed43b2001428dc3@sa_generated', 'none', 1, 4.559, '1472417846'),
> ('doctor@tiksi.ru', 'max.kostikov@gmail.com', 'none', 1, -0.818, 'gmail.com'),
> ('doctor@tiksi.ru', 'gmail.com', 'none', 1, -0.818, 'gmail.com'),
> ('doctor@tiksi.ru', 'mail-lf0-f53.google.com', 'none', 1, -0.818, 'helo'),
> ('doctor@tiksi.ru', '209.85.215.53', 'none', 1, -0.818, ''),
> ('doctor@tiksi.ru', '470c7413f25be1cb4be7aba1e5b74d7e4bc6ca9a@sa_generated', 'none', 1, -0.818, '1472417925');

Second message from max.kostikov@gmail.com was addressed to other user not to doctor@tiksi.ru
It need to be fixed asap.
Comment 18 gessel@blackrosetech.com 2017-11-07 15:16:13 UTC
TxRep.pm 1720440 has the following diff relative to FreeBSD distro included 1651114 included, as commented in response to this bug:

1632  # disabled per bug 7191
1633  #return 1 unless (!defined $self->{default_storage});

Relative to 1651114, distributed with freebsd, there are three meaningful code changes: this one, a typo correction ("SQLasedAddrList"), and fixes for #7164.  

Unfortunately, 1720440 also introduce a massive performance penalty.  I tried tracking it down by running NYTProf with 1651114, 1720440, and 1720440 with line 1633 uncommented.  

my results are (arbitrary test data set, absolute numbers meaningless)
1651114 732 seconds (3.1 sec/message) 
http://www.blackrosetech.com/insecure/nytprof-txrep-1651114/

1720440 96,451 seconds (409 sec/message)
http://www.blackrosetech.com/insecure/nytprof-txrep-1720440/

1720440-L1633m 1592 seconds (6.7 sec/message)
http://www.blackrosetech.com/insecure/nytprof-txrep-1720440-L1633m/

The following errors were reported during execution (these errors did not happen with the other versions of TxRep.pm), but the bayes database was properly updated

bayes: bayes db version 0 is not able to be used, aborting! at /usr/local/lib/perl5/site_perl/Mail/SpamAssassin/BayesStore/DBM.pm line 208.
locker: safe_unlock: lock on /var/amavis/.spamassassin/bayes.lock was lost due to expiry at /usr/local/lib/perl5/site_perl/Mail/SpamAssassin/Locker/UnixNFSSafe.pm line 219.
Learned tokens from 235 message(s) (235 message(s) examined)
locker: error accessing /var/amavis/.spamassassin/bayes.lock: No such file or directory at /usr/local/lib/perl5/site_perl/Mail/SpamAssassin/Locker/UnixNFSSafe.pm line 190.
locker: safe_unlock: lock on /var/amavis/.spamassassin/bayes.lock was lost due to expiry at /usr/local/lib/perl5/site_perl/Mail/SpamAssassin/Locker/UnixNFSSafe.pm line 219.
Comment 19 gessel@blackrosetech.com 2017-12-19 16:46:18 UTC
Created attachment 5501 [details]
patch to restore performance per comment 18

1720440 solves the "Use of uninitialized value $msgscore"..."TxRep.pm line 1414" error relative to 1651114, distributed with freeBSD.  However, it introduces a performance penalty of 132x (13,200%) slow down.  This patch results in a 61x speedup relative to 1720440, or a net penalty of 2.2x but without all the errors in 1651114.
Comment 20 gessel@blackrosetech.com 2017-12-19 16:47:22 UTC
submitted a patch to 1720440. 1720440 solves the "Use of uninitialized value $msgscore"..."TxRep.pm line 1414" error (bug #1763) relative to 1651114, distributed with freeBSD.  However, it introduces a performance penalty of 132x (13,200%) slow down.  This patch results in a 61x speedup relative to 1720440, or a net penalty of 2.2x but without all the errors in 1651114.
Comment 21 Kevin A. McGrail 2018-05-06 16:39:46 UTC
Giovanni, think you can take a look at this bug and Gessel's patch?  It's a one liner so it's trivial and the CLA shouldn't be a blocker.  The patch makes sense to me.
Comment 22 Giovanni Bechis 2018-05-08 07:04:06 UTC
(In reply to Kevin A. McGrail from comment #21)
> Giovanni, think you can take a look at this bug and Gessel's patch?  It's a
> one liner so it's trivial and the CLA shouldn't be a blocker.  The patch
> makes sense to me.

I think that uncommenting that line will reopen the original bug, see comment #2.
Comment 23 gessel@blackrosetech.com 2018-05-08 08:42:12 UTC
I posted a bit more detail of my attempt at understanding and fixing at: https://www.mail-archive.com/users@spamassassin.apache.org/msg100171.html

I'm running with this patch, and not seeing any issues, but I don't have enough users to do any authoritative testing.
Comment 24 Giovanni Bechis 2018-06-09 12:51:02 UTC
Does the patch https://bz.apache.org/SpamAssassin/attachment.cgi?id=5572&action=view (bz 7587) fixes the performance problem as well ?
Comment 25 gessel@blackrosetech.com 2018-06-09 16:50:35 UTC
Giovanni,

I'm happy to test the patch on FreeBSD and report anything I can find, but the repository doesn't show revision 1833089, the most recent is Revision 1824686.

-David
Comment 26 Giovanni Bechis 2018-06-10 20:46:08 UTC
Created attachment 5573 [details]
Patch to avoid lock contention

Patch taken from bz 7587
Comment 27 Giovanni Bechis 2018-06-10 20:47:52 UTC
(In reply to gessel@blackrosetech.com from comment #25)
> Giovanni,
> 
> I'm happy to test the patch on FreeBSD and report anything I can find, but
> the repository doesn't show revision 1833089, the most recent is Revision
> 1824686.
> 
> -David

Latest revision is r1833297, maybe your tree is out of date ?
Anyway I added the same patch that can be applied to a current src tree.
Comment 28 Kevin A. McGrail 2018-08-24 01:47:47 UTC
Please let us know if the issue remains with the current fixes in trunk and 3.4