Bug 8072 - AWL incorrectly resets message count
Summary: AWL incorrectly resets message count
Status: NEW
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Plugins (show other bugs)
Version: 3.4.6
Hardware: PC Linux
: P2 normal
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-11-09 22:37 UTC by Matija Nalis
Modified: 2022-11-10 00:39 UTC (History)
2 users (show)



Attachment Type Modified Status Actions Submitter/CLA Status

Note You need to log in before you can comment on or make changes to this bug.
Description Matija Nalis 2022-11-09 22:37:00 UTC
I've had situations when some legit messages suddenly have spam score increase of several hundreds after one email. Trying to debug, I've tracked it down to AWL module (using mysql backend).

First looking at the mail log, it seems AWL is changing minimally as expected, and then suddenly it jumps HUGELY (and than trail off slowly again):
AWL=0.016
AWL=0.016
AWL=0.016
AWL=-0.085
AWL=0.016
AWL=0.937
AWL=1.406
AWL=1.627
AWL=318.182
AWL=265.033
AWL=176.606
[...]

sifting through mysqlbinlog to find what happened I find this piece of evidence:

/*!*/;
# at 585992025
# at 585992247
#221104  9:54:16 server id 1  end_log_pos 585992247 CRC32 0x3d2066fc    Annotate_rows:
#Q> UPDATE awl SET msgcount = '5', totscore = totscore + '-0.459' WHERE username = 'amavis' AND email = 'redacted@example.com' AND signedby = 'example-com.20210112.gappssmtp.com' AND i>
#221104  9:54:16 server id 1  end_log_pos 585992317 CRC32 0x3463013b    Table_map: `spamassassin`.`awl` mapped to number 22
# at 585992317
#221104  9:54:16 server id 1  end_log_pos 585992531 CRC32 0xd7fcc32a    Update_rows: table id 22 flags: STMT_END_F
### UPDATE `spamassassin`.`awl`
### WHERE
###   @1='amavis'
###   @2='redacted@example.com'
###   @3='213.191'
###   @4=2392
###   @5=3189.25
###   @6='example-com.20210112.gappssmtp.com'
###   @7=1667550625
### SET
###   @1='amavis'
###   @2='redacted@example.com'
###   @3='213.191'
###   @4=5
###   @5=3188.79
###   @6='example-com.20210112.gappssmtp.com'
###   @7=1667552056
# Number of rows: 1
# at 585992531
#221104  9:54:16 server id 1  end_log_pos 585992562 CRC32 0xff0bfeb5    Xid = 8567075
COMMIT/*!*/;

So it seems that while totscore is correctly increased, message count gets reset  -- instead of increasing from 2392 to 2393, it gets reset to 5 -- which obviously leads to all further uses to get hugely increased AWL!

That is  spamassassin 3.4.6-1 from Debian Bullseye.
Comment 1 Matija Nalis 2022-11-09 22:46:01 UTC
Some 20 minutes earlier, I find possibly related AWL update - for same email address, but different IP range, the msgcount was increased from 3 to 4 (and the above obviously buggy change sets msgcount to 5:

/*!*/;
# at 579668809
# at 579669029
#221104  9:34:03 server id 1  end_log_pos 579669029 CRC32 0xff91a46d    Annotate_rows:
#Q> UPDATE awl SET msgcount = '4', totscore = totscore + '1.39' WHERE username = 'amavis' AND email = 'redacted@example.com' AND signedby = 'example-com.20210112.gappssmtp.com' AND ip >
#221104  9:34:03 server id 1  end_log_pos 579669099 CRC32 0xa7d7c1f0    Table_map: `spamassassin`.`awl` mapped to number 22
# at 579669099
#221104  9:34:03 server id 1  end_log_pos 579669313 CRC32 0x460a91e9    Update_rows: table id 22 flags: STMT_END_F
### UPDATE `spamassassin`.`awl`
### WHERE
###   @1='amavis'
###   @2='redacted@example.com'
###   @3='195.190'
###   @4=3
###   @5=9.794
###   @6='example-com.20210112.gappssmtp.com'
###   @7=1667550842
### SET
###   @1='amavis'
###   @2='redacted@example.com'
###   @3='195.190'
###   @4=4
###   @5=11.184
###   @6='example-com.20210112.gappssmtp.com'
###   @7=1667550843
# Number of rows: 1
# at 579669313
#221104  9:34:03 server id 1  end_log_pos 579669344 CRC32 0xb73804b3    Xid = 8550665
COMMIT/*!*/;
Comment 2 Matija Nalis 2022-11-09 22:53:35 UTC
the SQL statement seems to be in the  Mail/SpamAssassin/SQLBasedAddrList.pm which seems to set msgcount to value specified by script:

    my $sql = "UPDATE $self->{tablename} ".
              "SET msgcount = ?, totscore = totscore + ? ".
              "WHERE username = ? AND email = ?";

I haven't tried 4.0/trunk for that, but looking in SVN https://svn.apache.org/viewvc/spamassassin/trunk/lib/Mail/SpamAssassin/SQLBasedAddrList.pm?revision=1903510&view=markup#l361 it seems to use similar logic.

Is there a reason why the SA code specifies the final msgcount score manually?
Sure, original bug leading to that should be found and fixed, but wouldn't letting SQL update the counter avoid such horrific pitfalls, as we presumably always work on message-by-message basis?

E.g. something like:

    my $sql = "UPDATE $self->{tablename} ".
              "SET msgcount = msgcount + 1, totscore = totscore + ? ".
              "WHERE username = ? AND email = ?";
Comment 3 Matija Nalis 2022-11-09 23:18:16 UTC
Interestingly, code has following comment:

"NOTE: This code uses a self referential SQL call (ie set foo = foo + 1) which is supported by most modern database backends, but not everything calling	itself a SQL database."

That might explain historic reasons why it uses "set msgcount = ?" instead of "set msgcount = msgcount + 1", WHERE IT NOT FOR it also using "totscore = totscore + ?" at the same time (in which case, such non-compliant SQL database would fail anyway).

There are 3 places in that file using "msgcount = ?" which could easily be fixed by converting them to "msgcount = msgcount + 1" (and removing that parameter from SQL argument list).

While that would be welcome and make this specific problem go away (or at least reduce its severity by several orders of magnitude), what makes me uneasy is the fact that $entry->{msgcount} (and thus probably whole $entry) gets corrupted (e.g. probably replaced by some other entry. Brings me back nightmares from my previous life about non-reentrant C library functions used in threaded code). 

Or has such $entry corruption bug have been found and fixed since 3.4.6 release?
Comment 4 Paul Stead 2022-11-10 00:39:15 UTC
It's pretty late here, but as I recall, the reason for the "set msgcount = ?" is because if a message is re-learned by AWL, the totscore is adjusted positively/negatively, however the msgcount needs to stay the same to make the calculation work.