SA Bugzilla – Bug 8072
AWL incorrectly resets message count
Last modified: 2022-11-10 00:39:15 UTC
I've had situations when some legit messages suddenly have spam score increase of several hundreds after one email. Trying to debug, I've tracked it down to AWL module (using mysql backend). First looking at the mail log, it seems AWL is changing minimally as expected, and then suddenly it jumps HUGELY (and than trail off slowly again): AWL=0.016 AWL=0.016 AWL=0.016 AWL=-0.085 AWL=0.016 AWL=0.937 AWL=1.406 AWL=1.627 AWL=318.182 AWL=265.033 AWL=176.606 [...] sifting through mysqlbinlog to find what happened I find this piece of evidence: /*!*/; # at 585992025 # at 585992247 #221104 9:54:16 server id 1 end_log_pos 585992247 CRC32 0x3d2066fc Annotate_rows: #Q> UPDATE awl SET msgcount = '5', totscore = totscore + '-0.459' WHERE username = 'amavis' AND email = 'redacted@example.com' AND signedby = 'example-com.20210112.gappssmtp.com' AND i> #221104 9:54:16 server id 1 end_log_pos 585992317 CRC32 0x3463013b Table_map: `spamassassin`.`awl` mapped to number 22 # at 585992317 #221104 9:54:16 server id 1 end_log_pos 585992531 CRC32 0xd7fcc32a Update_rows: table id 22 flags: STMT_END_F ### UPDATE `spamassassin`.`awl` ### WHERE ### @1='amavis' ### @2='redacted@example.com' ### @3='213.191' ### @4=2392 ### @5=3189.25 ### @6='example-com.20210112.gappssmtp.com' ### @7=1667550625 ### SET ### @1='amavis' ### @2='redacted@example.com' ### @3='213.191' ### @4=5 ### @5=3188.79 ### @6='example-com.20210112.gappssmtp.com' ### @7=1667552056 # Number of rows: 1 # at 585992531 #221104 9:54:16 server id 1 end_log_pos 585992562 CRC32 0xff0bfeb5 Xid = 8567075 COMMIT/*!*/; So it seems that while totscore is correctly increased, message count gets reset -- instead of increasing from 2392 to 2393, it gets reset to 5 -- which obviously leads to all further uses to get hugely increased AWL! That is spamassassin 3.4.6-1 from Debian Bullseye.
Some 20 minutes earlier, I find possibly related AWL update - for same email address, but different IP range, the msgcount was increased from 3 to 4 (and the above obviously buggy change sets msgcount to 5: /*!*/; # at 579668809 # at 579669029 #221104 9:34:03 server id 1 end_log_pos 579669029 CRC32 0xff91a46d Annotate_rows: #Q> UPDATE awl SET msgcount = '4', totscore = totscore + '1.39' WHERE username = 'amavis' AND email = 'redacted@example.com' AND signedby = 'example-com.20210112.gappssmtp.com' AND ip > #221104 9:34:03 server id 1 end_log_pos 579669099 CRC32 0xa7d7c1f0 Table_map: `spamassassin`.`awl` mapped to number 22 # at 579669099 #221104 9:34:03 server id 1 end_log_pos 579669313 CRC32 0x460a91e9 Update_rows: table id 22 flags: STMT_END_F ### UPDATE `spamassassin`.`awl` ### WHERE ### @1='amavis' ### @2='redacted@example.com' ### @3='195.190' ### @4=3 ### @5=9.794 ### @6='example-com.20210112.gappssmtp.com' ### @7=1667550842 ### SET ### @1='amavis' ### @2='redacted@example.com' ### @3='195.190' ### @4=4 ### @5=11.184 ### @6='example-com.20210112.gappssmtp.com' ### @7=1667550843 # Number of rows: 1 # at 579669313 #221104 9:34:03 server id 1 end_log_pos 579669344 CRC32 0xb73804b3 Xid = 8550665 COMMIT/*!*/;
the SQL statement seems to be in the Mail/SpamAssassin/SQLBasedAddrList.pm which seems to set msgcount to value specified by script: my $sql = "UPDATE $self->{tablename} ". "SET msgcount = ?, totscore = totscore + ? ". "WHERE username = ? AND email = ?"; I haven't tried 4.0/trunk for that, but looking in SVN https://svn.apache.org/viewvc/spamassassin/trunk/lib/Mail/SpamAssassin/SQLBasedAddrList.pm?revision=1903510&view=markup#l361 it seems to use similar logic. Is there a reason why the SA code specifies the final msgcount score manually? Sure, original bug leading to that should be found and fixed, but wouldn't letting SQL update the counter avoid such horrific pitfalls, as we presumably always work on message-by-message basis? E.g. something like: my $sql = "UPDATE $self->{tablename} ". "SET msgcount = msgcount + 1, totscore = totscore + ? ". "WHERE username = ? AND email = ?";
Interestingly, code has following comment: "NOTE: This code uses a self referential SQL call (ie set foo = foo + 1) which is supported by most modern database backends, but not everything calling itself a SQL database." That might explain historic reasons why it uses "set msgcount = ?" instead of "set msgcount = msgcount + 1", WHERE IT NOT FOR it also using "totscore = totscore + ?" at the same time (in which case, such non-compliant SQL database would fail anyway). There are 3 places in that file using "msgcount = ?" which could easily be fixed by converting them to "msgcount = msgcount + 1" (and removing that parameter from SQL argument list). While that would be welcome and make this specific problem go away (or at least reduce its severity by several orders of magnitude), what makes me uneasy is the fact that $entry->{msgcount} (and thus probably whole $entry) gets corrupted (e.g. probably replaced by some other entry. Brings me back nightmares from my previous life about non-reentrant C library functions used in threaded code). Or has such $entry corruption bug have been found and fixed since 3.4.6 release?
It's pretty late here, but as I recall, the reason for the "set msgcount = ?" is because if a message is re-learned by AWL, the totscore is adjusted positively/negatively, however the msgcount needs to stay the same to make the calculation work.