Bug 1625 - sync_journal() needs to check atime before updating
Summary: sync_journal() needs to check atime before updating
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Libraries (show other bugs)
Version: SVN Trunk (Latest Devel Version)
Hardware: Other other
: P3 normal
Target Milestone: 2.50
Assignee: Theo Van Dinter
URL:
Whiteboard:
Keywords: backport
Depends on:
Blocks:
 
Reported: 2003-03-12 05:39 UTC by Theo Van Dinter
Modified: 2003-03-16 11:15 UTC (History)
1 user (show)



Attachment Type Modified Status Actions Submitter/CLA Status
suggested patch patch None Theo Van Dinter [HasCLA]

Note You need to log in before you can comment on or make changes to this bug.
Description Theo Van Dinter 2003-03-12 05:39:38 UTC
In its current implementation, the bayes journal only gets entries to update the atime of tokens which happens during a scan().  This makes sense since the DB is R/O, and we don't need to go updating the atime of all these tokens during what should be a fast process.

However, message/token learns update the atime immediately (which makes sense, the atime is part of the token record, so you may as well update it and the counts at the same time).

However, during a sync_journal(), there's no check of the current atime before a DB update.  So what this means is that scan picks up a token "foo" at count 100.  It writes to the journal "t 100 foo".  Then, autolearning kicks in, but the count is 101 now, so the DB gets updated with foo and atime 101.  Then down the road sync_journal comes along ...  See the problem?

When sync_journal() gets called sometime down the road (currently only whenever an expire run happens or someone specifically calls "sa-learn --rebuild"), the atime of "foo" gets reset to 100.  Now this isn't too big a deal in that case (change of 1 count), but what about a near-worst case:

new db, so last expire is 0
scan() adds "t 100 foo" to the journal
at count 4999, we do a learn which sets token foo to atime 4999
at count 5000, expiry_due() returns true, so we expire automatically.  The journal is synced, and atime for "foo" is set to 100.


This is the byproduct of using the journal for some things and not everything. :(

The solution would be to change BayesStore::tok_touch_token() to see if the current DB atime is >= the new requested atime before it calls tok_put().  aka:

sub tok_touch_token {
  my ($self, $atime, $tok) = @_;
  my ($ts, $th, $oldatime) = $self->tok_get ($tok);

  ## New Check
  return if ($oldatime >= $atime);

  $self->tok_put ($tok, $ts, $th, $atime);
}


I'll put in a patch shortly.
Comment 1 Theo Van Dinter 2003-03-12 05:40:41 UTC
We need to get this into 2.51.  Reassigning to myself to generate the patch. :)
Comment 2 Theo Van Dinter 2003-03-12 09:04:21 UTC
Created attachment 742 [details]
suggested patch
Comment 3 Theo Van Dinter 2003-03-12 09:06:45 UTC
Ok, suggested patch is now attached.  added "backport" keyword. :)
Comment 4 Malte S. Stretz 2003-03-15 10:16:43 UTC
OKAY: I think; applied to STABLE 
Comment 5 Daniel Quinlan 2003-03-15 12:02:05 UTC
Malte: remember to wait 24 hours after the first OKAY before applying patches
to stable per the guidelines.  Waiting 0 minutes is not acceptable.
Comment 6 Justin Mason 2003-03-15 12:59:21 UTC
OKAY
Comment 7 Theo Van Dinter 2003-03-15 15:07:01 UTC
Subject: Re:  sync_journal() needs to check atime before updating

On Sat, Mar 15, 2003 at 12:02:05PM -0800, bugzilla-daemon@hughes-family.org wrote:
> Malte: remember to wait 24 hours after the first OKAY before applying patches
> to stable per the guidelines.  Waiting 0 minutes is not acceptable.

The first OKAY was the upload of the patch by a developer/the message
saying "ok, the patch is uploaded".

Comment 8 Daniel Quinlan 2003-03-15 16:04:02 UTC
> The first OKAY was the upload of the patch by a developer/the message
> saying "ok, the patch is uploaded".

Yes, the okay needs to be from someone who didn't write the patch.  :-)

I'm not opposed to the patch, it looks fine to me too.  The point of the
24 hour period is to allow additional review before applying the patch.

There are at least a few bugs where the delay has proven useful.
Comment 9 Theo Van Dinter 2003-03-16 19:03:03 UTC
Subject: Re:  sync_journal() needs to check atime before updating

On Sat, Mar 15, 2003 at 04:04:03PM -0800, bugzilla-daemon@hughes-family.org wrote:
> Yes, the okay needs to be from someone who didn't write the patch.  :-)

Well, I figure the first OKAY is from the developer submitting the patch.
The second OKAY is from anyone else who says OK. :)

> I'm not opposed to the patch, it looks fine to me too.  The point of the
> 24 hour period is to allow additional review before applying the patch.

Yeah, but the goal is to not rashly apply patches to stable.  So I would
say 24 hours from the patch upload/first OKAY instead of 24hrs from the
second OKAY.

Comment 10 Theo Van Dinter 2003-03-16 20:15:29 UTC
closing the ticket again. :)