SA Bugzilla – Bug 3563
rethink Bayes locking to avoid "db version 0" warnings
Last modified: 2021-04-18 10:31:51 UTC
In one window I have an sa-learn running, learning a few thousand mails. In another window, I do "sa-learn --dump magic -D" and get the debug output at the bottom of this comment. Notice it says the bayes db version is 0, but the bayes db is definitely version 3. It seems as if the tie works, but then trying to read the db fails. If this is the case, then we need to essentially lock the db for reading (or have a collective read lock anyway) too, otherwise the data we get could be corrupted.

debug: bayes: 18360 tie-ing to DB file R/O /home/felicity/.spamassassin/bayes_toks
debug: bayes: 18360 tie-ing to DB file R/O /home/felicity/.spamassassin/bayes_seen
debug: bayes: found bayes db version 0
debug: bayes: bayes db version 0 is not able to be used, aborting!
debug: Score set 0 chosen.
debug: bayes: 18360 tie-ing to DB file R/O /home/felicity/.spamassassin/bayes_toks
debug: bayes: 18360 tie-ing to DB file R/O /home/felicity/.spamassassin/bayes_seen
debug: bayes: found bayes db version 0
debug: bayes: bayes db version 0 is not able to be used, aborting!
ERROR: Bayes dump returned an error, please re-run with -D for more information
Seems to be related to Debian bug #308303, which can be seen at the URL above. It seems in 308303, there's no dumping going on, though perhaps spamd is accessing the database.
After a little bit of thinking about this, I think what we'll have to do is the multiple lock thing. Something like:

- if a file-based DB (DBM, etc.) and
- lock_method flock

when tieing read only, lock the DB with LOCK_SH.
when tieing read/write, lock the DB with LOCK_EX (current behavior).

Having lock_method as anything else means we'd always have to behave a la LOCK_EX, which really just kills Bayes usage, so I'd rather deal with the occasional odd behavior.

One issue with this plan is that I don't know if there's a lock queue, which could potentially mean our LOCK_EX for writes may consistently time out if reads continue to keep the shared lock in place. It's probably platform dependent.

Alternately, we punt this and the answer is "this behavior can happen, if you want to avoid it use SQL".

Thoughts?
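
For illustration, the shared/exclusive scheme proposed above could look roughly like the sketch below. It is written in Python rather than SpamAssassin's Perl and is not taken from the SpamAssassin code. The helper name db_lock, the separate lock file, and the timeout and poll parameters are all assumptions made for this example. Readers take LOCK_SH, writers take LOCK_EX, and both retry with LOCK_NB until a deadline, which also makes the writer time-out concern visible: a writer simply fails if readers never release the shared lock in time.

```python
import fcntl
import os
import time
from contextlib import contextmanager


@contextmanager
def db_lock(lock_path, write=False, timeout=10.0, poll=0.1):
    """Hold a shared (read) or exclusive (write) flock on lock_path.

    Hypothetical helper: lock_path, timeout, and poll are illustrative
    choices, not SpamAssassin settings.
    """
    mode = fcntl.LOCK_EX if write else fcntl.LOCK_SH
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        deadline = time.monotonic() + timeout
        while True:
            try:
                # LOCK_NB makes flock fail immediately instead of blocking,
                # so we can enforce our own timeout.
                fcntl.flock(fd, mode | fcntl.LOCK_NB)
                break
            except BlockingIOError:
                if time.monotonic() >= deadline:
                    kind = "exclusive" if write else "shared"
                    raise TimeoutError(f"could not get {kind} lock on {lock_path}")
                time.sleep(poll)
        try:
            yield fd
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


# Usage sketch: a read-only dump would wrap its tie/read in a shared lock,
# a learner would wrap its tie/write in an exclusive lock.
#
# with db_lock("/path/to/bayes.lock"):              # reader, LOCK_SH
#     ...read the db...
# with db_lock("/path/to/bayes.lock", write=True):  # writer, LOCK_EX
#     ...update the db...
```

Several readers can hold the lock at once, while a writer waits for all of them to finish. Whether a waiting writer is ever favored over newly arriving readers is up to the platform, so this sketch does nothing to prevent writer starvation.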
what kind of db format is this? I thought Berkeley DB files could deal with this....
(In reply to comment #3)
> what kind of db format is this?
> I thought Berkeley DB files could deal with this....

It's BerkeleyDB.
if we ever get movement on this, let's backport to 3.1
pushing out to 3.3.0, since I don't think it's a 3.2.0 blocker. shout (or change the milestone) if you disagree....
moving most remaining 3.3.0 bugs to 3.3.1 milestone
reassigning, too
moving all open 3.3.1 bugs to 3.3.2
Moving back off of Security, which got changed by accident during the mass Target Milestone move.
I think this error is due to something else entirely. I've created a separate bug here: https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6901
Closing as Bug 6901 supposedly fixed this.