SA Bugzilla – Bug 2370
removing non-DB_File support from bayes
Last modified: 2003-08-28 06:46:46 UTC
argh. complicated. Here's the latest patch.
Created attachment 1286 [details] latest ver moved upgrade code to sa-learn as --import switch
Created attachment 1287 [details] latest ver moved upgrade code to sa-learn as --import switch; removed support for non-DB_File modules from BayesStore.
+1
Created attachment 1288 [details] new version I can't import 2.55 databases using that patch. I generated 2.55 databases for DB_File, GDBM_File, NDBM_File, and SDBM_File and then tried to import each and none worked. The biggest problem wwas collisions on the filename between DB_File and GDBM_File. GDBM-created databases from 2.55 can (and do) have the same filename as the DB_File 2.60 databases. So, the tie_db_writable() call fails when there's an old GDBM database. I had about 10 other problems trying to get this working, this version seems to work for me. The only nit: Somewhere during a run of "sa-learn --import", DB_File tries to die GDBM database R/O before the conversion has completed and this produces a warning. Beyond that, it really needs more testing before I'd trust it at all. Changes: - avoid repeating code - copies files temporarily while reading from them - use %new = %old to copy hash (Seems simple now. Ugh...)
This is the GDBM import nit: Cannot open bayes databases /home/quinlan/.spamassassin/bayes_* R/O: tie failed: upgrading to DB_File: /home/quinlan/.spamassassin/old_bayes_toks DB_File: no database of that kind found, nothing copied. GDBM_File: copied 48162 entries. upgrading to DB_File: /home/quinlan/.spamassassin/old_bayes_seen DB_File: no database of that kind found, nothing copied. GDBM_File: copied 998 entries. (Also note that I added DB_File into the fray since I can't tell the difference between DB_File and GDBM_File dbm files and since you have to have a separate file to do the conversion, et cetera, et cetera, it seemed easiest/best to just treat old DB_File dbm files like the others. I could have tried more complicated logic, but it was hard enough to get this working.) There are no warnings for the other formats: DB: upgrading to DB_File: /home/quinlan/.spamassassin/old_bayes_toks DB_File: copied 48162 entries. upgrading to DB_File: /home/quinlan/.spamassassin/old_bayes_seen DB_File: copied 998 entries. NDBM: upgrading to DB_File: /home/quinlan/.spamassassin/old_bayes_toks DB_File: no database of that kind found, nothing copied. GDBM_File: no database of that kind found, nothing copied. NDBM_File: copied 48162 entries. upgrading to DB_File: /home/quinlan/.spamassassin/old_bayes_seen DB_File: no database of that kind found, nothing copied. GDBM_File: no database of that kind found, nothing copied. NDBM_File: copied 998 entries. SDBM: upgrading to DB_File: /home/quinlan/.spamassassin/old_bayes_toks DB_File: no database of that kind found, nothing copied. GDBM_File: no database of that kind found, nothing copied. NDBM_File: no database of that kind found, nothing copied. SDBM_File: copied 48162 entries. upgrading to DB_File: /home/quinlan/.spamassassin/old_bayes_seen DB_File: no database of that kind found, nothing copied. GDBM_File: no database of that kind found, nothing copied. NDBM_File: no database of that kind found, nothing copied. SDBM_File: copied 998 entries.
This looks complicated, can't we get RC3 out and plan an RC4 for the DB_file change?
+1 looks ok. what is the reason for the big eval{} block in upgrade_old_dbm_files()? a comment would be cool :)
There was one. I think Dan must have removed it. Basically since we're using DB_File directly, and we're allowing users to run sa without DB_File installed as long as they don't use Bayes, we have to protect use of DB_File.
Yeah, there's a comment for the eval in BayesStore::upgrade_old_dbm_files_trapped but not for BayesStore::upgrade_old_dbm_files (which calls the one above). And I can't find any command which could die() in the latter. Except maybe the use of the File::* modules but they wouldn't need such a big eval block. Or can $self->tie_db_writable() die()? Hmmm... and in BayesStore::upgrade_old_dbm_files_trapped there's an eval inside an eval; I think the outer eval isn't really needed, too. I don't want to stop this patch, if it works it should go in so we can get RC3 out. It just looks a bit fishy and maybe should be revisited later.
Created attachment 1289 [details] newer version - doesn't remove old files - one less level of eval (so we can catch success correctly) - be a bit more informative - copy seen before toks (faster, so user gets some feedback quicker)
why does the upgrade look for DB_File? or are we assuming the user wouldn't know if they used DB_File, so support it just in case? otherwise, +1.
+1.
applied
> why does the upgrade look for DB_File? or are we assuming the user wouldn't > know if they used DB_File, so support it just in case? I already answered this on IRC to Theo's satisfaction, but basically: - easier logic - user wouldn't know - filenames can be same as other DB formats