SA Bugzilla – Bug 3940
ArchiveIterator uses opt_j for two different things
Last modified: 2004-10-31 12:39:49 UTC
In the same vein as bug 3936, opt_j has two meanings as well. First, it's used to specify how many children processes should be running the messages during a mass-check (it's a single process everywhere else). Second, it specifies whether to keep the message list in memory (1 process), or fork() off a child and generate the list in memory and then store in a temp file for actual processing. I'd like to break those two apart, and keep opt_j as a backward compatibility option.
suggest keeping opt_j for the number of forked processes, and creating a new setting controlling whether the message list is in-RAM or in tempfile. then infer the latter setting from opt_j in mass-check, and add sanity check code to M:AI if required.
Subject: Re: New: ArchiveIterator uses opt_j for two different things > I'd like to break those two apart, and keep opt_j as a backward > compatibility option. I'd strongly prefer (I'm probably -1 on creatingh two new options for this one) to keep opt_j as the number of processes (it parallels "make -j") and add a new option for the temporary file vs. in-memory option. The temporary file thing postdates -j by a long period and can just move to a new option. Daniel
Subject: Re: ArchiveIterator uses opt_j for two different things On Sun, Oct 31, 2004 at 02:27:17PM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote: > I'd strongly prefer (I'm probably -1 on creatingh two new options for > this one) to keep opt_j as the number of processes (it parallels "make > -j") and add a new option for the temporary file vs. in-memory option. > The temporary file thing postdates -j by a long period and can just move > to a new option. I think just adding a new option for storage is doable, but FWIW I don't really care about "parallels ... -j" since this is all internal API names. The commandline can stay the same, but unless you're used to the module, "opt_j" isn't very descriptive of what the value means.
Subject: Re: ArchiveIterator uses opt_j for two different things -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 > > I'd strongly prefer (I'm probably -1 on creatingh two new options for > > this one) to keep opt_j as the number of processes (it parallels "make > > -j") and add a new option for the temporary file vs. in-memory option. > > The temporary file thing postdates -j by a long period and can just move > > to a new option. > > I think just adding a new option for storage is doable, but FWIW I don't > really care about "parallels ... -j" since this is all internal API names. > The commandline can stay the same, but unless you're used to the module, > "opt_j" isn't very descriptive of what the value means. oh btw, on that point, I'd be very pro adding *new*, meaningful names for opt_j, opt_n et al as they are used in the M:SpamAssassin:ArchiveIterator class, and leaving opt_j, opt_n et al as backwards-compat aliases. I agree, they don't make much sense for users of that module apart from mass-check. - --j. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Exmh CVS iD8DBQFBhYJGMJF5cimLx9ARAnZWAKCFHqthUt9p7kCQJTqkLsAjBqXWTACZAcd+ lReIi6mhyf165yWgmgAmtJI= =BMAd -----END PGP SIGNATURE-----
Subject: Re: ArchiveIterator uses opt_j for two different things Well, "jobs" would be the appropriate option name following "make", here. (I do try to use the GNU coding standards or some other precedent when adding a new option or long option.) Daniel
Been thinking about this some more. It's really much the same situation as with opt_n... Right now, we have: opt_j == 0, 1 process for run, scan occurs in same process and list in memory opt_j == 1, 1 process for run, scan occurs in separate process, list in temp file opt_j > 1, j processes for run, scan occurs in separate process, list in temp file temp file with j == 0 doesn't really make sense since the single process would have already taken the memory hit. no temp file and j == 1 is just j == 0. no temp file and j > 1 doesn't really make sense either since the point of j>1 is higher throughput, so the message list is likely to be large, so you'd really want the temp file. so I'm going to do the same thing as with opt_n. internally, there'll be an option (I'm using run_temp_file) which gets set if opt_j >= 1, and then at least it's cleaner internally.
ok, thought about this some more as I tried implementing things, and I found there really is no good way to add in the run_temp_file variable. it's already done in the code (opt_j > 0), so adding in an option actually just makes it more messy without bringing anything else to the table. so in the end, I think this has been a good examination of the previous decision, so I'm just going to close the ticket now. the opt_j to something else renaming bit can be done in another ticket. :)