Bug 3940 - ArchiveIterator uses opt_j for two different things
Summary: ArchiveIterator uses opt_j for two different things
Status: RESOLVED WONTFIX
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Libraries (show other bugs)
Version: SVN Trunk (Latest Devel Version)
Hardware: Other other
: P5 normal
Target Milestone: 3.1.0
Assignee: Theo Van Dinter
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-10-29 12:04 UTC by Theo Van Dinter
Modified: 2004-10-31 12:39 UTC (History)
1 user (show)



Attachment Type Modified Status Actions Submitter/CLA Status

Note You need to log in before you can comment on or make changes to this bug.
Description Theo Van Dinter 2004-10-29 12:04:06 UTC
In the same vein as bug 3936, opt_j has two meanings as well.  First, it's used to specify how many 
children processes should be running the messages during a mass-check (it's a single process 
everywhere else).  Second, it specifies whether to keep the message list in memory (1 process), or fork() 
off a child and generate the list in memory and then store in a temp file for actual processing.

I'd like to break those two apart, and keep opt_j as a backward compatibility option.
Comment 1 Justin Mason 2004-10-29 12:51:19 UTC
suggest keeping opt_j for the number of forked processes, and creating a new
setting controlling whether the message list is in-RAM or in tempfile.  then
infer the latter setting from opt_j in mass-check, and add sanity check code to
M:AI if required.
Comment 2 Daniel Quinlan 2004-10-31 14:27:17 UTC
Subject: Re:  New: ArchiveIterator uses opt_j for two different things

> I'd like to break those two apart, and keep opt_j as a backward
> compatibility option.

I'd strongly prefer (I'm probably -1 on creatingh two new options for
this one) to keep opt_j as the number of processes (it parallels "make
-j") and add a new option for the temporary file vs. in-memory option.
The temporary file thing postdates -j by a long period and can just move
to a new option.

Daniel

Comment 3 Theo Van Dinter 2004-10-31 14:57:49 UTC
Subject: Re:  ArchiveIterator uses opt_j for two different things

On Sun, Oct 31, 2004 at 02:27:17PM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> I'd strongly prefer (I'm probably -1 on creatingh two new options for
> this one) to keep opt_j as the number of processes (it parallels "make
> -j") and add a new option for the temporary file vs. in-memory option.
> The temporary file thing postdates -j by a long period and can just move
> to a new option.

I think just adding a new option for storage is doable, but FWIW I don't
really care about "parallels ... -j" since this is all internal API names.
The commandline can stay the same, but unless you're used to the module,
"opt_j" isn't very descriptive of what the value means.

Comment 4 Justin Mason 2004-10-31 16:25:11 UTC
Subject: Re:  ArchiveIterator uses opt_j for two different things 

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


> > I'd strongly prefer (I'm probably -1 on creatingh two new options for
> > this one) to keep opt_j as the number of processes (it parallels "make
> > -j") and add a new option for the temporary file vs. in-memory option.
> > The temporary file thing postdates -j by a long period and can just move
> > to a new option.
> 
> I think just adding a new option for storage is doable, but FWIW I don't
> really care about "parallels ... -j" since this is all internal API names.
> The commandline can stay the same, but unless you're used to the module,
> "opt_j" isn't very descriptive of what the value means.

oh btw, on that point, I'd be very pro adding *new*, meaningful names
for opt_j, opt_n et al as they are used in the
M:SpamAssassin:ArchiveIterator class, and leaving opt_j, opt_n et al
as backwards-compat aliases.   I agree, they don't make much sense
for users of that module apart from mass-check.

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFBhYJGMJF5cimLx9ARAnZWAKCFHqthUt9p7kCQJTqkLsAjBqXWTACZAcd+
lReIi6mhyf165yWgmgAmtJI=
=BMAd
-----END PGP SIGNATURE-----

Comment 5 Daniel Quinlan 2004-10-31 17:57:45 UTC
Subject: Re:  ArchiveIterator uses opt_j for two different things

Well, "jobs" would be the appropriate option name following "make",
here.  (I do try to use the GNU coding standards or some other precedent
when adding a new option or long option.)

Daniel

Comment 6 Theo Van Dinter 2004-10-31 21:04:17 UTC
Been thinking about this some more.  It's really much the same situation as with opt_n...

Right now, we have:

opt_j == 0, 1 process for run, scan occurs in same process and list in memory
opt_j == 1, 1 process for run, scan occurs in separate process, list in temp file
opt_j > 1, j processes for run, scan occurs in separate process, list in temp file

temp file with j == 0 doesn't really make sense since the single process would have already taken the 
memory hit.  no temp file and j == 1 is just j == 0.  no temp file and j > 1 doesn't really make sense 
either since the point of j>1 is higher throughput, so the message list is likely to be large, so you'd 
really want the temp file.

so I'm going to do the same thing as with opt_n.  internally, there'll be an option (I'm using 
run_temp_file) which gets set if opt_j >= 1, and then at least it's cleaner internally.
Comment 7 Theo Van Dinter 2004-10-31 21:39:49 UTC
ok, thought about this some more as I tried implementing things, and I found there really is no good 
way to add in the run_temp_file variable.  it's already done in the code (opt_j > 0), so adding in an 
option actually just makes it more messy without bringing anything else to the table.

so in the end, I think this has been a good examination of the previous decision, so I'm just going to 
close the ticket now.  the opt_j to something else renaming bit can be done in another ticket. :)