Bug 3064 - sa-learn should ignore "use_bayes 0" setting
Summary: sa-learn should ignore "use_bayes 0" setting
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Learner (show other bugs)
Version: SVN Trunk (Latest Devel Version)
Hardware: Other other
: P5 enhancement
Target Milestone: 3.0.0
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 3208
  Show dependency tree
 
Reported: 2004-02-18 19:25 UTC by Matthew Cline
Modified: 2004-04-28 14:55 UTC (History)
0 users



Attachment Type Modified Status Actions Submitter/CLA Status

Note You need to log in before you can comment on or make changes to this bug.
Description Matthew Cline 2004-02-18 19:25:33 UTC
If the "use_bayes" config option is set to 0, then sa-learn does nothing, even
if you just want it to dump out the database.  Invoking sa-learn should
override the "use_bayes" config option, since it's being invoked explicitly
from the comand line.  That way, you can temporarily disable SA from using
bayes (but still have it process incoming mail) while you fiddle with the bayes db.
Comment 1 Theo Van Dinter 2004-02-25 20:21:34 UTC
the only way to do this, fyi, is to have sa-learn override the conf use_bayes setting.  but while there's an 
argument for ignoring it for sa-learn, there's also the argument to not ignore it for sa-learn ...

imo, if the config says "no", then sa-learn should listen.  sa-learn already overrides bayes_path if --
dbpath is set (which is a big kluge), so now we'd have to have sa-learn override use_bayes ...  how 
much more kluging do we really want to do?

-0.9 from me.
Comment 2 Theo Van Dinter 2004-02-29 22:01:01 UTC
some folks were talking about this on sa-talk.  here's a post I made about it:

I was hoping for others to make comments, but ...

Basically, "use_bayes 0" means don't do bayes.  It's the literal meaning, 
so having some tool kluge around it is bad.  The technically correct 
thing is to aim sa-learn at a temp prefs file with "use_bayes 1" which
then includes your normal user_prefs file.

As I said, I can see the argument both ways, so ...
Comment 3 Justin Mason 2004-02-29 22:36:48 UTC
-1 from me.

I would however be in favour of dying with an error message to make the
situation clear ;)
Comment 4 Daniel Quinlan 2004-02-29 23:48:25 UTC
> I would however be in favour of dying with an error message to make the
> situation clear ;)

I'm not so sure.  Think about it... you have a problem with Bayes and it is
giving false positives or something.  You want to debug it or learn some
messages, but you can't turn off Bayes and use sa-learn to debug at the same
time.  I speak from personal experience, this is a real pain in the neck.

"use_bayes" is used to do several things:

 - stop BAYES rules from being used
 - stop auto-learning from happening ("bayes_auto_learn" also does this)

I can't think of any reason why someone would use "use_bayes" to turn off
manual learning or dumping.

I propose that we change the "use_bayes" option only turn off automatic
activity: checks and auto-learning, but we should ignore the setting for
manual operations.  I agree with Theo that the override should not be done
in sa-learn.  It should just be part of how the option works.
Comment 5 Michael Parker 2004-03-01 08:12:05 UTC
Subject: Re:  sa-learn should ignore "use_bayes 0" setting

On Sun, Feb 29, 2004 at 11:48:26PM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> 
> I propose that we change the "use_bayes" option only turn off automatic
> activity: checks and auto-learning, but we should ignore the setting for
> manual operations.  I agree with Theo that the override should not be done
> in sa-learn.  It should just be part of how the option works.
> 

At first I thought, maybe a new config param, use_salearn, but I don't
think we need to complicate things.  I just did a quick run through of
all the places that check use_bayes and it's not many.  I believe we
can get rid of all of the except the ones in:
Bayes::is_scan_available
PerMsgStatus::learn
and of course the Conf.pm stuff.

This will cause use_bayes to be used for only scanning/check and
auto-learning and will allow users to use sa-learn whenever they want,
for whatever they want.

If this is agreeable, I can finish up this patch and commit.

Michael

Comment 6 Theo Van Dinter 2004-03-01 10:08:02 UTC
Subject: Re:  sa-learn should ignore "use_bayes 0" setting

On Mon, Mar 01, 2004 at 08:12:06AM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> At first I thought, maybe a new config param, use_salearn, but I don't
> think we need to complicate things.  I just did a quick run through of
> all the places that check use_bayes and it's not many.  I believe we
> can get rid of all of the except the ones in:
> Bayes::is_scan_available
> PerMsgStatus::learn
> and of course the Conf.pm stuff.

Well, the other problem with this, of course, is that this is great
for us, but what about the people who use our modules directly?  I, for
instance, have code that calls M::SA->learn() directly.  Should that work
if the config says "use_bayes 0"?  IMO, it's up to the modules to check,
not my calling code.

I think we'd also want to alert people if using sa-learn w/ "use_bayes 0"
so that they know the learning won't be used by default, or for users
who think bayes is enabled, but the admin disabled it ...

Should "spamassassin -r" and "-k" still teach the Bayes system in
this case, and how is it related to bayes_learn_during_report?

etc ...  There's a lot of touchy issues here IMO.

> If this is agreeable, I can finish up this patch and commit.

So far we're -1.9 and +2, fyi.  I'm leaning more to a -1 now, actually.
So make that -2 and +2 overall.

Comment 7 Michael Parker 2004-03-01 10:19:36 UTC
Subject: Re:  sa-learn should ignore "use_bayes 0" setting

> Well, the other problem with this, of course, is that this is great
> for us, but what about the people who use our modules directly?  I, for
> instance, have code that calls M::SA->learn() directly.  Should that work
> if the config says "use_bayes 0"?  IMO, it's up to the modules to check,
> not my calling code.

See, I knew I was missing something, and I do the same thing, so no
excuse for missing it I guess.  You're right, the callers should not
have to perform the check themselves.

> I think we'd also want to alert people if using sa-learn w/ "use_bayes 0"
> so that they know the learning won't be used by default, or for users
> who think bayes is enabled, but the admin disabled it ...
> 
> Should "spamassassin -r" and "-k" still teach the Bayes system in
> this case, and how is it related to bayes_learn_during_report?
> 
> etc ...  There's a lot of touchy issues here IMO.

My overall position is that all of this is unnecessary, but I think
that might be coming from someone that finds Bayes invaluable and will
never have it turned off.  I can see where you'd want to have it
turned off but still "usable," for some definition of usable, via
sa-learn, so I'm willing to talk about options.

How about something that gets set in sa-learn so we still perform the
check but it's like:

if (!use_bayes) {
 if (special_sa-learn_var) {
   warn "You're using sa-learn but bayes is currently turned off.";
 }
 else {
   return;
}

The special_sa-learn_var could also possibly be another config
variable (use_sa-learn) or something like that.

Michael

Comment 8 Theo Van Dinter 2004-03-01 10:20:18 UTC
quinlan said:
""use_bayes" is used to do several things:

 - stop BAYES rules from being used
 - stop auto-learning from happening ("bayes_auto_learn" also does this)

I can't think of any reason why someone would use "use_bayes" to turn off
manual learning or dumping."

you're view is a little bit off here.  "use_bayes" tells the SA code to not use the Bayes code, any Bayes 
DBs, etc.  It's an on/off flag for the whole thing.  As part of that of course, rules and autolearning are 
disabled since the whole thing is being disabled.

quinlan:
"I propose that we change the "use_bayes" option only turn off automatic
activity: checks and auto-learning, but we should ignore the setting for
manual operations.  I agree with Theo that the override should not be done
in sa-learn.  It should just be part of how the option works."

-1 per my other mail.


I think there are really three options here:

1) We let sa-learn override use_bayes.

2) We tell people that in debugging situations, make a temp config ala:
use_bayes 1
include_config /path/to/something

and do "sa-learn -p temp_config"

3) add in more configuration options for the different pieces of bayes.  ie: use_bayes_scan, 
use_bayes_learn, etc.  IMO, this is a good option since we already break out some options like this 
anyway (bayes_auto_learn, bayes_learn_to_journal, bayes_auto_expire, bayes_learn_during_report, etc.)


I'd be +1 for #2 or #3, but -1 for #1.
Comment 9 Duncan Findlay 2004-03-01 10:40:41 UTC
I'm -0.7 on the original propsal (#1 in theo's list)

I'd prefer sa-learn to die with an appropriate message if use_bayes is 0.
Comment 10 Theo Van Dinter 2004-03-01 10:49:35 UTC
Subject: Re:  sa-learn should ignore "use_bayes 0" setting

On Mon, Mar 01, 2004 at 10:40:42AM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> I'd prefer sa-learn to die with an appropriate message if use_bayes is 0.

It already does, BTW: (from CmdLearn)

  if ( !$spamtest->{conf}->{use_bayes} ) {
    warn "ERROR: configuration specifies 'use_bayes 0', sa-learn disabled\n";
    return 1;
  }


Comment 11 Daniel Quinlan 2004-03-01 11:15:25 UTC
> you're view is a little bit off here.  "use_bayes" tells the SA code
> to not use the Bayes code, any Bayes DBs, etc.  It's an on/off flag
> for the whole thing.  As part of that of course, rules and
> autolearning are disabled since the whole thing is being disabled.

I was talking about how and why *users* use the option and *not* how it
works in the code.  They are two different things, as evidenced by the
amount of confusion and consternation that the current implementation
causes.

> Well, the other problem with this, of course, is that this is great
> for us, but what about the people who use our modules directly?  I,
> for instance, have code that calls M::SA->learn() directly.  Should
> that work if the config says "use_bayes 0"?  IMO, it's up to the
> modules to check, not my calling code.

I agree that the configuration should be what matters, but I think
having separate options for learning and checking (at least) would be
useful and a major improvement.  Nobody has recently suggested that we
have any command-line options override these settings.

> So far we're -1.9 and +2, fyi.  I'm leaning more to a -1 now, actually.
> So make that -2 and +2 overall.

Note that any -1 is a veto for a code modification.  I don't think we're
at the point where voting is meaningful yet.
Comment 12 Daniel Quinlan 2004-03-01 11:56:05 UTC
Let's forget how the current option works for a moment.  I don't care where
it is currently and what it does.  I want to figure out a better way and once
we agree on that, then we can decide if and how the old option figures in.

Things that people *may* want to turn on or off in Bayes:

* everything
  * auto learning
  * manual learning and forgetting
  * checking
  * dumping
  * expiry and perhaps other maintenance

I think that can be collapsed for clarity and brevity as follows:

* everything
  * auto learning
  * manual operations: manual learning, forgetting, manual expiry, dumping,
    and perhaps other maintenance
  * checking

I think those three sub-options are orthogonal.

If "use_bayes" continues to mean "Bayes is off off off off!", then that
suggests two sub-options are needed for when use_bayes is not off:

  - turn off auto learning
  - turn off checking
  - and perhaps turn off manual operations, although I think someone would
    just turn off Bayes at that point
Comment 13 Theo Van Dinter 2004-03-01 21:33:32 UTC
Subject: Re:  sa-learn should ignore "use_bayes 0" setting

On Mon, Mar 01, 2004 at 11:56:06AM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> If "use_bayes" continues to mean "Bayes is off off off off!", then that
> suggests two sub-options are needed for when use_bayes is not off:
> 
>   - turn off auto learning
>   - turn off checking
>   - and perhaps turn off manual operations, although I think someone would
>     just turn off Bayes at that point

so in short...

use_bayes 0 or 1	(dis|en)ables everything
bayes_auto_learn 0 or 1	(dis|en)ables autolearning
use_bayes_rules 0 or 1	(dis|en)ables checks

on the assumption that if you disable manual operations, you want to
disable the whole thing.

... and we already have the first two. :)  I'd be fine with adding the
third, although a better name would be good (bayes_rules_enable?).

Comment 14 Justin Mason 2004-04-27 20:35:56 UTC
I'm +1 on Theo's suggestions.  and I think "use_bayes_rules" works for me....
Comment 15 Theo Van Dinter 2004-04-28 22:55:14 UTC
ok.   "use_bayes_rules" implemented and committed.  r10413