Bug 2847 - [review] sa-learn doesn't complain if DB_File not installed
Summary: [review] sa-learn doesn't complain if DB_File not installed
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Learner (show other bugs)
Version: 2.61
Hardware: Other other
: P4 enhancement
Target Milestone: 2.62
Assignee: Theo Van Dinter
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-12-16 14:25 UTC by Martin Radford
Modified: 2003-12-28 00:35 UTC (History)
1 user (show)



Attachment Type Modified Status Actions Submitter/CLA Status
possible patch patch None Theo Van Dinter [HasCLA]
suggested patch patch None Theo Van Dinter [HasCLA]

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Radford 2003-12-16 14:25:04 UTC
Finally fed up of the questions on the mailing list, I thought
I'd stick this in the bugzilla in the hope it can be fixed :-)

Frequently when users complain that sa-learn doesn't learn, it turns
out that they haven't installed DB_File.  Can sa-learn be modified
to check to see if DB_File is installed, and exit early if it's not 
there, together with a meaningful error message?
Comment 1 Theo Van Dinter 2003-12-16 14:32:25 UTC
sounds like something that would be good for 2.62.
Comment 2 Theo Van Dinter 2003-12-16 14:33:42 UTC
although I think it would be better to warn if bayes is attempted and DB_File isn't available, since 
learning isn't sa-learn specific.
Comment 3 Martin Radford 2003-12-16 15:18:25 UTC
I wasn't sure whether to enter the following additional stuff as a
separate bugzilla entry, but on the subject of bayes and DB_File,
it might also be worth looking at the "autolearn=" part of the 
X-Spam-Status header.

For example, if learning is not possible (e.g. no DB_File) or
disabled (perhaps in the config file), then perhaps "autolearn="
should not be present (or perhaps should say "autolearn=disabled");
this would aid in diagnosing problems on the mailing list.  I don't
know how far the developers might want to go with this approach,
though.
Comment 4 Theo Van Dinter 2003-12-16 15:41:06 UTC
Subject: Re: [SAdev]  sa-learn doesn't complain if DB_File not installed

On Tue, Dec 16, 2003 at 04:13:48PM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> For example, if learning is not possible (e.g. no DB_File) or
> disabled (perhaps in the config file), then perhaps "autolearn="
> should not be present (or perhaps should say "autolearn=disabled");
> this would aid in diagnosing problems on the mailing list.  I don't
> know how far the developers might want to go with this approach,
> though.

imho, if we did disabled that would be a different than "not available",
etc.  I don't think we should be putting debug statements in message
headers, that's what -D is for.  ;)

Comment 5 Martin Radford 2003-12-17 04:54:00 UTC
> imho, if we did disabled that would be a different than "not available",
> etc.  I don't think we should be putting debug statements in message
> headers, that's what -D is for.  ;)

I do take your point on that, but I'd argue that if learning isn't possible 
(whether it's because it's disabled, or because DB_File isn't present), then 
the "autolearn=" should not appear in X-Spam-Status.

From a user support point-of-view, for them to say, "I don't get autolearn= in 
my X-Spam-Status header" would be an indication that there's a problem that 
could be identified via -D (and/or --lint).  If the user's posting indicates 
that they are getting "autolearn=" then that would rule out whether or not 
Bayes is capable of learning.  You're right that putting too much stuff in the 
headers would be confusing, I accept that.

Anyway, I'll leave it up to the developers.
Comment 6 Theo Van Dinter 2003-12-17 06:25:07 UTC
Subject: Re: [SAdev]  sa-learn doesn't complain if DB_File not installed

On Wed, Dec 17, 2003 at 05:49:22AM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> I do take your point on that, but I'd argue that if learning isn't possible 
> (whether it's because it's disabled, or because DB_File isn't present), then 
> the "autolearn=" should not appear in X-Spam-Status.

Well, that's not up to the Bayes system, that's part of the default
config files.  "autolearn=no" in those cases is still 100% accurate, just not
useful in telling you why it didn't autolearn, which is what -D is for.

> Anyway, I'll leave it up to the developers.

To be honest, I'm on the fence about this.  I see your POV, but I also
think that people need to run -D if they think something isn't working
right.  Any other devs want to venture an opinion?

FYI: It's an easy enough patch, probably ~3 lines of code in PerMsgStatus ...

Comment 7 Theo Van Dinter 2003-12-17 06:54:13 UTC
Created attachment 1622 [details]
possible patch

if we did want to do this, the patch would look something like this.  (would
need documentation changes as well ... not included.)
Comment 8 Theo Van Dinter 2003-12-17 07:05:18 UTC
Comment on attachment 1622 [details]
possible patch

this patch actually didn't work correctly -- no DB_File would still result in
"no".  I'm applying a fixed version to 2.70 (I've already done the work, so
what the hell).

this still doesn't do anything for sa-learn, fyi.
Comment 9 Theo Van Dinter 2003-12-17 07:08:50 UTC
changing priority for the sa-learn piece, because I do think that needs to 
explicitly tell people that learning isn't available for whatever reason.
Comment 10 Theo Van Dinter 2003-12-17 13:41:36 UTC
Created attachment 1623 [details]
suggested patch

I'm open to suggestions on this, but it works in my testing...

The path through things, while complex basically ends up with
CmdLearn::wanted() getting the result of PerMsgLearner::did_learn() from its
learn call, 1 if it learned, and undefined or 0 if it didn't.  So I co-opted it
and made undefined mean error, and 0 mean "didn't learn" (does not imply that
it tried -- this is the default until an attempt is made).  I only had to
change 1 line to not have it return undef for "no", no biggie.

I then made wanted() check the return status for whether or not it was defined
instead of just whether or not it was true, and die if it's undefined.	since
sa-learn is the only thing that should call CmdLearn::wanted(), I'm not upset
with putting in the die() in the function.  and voila:

ERROR: the Bayes learn function returned an error, please re-run with -D for
more information

it wouldn't be impossible, but is fairly non-trivial to figure out a way to
pass what the error is to the sa-learn level (and I don't want to have
low-level functions spitting out things to STDERR), so the easiest thing is to
have users re-run with -D where it'll tell them what the actual problem is.
Comment 11 Theo Van Dinter 2003-12-17 13:45:40 UTC
changing to review mode.

BTW: if anyone's curious why I moved the $@ check below the "X messages learned" message, I 
figure that we want that to print out (so the user knows what it did do before dieing) and then die.

So what the user sees, for instance, if use_bayes is set to 0 is:

$ sa-learn ......
Learned from 0 message(s) (1 message(s) examined).
ERROR: the Bayes learn function returned an error, please re-run with -D for more information

and if an error (not this one) occurs on the 4th message, they'll know that 3 were learned 
(potentially) and 4 were examined.
Comment 12 Theo Van Dinter 2003-12-18 10:43:47 UTC
fyi: this patch has not yet been applied to 2.70, pending review.
Comment 13 Justin Mason 2003-12-18 11:22:53 UTC
+1.  Looks good to me.

One thing: you should copy $@ before the 2 print statements; they *may*
reset it, AFAIK.
Comment 14 Theo Van Dinter 2003-12-18 12:26:08 UTC
Subject: Re: [SAdev]  [review] sa-learn doesn't complain if DB_File not installed

On Thu, Dec 18, 2003 at 12:18:14PM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> One thing: you should copy $@ before the 2 print statements; they *may*
> reset it, AFAIK.

I wondered about that, but "perldoc perlvar" stated that $@ is set by
the most recent eval block which implies that print doesn't change them.

Comment 15 Justin Mason 2003-12-18 13:03:27 UTC
ok, sounds fine then ;) +1
Comment 16 Theo Van Dinter 2003-12-25 20:16:10 UTC
committed to head and 2.60 in SVN.  keeping bug around for now since it's still in incubator.
Comment 17 Theo Van Dinter 2003-12-28 09:35:16 UTC
decided to close it.  the incubator repo will just move over later.