4163 – RFE: add reporting support to spamd/spamc

Status:	RESOLVED FIXED

Alias:	None

Product:	Spamassassin
Classification:	Unclassified
Component:	spamc/spamd
Version:	SVN Trunk (Latest Devel Version)
Hardware:	Other All

Importance:	P5 normal
Target Milestone:	3.1.0
Assignee:	SpamAssassin Developer Mailing List

URL:
Whiteboard:
Keywords:	triage

Duplicates (1):	4302
Depends on:
Blocks:

Reported:	2005-03-01 12:52 UTC by Nico Prenzel
Modified:	2005-05-07 01:09 UTC
CC List:	1 user

Attachment	Type	Actions	Submitter/CLA Status
First throw of ListsReport for spamc part	patch	None	Nico Prenzel
Second throw	patch	None	Nico Prenzel
Patch File	patch	None	Michael Parker

Description Nico Prenzel 2005-03-01 12:52:50 UTC

I think it would make sense to add the "report message as spam" feature
(spamassassin --report) to the learning part of spamc/spamd.

Then spamc -d xxx.xxx.xxx.xxx -z < SpamMessage.txt would trigger a report to
Razor and others.

Comment 1 Nico Prenzel 2005-03-01 13:04:31 UTC

Oh sorry, I did not mean as part of learning.
Reporting should be a separate feature, not combined with learning.
It's time to go to bed :-)

Comment 2 Bob Menschel 2005-04-30 22:51:50 UTC

Nico, how would you get around the FP problem? As I understand it, reporting
spam to various lists/services should always be done manually, to avoid
accidentally sending them non-spam. Since spamc/spamd is automated, I don't see
how to avoid these errors.

Comment 3 Nico Prenzel 2005-05-02 00:01:08 UTC

Hello Bob,

I think you've misunderstood my request.

My aim is to introduce a reporting capability usable through spamc/spamd, so
that anybody who runs only spamc/spamd is able to report spam to various
lists/services.

This idea has its origin in bug #1201, where we introduced the spamc/spamd
learning capability.

Perhaps I've misunderstood you. Any thoughts?

NicoP.

Comment 4 Nico Prenzel 2005-05-05 08:28:56 UTC

(In reply to comment #2)
> Nico, how would you get around the FP problem? As I understand it, reporting
> spam to various lists/services should always be done manually, to avoid
> accidentally sending them non-spam. Since spamc/spamd is automated, I don't see
> how to avoid these errors.

Also, not every environment runs spamc/spamd in a (fully) automated way. Given
the C API of spamc, it would be a nice feature to introduce.

I would like to start on the spamc part, but first I want to discuss the usage.
I think it would make sense to make reporting and learning combinable, so the
message has to be transferred to spamd only once. But I also want it to be
usable as a standalone feature.
Or should it be configurable only on the spamd side, so that spamd would decide
whether to report the message to Razor...? Then it could be user-dependent (SQL database).

What do you think?

Comment 5 Michael Parker 2005-05-05 08:52:26 UTC

Subject: Re:  RFE: add reporting support to spamd/spamc

> 
> I would like to start on the spamc part, but first I want to discuss the usage.
> I think it would make sense to make reporting and learning combinable, so the
> message has to be transferred to spamd only once. But I also want it to be
> usable as a standalone feature.
> Or should it be configurable only on the spamd side, so that spamd would decide
> whether to report the message to Razor...? Then it could be user-dependent (SQL database).

Please feel free to get going on the spamc part. Please note that I
had to backtrack on some of the spamc portions for learning to keep it
binary compatible, so please keep that in mind.

You'll want to support reporting and revoking. Unfortunately -r is
already taken as an option (is it time to move to long options for
spamc?), so do your best to come up with something sane.

No need to somehow combine the efforts, since reporting automatically
triggers a learn as spam and revoking automatically triggers a learn
as ham.

Let's not overthink the user-dependent stuff, and just implement it the
same way as we did with learn. If someone wants to add the ability to
turn things off later, then we'll add it then.

I don't think the spamd portion will be too hard, but it might be a
bit too ambitious for 3.1, but we'll see.

Michael

Comment 6 Michael Parker 2005-05-05 09:03:42 UTC

Subject: Re:  RFE: add reporting support to spamd/spamc

Also, the REPORT command for the spamd protocol is taken so we'll have
to come up with something else.

Comment 7 Nico Prenzel 2005-05-05 10:37:24 UTC

Hello Michael,

the spamc part is nearly complete!

The protocol is:
LISTSREPORT SPAMC/1.3
ListsReport-type: 0
User: testuser

ListsReport-type: 0 defines report message as spam
ListsReport-type: 1 defines revoke message (if ham)

What would be a suitable return (code) from spamd?

Have I missed anything?
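
For illustration, the request proposed in this comment could be assembled as in the minimal sketch below. It is not the code from the attached patches: the helper name build_listsreport_request is hypothetical, and the CRLF line endings, the Content-length header, and the empty line before the message are assumed to carry over from the existing spamd commands.

```python
REPORT = 0  # ListsReport-type: 0 -> report message as spam
REVOKE = 1  # ListsReport-type: 1 -> revoke message (it was ham)


def build_listsreport_request(message: bytes, user: str, report_type: int) -> bytes:
    """Assemble a LISTSREPORT request as proposed in this comment.

    Assumptions (not confirmed by the thread): header lines end with CRLF,
    a Content-length header gives the message size, and an empty line
    separates the headers from the message.
    """
    if report_type not in (REPORT, REVOKE):
        raise ValueError("report_type must be 0 (report) or 1 (revoke)")
    headers = [
        "LISTSREPORT SPAMC/1.3",
        f"ListsReport-type: {report_type}",
        f"User: {user}",
        f"Content-length: {len(message)}",
    ]
    head = "\r\n".join(headers) + "\r\n\r\n"
    return head.encode("ascii") + message


if __name__ == "__main__":
    demo = build_listsreport_request(b"Subject: test\r\n\r\nbody\r\n", "testuser", REPORT)
    print(demo.decode("ascii"))
```

Keeping the report/revoke choice in a single header value mirrors the proposal above, where one command covers both actions.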

Comment 8 Michael Parker 2005-05-05 22:14:37 UTC

Subject: Re:  RFE: add reporting support to spamd/spamc

> ------- Additional Comments From nico.prenzel@pn-systeme.de  2005-05-05 10:37 -------
> 
> the spamc part is nearly complete!
> 
> The protocol is:
> LISTSREPORT SPAMC/1.3
> ListsReport-type: 0
> User: testuser
> 
> ListsReport-type: 0 defines report message as spam
> ListsReport-type: 1 defines revoke message (if ham)
> 

To tell you the truth, I don't much like the name, but I honestly
can't think of anything better.  Otherwise it looks good.

> What would be a suitable return (code) from spamd?
> 

We could go simple and do something like:
Reported: Yes
Reported: No

or slightly more complicated:
Reported: Yes
Reported: No

Revoked: Yes
Revoked: No


The first line of the response would carry the standard return codes.
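
A client could check such a response roughly as follows. This is only a sketch under assumptions: the status line is taken to look like "SPAMD/1.1 0 EX_OK" as in other spamd responses, and both function names are hypothetical.

```python
def parse_report_response(raw: bytes):
    """Return (exit_code, headers) from a spamd-style response.

    Assumes a status line such as "SPAMD/1.1 0 EX_OK", followed by
    "Name: value" header lines and an empty line.
    """
    text = raw.decode("ascii", errors="replace")
    head = text.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    status = lines[0].split(" ", 2)
    if len(status) < 2 or not status[0].startswith("SPAMD/"):
        raise ValueError(f"unexpected status line: {lines[0]!r}")
    code = int(status[1])
    headers = {}
    for line in lines[1:]:
        if ":" in line:
            name, value = line.split(":", 1)
            headers[name.strip().lower()] = value.strip()
    return code, headers


def was_successful(raw: bytes, action: str = "reported") -> bool:
    """True if the exit code is 0 and the 'Reported'/'Revoked' header says Yes."""
    code, headers = parse_report_response(raw)
    return code == 0 and headers.get(action, "").lower() == "yes"
```

With the simple variant only the "reported" key would ever be looked up; the action argument exists so the same helper also covers the "Revoked:" variant.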

> Do I've missed something?

Did you pick out command line options?

Michael

Comment 9 Nico Prenzel 2005-05-06 03:26:56 UTC

(In reply to comment #5)
> Subject: Re:  RFE: add reporting support to spamd/spamc
> 
> > 
> > I would like to start on the spamc part, but first I want to discuss the usage.
> > I think it would make sense to make reporting and learning combinable, so the
> > message has to be transferred to spamd only once. But I also want it to be
> > usable as a standalone feature.
> > Or should it be configurable only on the spamd side, so that spamd would decide
> > whether to report the message to Razor...? Then it could be user-dependent (SQL database).
> 
> Please feel free to get going on the spamc part. Please note that I
> had to backtrack on some of the spamc portions for learning to keep it
> binary compatible, so please keep that in mind.
> 
> You'll want to support reporting and revoking. Unfortunately -r is
> already taken as an option (is it time to move to long options for
> spamc?), so do your best to come up with something sane.
> 
> No need to somehow combine the efforts, since reporting automatically
> triggers a learn as spam and revoking automatically triggers a learn
> as ham.
> 
> Let's not overthink the user-dependent stuff, and just implement it the
> same way as we did with learn. If someone wants to add the ability to
> turn things off later, then we'll add it then.
> 
> I don't think the spamd portion will be too hard, but it might be a
> bit too ambitious for 3.1, but we'll see.
> 
> Michael
> 

Having considered this again, I don't think it is a good idea to learn
automatically when a report/revoke is triggered:
- report/revoke would produce some (network) overhead if done every time a
message is learned (especially since no one will remember that report/revoke
happens whenever learning is done)
- no need for a separate spamc switch (-L already implemented)

I'm not happy about not thinking through the user part.
Could spamd please query the 'use_razor2' (and other) settings from
SQL/userpref and use them for reporting? I think it should!
Sorry, but I have a different opinion about that :-)

>We could go simple and do something like:
>Reported: Yes
>Reported: No

I vote for that. +1 :-)

>Did you pick out command line options?

Not the way you had in mind. I've simply added a switch (-W) to spamc! There
are also some other letters still free. Not perfect, but it works.

Comment 10 Michael Parker 2005-05-06 05:29:26 UTC

Subject: Re:  RFE: add reporting support to spamd/spamc

> ------- Additional Comments From nico.prenzel@pn-systeme.de  2005-05-06 03:26 -------
> 
> Having considered this again, I don't think it is a good idea to learn
> automatically when a report/revoke is triggered:
> - report/revoke would produce some (network) overhead if done every time a
> message is learned (especially since no one will remember that report/revoke
> happens whenever learning is done)
> - no need for a separate spamc switch (-L already implemented)

> I'm not happy about not thinking through the user part.
> Could spamd please query the 'use_razor2' (and other) settings from
> SQL/userpref and use them for reporting? I think it should!
> Sorry, but I have a different opinion about that :-)

The code will use all of the same mechanisms that it does now, so
there is nothing to think about here.  If -L is specified then it
won't be available, period.  If use_razor2 is set to 0 then we won't
report to or revoke from Razor, and so on and so forth.
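
The gating described here can be pictured with a small sketch. It is an illustration only, not spamd's actual code: the function and the REPORTER_PREFS table are hypothetical, -L is read as spamd's local-only mode, use_dcc and use_pyzor are added as examples of the "so on and so forth", and a missing preference is assumed to mean "disabled".

```python
# Hypothetical mapping from a preference name to the service it enables.
REPORTER_PREFS = {"use_razor2": "Razor", "use_dcc": "DCC", "use_pyzor": "Pyzor"}


def enabled_reporters(local_only: bool, prefs: dict) -> list:
    """Return the services a report/revoke would be sent to.

    local_only stands for -L: no network reporting at all.
    prefs holds the effective per-user settings (for example loaded from
    SQL or a user preferences file); a setting of 0 disables that service.
    """
    if local_only:
        return []
    return [service for pref, service in REPORTER_PREFS.items() if prefs.get(pref, 0)]
```

For example, enabled_reporters(False, {"use_razor2": 0, "use_dcc": 1}) yields only DCC, and any call with local_only=True yields an empty list.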

Learning either as spam for reporting or as ham for revoking is just
how that code works.

Like I said, I think you are overthinking.


> 
> >We could go simple and do something like:
> >Reported: Yes
> >Reported: No
> 
> I vote for that. +1 :-)
> 
> >Did you pick out command line options?
> 
> Not the way you had in mind. I've simply added a switch (-W) to spamc! There
> are also some other letters still free. Not perfect, but it works.

Works for me, it is easily changed if for some reason it doesn't work
out.

FYI, here is the output I just got from spamd:
[2329] dbg: markup: removing markup
[2329] dbg: plugin: Mail::SpamAssassin::Plugin::Razor2=HASH(0x8dadbd4) implements 'plugin_revoke'
[2329] dbg: info: entering helper-app run mode
[2329] dbg: info: leaving helper-app run mode
[2329] dbg: reporter: spam revoked from Razor
[2329] info: spamd: revoked ham message for parker:1000 in 0.7 seconds, 2569 bytes

Michael

Comment 11 Nico Prenzel 2005-05-06 06:48:19 UTC

Created attachment 2849 [details]
First throw of ListsReport for spamc part

Comment 12 Nico Prenzel 2005-05-06 06:54:53 UTC

>>Learning either as spam for reporting or as ham for revoking is just how that
>>code works.

Okay, for me learning always means learning into the Bayesian database. So I
think we've misunderstood each other. I thought you wanted to learn each
reported/revoked message into the Bayesian database automatically. If that
isn't your intention, all is well!

Sorry for the confusion, but asking a question should be legitimate.

Greetings NicoP.

Comment 13 Nico Prenzel 2005-05-06 07:00:54 UTC

Created attachment 2850 [details]
Second throw

forgot to implement exclusion!

Comment 14 Michael Parker 2005-05-06 07:13:05 UTC

Subject: Re:  RFE: add reporting support to spamd/spamc

> 
> ------- Additional Comments From nico.prenzel@pn-systeme.de  2005-05-06 06:54 -------
> >>Learning either as spam for reporting or as ham for revoking is just how that
> >>code works.
> 
> Okay, for me learning always means learning into the Bayesian database. So I
> think we've misunderstood each other. I thought you wanted to learn each
> reported/revoked message into the Bayesian database automatically. If that
> isn't your intention, all is well!
> 
> Sorry for the confusion, but asking a question should be legitimate.

I think we're having a communication problem.  Please check out the
options in spamassassin for -r,--report and -k,--revoke:

--report:
...
           The message will also be submitted to SpamAssassin's
           learning systems; currently this is the internal Bayesian
           statistical-filtering system (the BAYES rules).  (Note that
           if you only want to perform statistical learning, and do
           not want to report mail to third-parties, you should use
           the "sa-learn" command directly instead.)

and --revoke:
...
           The message will also be submitted as 'ham' (non-spam) to
           SpamAssassin's learning systems; currently this is the
           internal Bayesian statistical-filtering system (the BAYES
           rules).  (Note that if you only want to perform statistical
           learning, and do not want to report mail to third-parties,
           you should use the "sa-learn" command directly instead.)


So, report and revoke always learn, unless bayes_learn_during_report
is set to 0.
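
As a compact restatement of that rule, here is a minimal sketch. The function name is hypothetical, and it only models the decision, not the actual learning code.

```python
def learning_side_effect(action: str, bayes_learn_during_report: bool = True):
    """Return "spam", "ham", or None for the learning step tied to an action.

    report -> learn as spam, revoke -> learn as ham, unless
    bayes_learn_during_report is set to 0 (False here).
    """
    mapping = {"report": "spam", "revoke": "ham"}
    if action not in mapping:
        raise ValueError("action must be 'report' or 'revoke'")
    if not bayes_learn_during_report:
        return None
    return mapping[action]
```

Someone who wants reporting without Bayes learning would therefore set bayes_learn_during_report to 0 rather than expect a separate switch.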

Michael

Comment 15 Nico Prenzel 2005-05-06 07:34:31 UTC

Thanks Michael,

now I know why you want the learning combined with reporting. I have never used
the spamassassin command, so it wasn't clear to me that a
bayes_learn_during_report parameter exists.

OK, this requires no changes to the previously posted patch.

Once a patch for spamd is posted, I'll test the new functionality.

Until then, NicoP.

Comment 16 Michael Parker 2005-05-06 11:30:04 UTC

Created attachment 2851 [details]
Patch File

Here is a combined version of the patch.  You'll notice several things that
have changed:

1) The spamd protocol command is now COLLABREPORT (for collaborative filtering
databases).  Other things have changed to match: the header, the method name,
variables, etc.

2) A bunch of cleanup in the spamc C code.

3) There is a Perl-based client method for report and revoke.

4) There is the start of a test, but I don't have it the way I want it yet.

One problem is that even if you have -L turned on or all reporting methods
turned off, spamd will still return a Reported: Yes header to the client. This
isn't right, so I need to fix that.
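
The intended behaviour could be sketched like this. It is an illustration of the fix described, not the code in the patch: the function name and the shape of reporter_results are assumptions.

```python
def reported_header(local_only: bool, reporter_results: dict) -> str:
    """Build the response header for a report request.

    local_only stands for -L being turned on.
    reporter_results maps each reporting method that was actually tried to
    True (success) or False (failure); it is empty when all methods are off.
    """
    succeeded = (not local_only) and any(reporter_results.values())
    return "Reported: " + ("Yes" if succeeded else "No")
```

Under this reading, "Reported: Yes" requires at least one reporting method to have run and succeeded, so both problem cases above produce "Reported: No".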

However, in the interest of getting the code out there for eyeballs and the
fact that I might not be able to work on it again for a few days, here is my
latest patch.  Feel free to make improvements, and if you do be sure to
document where those improvements were so I can fold them into anything else
I'm working on.

Michael

Comment 17 Michael Parker 2005-05-06 11:31:55 UTC

I've been making heavy use of the learn-via-spamd feature. I'm sure this will
also come in handy, and since it's 90% of the way there it should be good for 3.1.

Comment 18 Michael Parker 2005-05-06 12:28:28 UTC

There are a couple of issues with the tests:

1) spamc doesn't return anything in the failure case (fixed in an upcoming
patch).

2) Reporter.pm defaults to everything being OK for report (hopefully fixed in
an upcoming patch).

3) The default init.pre doesn't have Razor turned on, so I need to come up with
a solution for this. I'm guessing this is also why the razor2.t tests have been
failing for me as of late.

Comment 19 Michael Parker 2005-05-06 13:38:25 UTC

Much better idea for the t/spamc_optC.t test: I'll just create a quick reporter
plugin, then turn everything else off and use only that plugin to test the
capabilities.  I'll have to do something fancy to test the failure case, but it
shouldn't be too hard.

Comment 20 Michael Parker 2005-05-07 08:54:34 UTC

*** Bug 4302 has been marked as a duplicate of this bug. ***

Comment 21 Michael Parker 2005-05-07 09:09:43 UTC

Committed revision 169089.