Bug 1948

Summary: spamd configuration from config file
Product: Spamassassin
Reporter: Kelsey Cummings <kgc>
Component: spamc/spamd
Assignee: SpamAssassin Developer Mailing List <dev>
Status: NEW
Severity: enhancement
CC: parkerm
Priority: P5
Version: unspecified
Target Milestone: Future
Hardware: Other
OS: other
Whiteboard: needs discussion
Attachments: Implements spamd config file functionality.
Implements spamd config file functionality.

Description Kelsey Cummings 2003-05-21 12:24:02 UTC
I think spamd would benefit from using a flat file config instead of cmdline
switches -- my startup is nearly 3 lines long as it is.

This would also allow for some more complex configuration options and syntax
that could be helpful.

Some of these could include code for hashed user prefs dir lookup (since
different sites may want different hashing -- expect a patch to support this
some time in the future.)

I'm assuming this will be a 3.0 milestone item.
Comment 1 Malte S. Stretz 2003-07-08 07:18:19 UTC
I don't understand the hashing part, but I've thought about the command line
thingy before (because that looong command line in ps annoys me).

A possible solution (workaround) I was thinking of, which doesn't introduce Yet
Another Config File, was to introduce an environment var SPAMD_ARGS whose
contents are prepended to @ARGV.
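
A minimal sketch of the SPAMD_ARGS idea, written in Python rather than spamd's Perl purely for illustration. The function name `effective_args` and the use of shell-style splitting are assumptions, not anything taken from spamd itself.

```python
import os
import shlex


def effective_args(argv, environ=None, var="SPAMD_ARGS"):
    """Return argv with the split contents of the environment variable prepended."""
    environ = os.environ if environ is None else environ
    # shlex.split honours shell-style quoting, so a quoted path with spaces
    # stays a single argument. An unset variable contributes nothing.
    env_args = shlex.split(environ.get(var, ""))
    return env_args + list(argv)


if __name__ == "__main__":
    print(effective_args(["-d"], {"SPAMD_ARGS": "--max-children 5"}))
```

Because the environment contents come first, a switch repeated on the real command line appears later in the list, so an option parser that lets the last occurrence win would give the command line precedence.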
Comment 2 Kelsey Cummings 2003-07-08 10:11:54 UTC
Personally I dislike using ENV for configuration; the configuration in this case
is out of local.cf so it's not a new config file.  Just extensions of a current one.

What part of the hashing don't you understand?  The basic idea is to avoid a
directory with so many files that lookups get slowed.  The 'a/b/ab' hashing or
leafing is a simple way to do that but it doesn't assure even distribution.
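
To make the 'a/b/ab' layout concrete, here is a rough sketch in Python (for illustration only; spamd is Perl). Both function names, the padding character for one-letter user names, and the digest-based variant are assumptions. The second function is just one possible way to address the uneven distribution mentioned above, not something proposed in this bug.

```python
import hashlib
from pathlib import PurePosixPath


def leaf_path(base, user):
    """'a/b/ab' style: the first two characters of the name become directory levels."""
    # Pad short names so there are always two levels (padding char is an assumption).
    padded = (user + "__")[:2]
    return PurePosixPath(base, padded[0], padded[1], user)


def digest_path(base, user):
    """Same two-level layout, but the levels come from a digest of the name,
    which spreads users more evenly than their leading letters do."""
    digest = hashlib.sha256(user.encode("utf-8")).hexdigest()
    return PurePosixPath(base, digest[0], digest[1], user)


if __name__ == "__main__":
    print(leaf_path("/var/prefs", "ab"))      # /var/prefs/a/b/ab
    print(digest_path("/var/prefs", "ab"))
```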

Comment 3 Justin Mason 2003-07-08 12:38:23 UTC
yeah, so there's 2 issues.

1. loading config settings from config file.  strongly agree here -- we have too
many weird distinctions between setting stuff on the command line and setting
stuff in the config file.  we should move to mostly using the config file
for spamd config as well as SA config.

2. hashing user dirs.   This is a good idea.  However, adding *yet another*
user-prefs schema really ties into the concept of a more generalised API for
user prefs storage, which is very relevant to bug 579 -- anomie's patch really
needs to go into 2.70 IMO.   Otherwise we'll keep having more and more variants
on virtual user dirs etc.
Comment 4 Kelsey Cummings 2003-07-08 12:58:19 UTC
Subject: Re:  spamd configuration

On Tue, Jul 08, 2003 at 12:41:03PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> ------- Additional Comments From jm@jmason.org  2003-07-08 12:38 -------
> yeah, so there's 2 issues.
> 
> 1. loading config settings from config file.  strongly agree here -- we have too
> many weird distinctions between setting stuff on the command line and setting
> stuff in the config file.  we should move to mostly using the config file
> for spamd config as well as SA config.

Couldn't agree more.  :)

> 2. hashing user dirs.   This is a good idea.  However, adding *yet another*
> user-prefs schema really ties into the concept of a more generalised API for
> user prefs storage, which is very relevant to bug 579 -- anomie's patch really
> needs to go into 2.70 IMO.   Otherwise we'll keep having more and more variants
> on virtual user dirs etc.

This is kind of a show stopper for us right now; we've got some resources to
work on it too, so we can certainly help.  The patches as they are work for us,
and we tried to push them into the modules as cleanly as possible.  In regards
to the other patches, having spamc, say, read configs from SQL _is not_ more
secure than having spamd do so, since it would open up the potential for
users to view (and possibly modify) other people's scores.  But I digress.
Nathan and I are really looking forward to an IRC session to discuss this
so we can move forward with our updates here in unison with the rest of the
project.

Comment 5 Kelsey Cummings 2004-04-26 22:17:08 UTC
It seems like it might be wise to make this change to spamd's configuration
method for 3.0 instead of for a later minor release.  Or put it off altogether
for a while.

If you guys would like to push this back up to 3.0, I'll write the patches to 
load the config via a cf file rather than via the cmdline options.
Comment 6 Justin Mason 2004-04-26 23:56:40 UTC
yeah, if we can get patches for this, that'd be great -- probably best to think
up a brief design first to make sure we agree it's the way to go, BTW.

(even something as simple as "/etc/mail/spamassassin/spamd.args is read and
parsed for command line arguments, one per line" would probably be fine by me I
think. but getting more complex -- such as reading the rules files normally in
order to get spamd config -- could be hairy, since some of the spamd config items
normally set are the ones that control where config files are loaded from.)
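
For illustration, the "one argument per line" file could be handled roughly like this. The sketch is in Python, not spamd's Perl; the function names are hypothetical, and skipping blank lines and `#` comments is an assumption that the one-line description above does not spell out.

```python
def parse_args_lines(lines):
    """Turn the lines of an args file into a list of command-line arguments."""
    args = []
    for raw in lines:
        line = raw.strip()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not line or line.startswith("#"):
            continue
        args.append(line)
    return args


def read_args_file(path="/etc/mail/spamassassin/spamd.args"):
    """Read the file once and return its arguments; a missing file yields none."""
    try:
        with open(path, encoding="utf-8") as handle:
            return parse_args_lines(handle)
    except FileNotFoundError:
        return []


if __name__ == "__main__":
    print(parse_args_lines(["--nouser-config\n", "# who to run as\n", "--username=scanner\n"]))
```

The returned list can simply be placed in front of the real command-line arguments, so no second option syntax is needed.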

regarding point #2, BTW ("hashing user dirs") -- I think now that we have
plugins, that'd be the best way to implement it.  however the plugin API is not
yet called when *reading* the config, so new hooks to do that would be necessary
(although not hard I don't think).

IMO, both of these can get into 3.0.0 if the patches are sensible and clean.
Comment 7 Kelsey Cummings 2004-04-27 00:13:43 UTC
I think we can leave the hashed user dirs alone for now - it's not relevant for
us if we go to SQL bayes dbs anyway, which looks like the way for large high
volume sites to go.  It may be useful for some smaller installations that are
looking for some performance enhancements or the ability to spread user dbs
across multiple mount points and/or not store the dbs in a user's homedir.

I was thinking of a config file formatted along the lines of

ListenAddress 0.0.0.0
#Port 783
DefaultConfigPath {insert at install}
SiteConfigPath {insert at install}
#EnableAutoWhiteList 
CreateUserPrefs
#Paranoia
StartProcessors
MinProcessors
MaxProcessors
...



Comment 8 Justin Mason 2004-04-27 00:23:38 UTC
Subject: Re:  spamd configuration from config file 

>I think we can leave the hashed user dirs alone for now - it's not
>relevant for us if we go to SQL bayes dbs anyway, which looks like the
>way for large high volume sites to go.  It may be useful for some smaller
>installations that are looking for some performance enhancements or the
>ability to spread user dbs across multiple mount points and/or not store
>the dbs in a user's homedir.

Yeah, agreed I think. don't worry about that then ;)

>I was thinking of a config file formatted along the lines of
>
>ListenAddress 0.0.0.0
>#Port 783
>DefaultConfigPath {insert at install}
>SiteConfigPath {insert at install}
>#EnableAutoWhiteList 
>CreateUserPrefs
>#Paranoia
>StartProcessors
>MinProcessors
>MaxProcessors
>...

well, I'd suggest just using the *existing* format, e.g.

    --nouser-config
    --auth-ident
    --username=scanner
    ....

That avoids each possible configuration item having 1 representation on
the command line and a different one in the config file, which (in my
experience) causes pain and repeated RTFMing ;)

either way, the "standard" format is "lowercase-lowercase-lowercase"
rather than "BouncyCaps" for the SpamAssassin config settings; it'd
be better to keep that consistent.

--j.

Comment 9 Kelsey Cummings 2004-04-27 00:32:08 UTC
Subject: Re:  spamd configuration from config file

> >I was thinking of a config file formatted along the lines of
> >
> >ListenAddress 0.0.0.0
> >#Port 783
> >DefaultConfigPath {insert at install}
> >SiteConfigPath {insert at install}
> >#EnableAutoWhiteList 
> >CreateUserPrefs
> >#Paranoia
> >StartProcessors
> >MinProcessors
> >MaxProcessors
> >...
> 
> well, I'd suggest just using the *existing* format, e.g.
> 
>     --nouser-config
>     --auth-ident
>     --username=scanner
>     ....
> 
> That avoids each possible configuration item having 1 representation on
> the command line and a different one in the config file, which (in my
> experience) causes pain and repeated RTFMing ;)

I had originally thought that most of the cmdline options could be removed,
just leaving the stuff that people need to play with - like debug, and of
course, an optional path to the config file itself.

I'm not sure if forcing everyone to use a config file is a good idea, but
on the other hand, I don't think there is any other server process that I
run that takes all of its options on the command line.

> either way, the "standard" format is "lowercase-lowercase-lowercase"
> rather than "BouncyCaps" for the SpamAssassin config settings; it'd
> be better to keep that consistent.

Okay.  I was thinking of Apache and other common servers that most admins
are comfortable with.

Comment 10 Theo Van Dinter 2004-04-27 08:35:43 UTC
Subject: Re:  spamd configuration from config file

On Tue, Apr 27, 2004 at 12:23:39AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> either way, the "standard" format is "lowercase-lowercase-lowercase"
> rather than "BouncyCaps" for the SpamAssassin config settings; it'd
> be better to keep that consistent.

fyi, we do lc() on the key in Conf.pm, so people could do Required_Score
if they really wanted to (some keys are required to be lowercase, but
most aren't).

Comment 11 Kelsey Cummings 2004-04-27 10:06:59 UTC
Subject: Re:  spamd configuration from config file

Does anyone have any comments regarding the removal of the bulk of the
cmdline arguments in favor of requiring use of a configuration file?

Comment 12 Theo Van Dinter 2004-04-27 10:16:13 UTC
Subject: Re:  spamd configuration from config file

On Tue, Apr 27, 2004 at 10:07:00AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> Does anyone have any comments regarding the removal of the bulk of the
> cmdline arguments in favor of requiring use of a configuration file?

Well, I'm sorta 50/50 on it.  Besides it being somewhat icky for some
configurations, I don't see a reason to shift to a config file.  OTOH,
I agree that every other daemon I can think of uses a config file for
everything except things like debugging, etc.

Then again, you can usually override the config file via the commandline
anyway.  Think of sendmail, postfix, ssh, etc, ie:

% cat .ssh/config
[...]
Host someplace.com
  protocol 2
[...]
% ssh -o 'protocol 1' someplace.com


So I would say -- keep the commandline options, but allow something like
"-f config file" to also deal with the options.  That way you can use the
config file most of the time, override with the commandline as necessary,
and also leave backwards compat. in place.
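
A sketch of that precedence rule (config file most of the time, command line wins), using Python's argparse for illustration only. The `-f` switch is the one suggested above; the two other switches and the `read_file` callback are illustrative assumptions rather than spamd's actual option handling.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="spamd-sketch")
    parser.add_argument("-f", dest="config_file")            # proposed in this thread
    parser.add_argument("--max-children", type=int, default=5)
    parser.add_argument("--username")
    return parser


def resolve_options(argv, read_file):
    """Merge options with precedence: built-in defaults < config file < command line."""
    parser = build_parser()
    # First pass only finds out whether a config file was named.
    first = parser.parse_args(argv)
    file_args = read_file(first.config_file) if first.config_file else []
    # File arguments go first; argparse keeps the last value it sees for an
    # option, so anything repeated on the command line overrides the file.
    return parser.parse_args(list(file_args) + list(argv))


if __name__ == "__main__":
    fake_file = lambda path: ["--max-children=10", "--username=scanner"]
    print(resolve_options(["-f", "spamd.args", "--max-children=2"], fake_file))
```

Existing boot scripts that pass everything on the command line keep working unchanged, since without `-f` the file step contributes nothing.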

Comment 13 Michael Parker 2004-04-27 10:30:58 UTC
Subject: Re:  spamd configuration from config file

My $.02

We should keep the command line options and they should override
anything given in the config file.

Provide a '-f configfile' option to specify the config file on the
command line.

My personal preference for config file format would be something that
already has a parser available on CPAN and we just use that.  However,
I understand the dislike for adding yet another dependency, so maybe
if we can just be compatible with something (ie Config::Tiny or
Config::ApacheFormat).

Michael

Comment 14 Justin Mason 2004-04-27 10:54:14 UTC
Subject: Re:  spamd configuration from config file 

>Does anyone have any comments regarding the removal of the bulk of the
>cmdline arguments in favor of requiring use of a configuration file?

I'm -1 on that suggestion.  It'll wreak havoc for boot scripts,
distro packagers, and all sorts of users of spamd.

Of course, we can move the parsing code into a config-file parser
thingy, and just move the existing cmdline parsing into there; then
delegate any unsupported cmdline switches and pass them on into
the config-file parser.

That way, we've (a) cleaned up the code and (b) not broken backwards
compatibility (as long as the switch names remain the same).

--j.

Comment 15 Kelsey Cummings 2004-04-27 11:18:53 UTC
Subject: Re:  spamd configuration from config file

On Tue, Apr 27, 2004 at 10:54:15AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> ------- Additional Comments From jm@jmason.org  2004-04-27 10:54 -------
> >Does anyone have any comments regarding the removal of the bulk of the
> >cmdline arguments in favor of requiring use of a configuration file?
> 
> I'm -1 on that suggestion.  It'll wreak havoc for boot scripts,
> distro packagers, and all sorts of users of spamd.
> 
> Of course, we can move the parsing code into a config-file parser
> thingy, and just move the existing cmdline parsing into there; then
> delegate any unsupported cmdline switches and pass them on into
> the config-file parser.
> 
> That way, we've (a) cleaned up the code and (b) not broken backwards
> compatibility (as long as the switch names remain the same).

Justin, it sounds like you might have some method in mind here.  I'm happy
to give it a shot, but if you'd like to give me some sort of spec to work
off of, I'm sure my diffs will be better received.

Comment 16 Daryl C. W. O'Shea 2004-10-23 07:36:28 UTC
Created attachment 2479 [details]
Implements spamd config file functionality.

Enables spamd to load configuration from a file if Getopt::ArgvFile is
available.

Defaults to using /$LOCAL_RULE_DIR/spamd.args if present.
Will also ADD config options found in a spamd.args file in the calling user's
home directory.

Will accept configuration file on command line using @/path/filename.  Will not
use the previous two config files in this case, although they could also be
specified on the command line at the same time, and thus also used.

-k prevents configuration files from being used.  All the C's were taken... k
is for 'klear' configuration.

All existing command line options can still be used on the command line.


--dos
Comment 17 Daryl C. W. O'Shea 2004-10-23 08:25:35 UTC
Created attachment 2480 [details]
Implements spamd config file functionality.

Added line to check version of Getopt::ArgvFile.  1.08 or higher is needed.
Comment 18 Duncan Findlay 2004-11-30 20:48:23 UTC
Would someone care to review this? Also, do we need/have a CLA on file?
Comment 19 Daryl C. W. O'Shea 2004-11-30 20:54:45 UTC
Subject: Re:  spamd configuration from config file

I submitted a CLA on Monday via fax.

There's one possible issue with this patch... the config file is re-read 
every time a child is spawned.  This could be fixed if there's interest 
in using the code.

Changing the patch so that it doesn't _require_ Getopt::ArgvFile, and
instead just works (reading a config file) _if_ the module is available,
would probably also be a good idea.

By the way, I've been using the patch since I wrote it with no problems.

Comment 20 Daniel Quinlan 2004-11-30 20:59:09 UTC
-1 adds module dependency, uses different configuration system

I'd prefer to see integration of options between SpamAssassin internals and
spamd.  Having two formats and locations doesn't make sense if we can avoid
it.

spamd_option_name setting
spamd_another_option setting

Using our Conf module would be my leaning.
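
A rough sketch of how prefixed settings could be picked out of an ordinary SpamAssassin-style config file. It is written in Python for illustration; the real work would live in the Perl Conf module. The function name and the option names in the example are hypothetical, and only the `spamd_` prefix convention comes from the lines above.

```python
def spamd_settings(lines, prefix="spamd_"):
    """Collect 'spamd_<name> <value>' lines into a dict, ignoring everything else."""
    settings = {}
    for raw in lines:
        # Drop trailing '#' comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        key = parts[0].lower()  # keys are treated case-insensitively
        if key.startswith(prefix):
            value = parts[1].strip() if len(parts) > 1 else ""
            settings[key[len(prefix):]] = value
    return settings


if __name__ == "__main__":
    example = [
        "required_score 5.0",
        "spamd_max_children 10   # hypothetical option name",
        "spamd_username scanner",
    ]
    print(spamd_settings(example))
```

Lines without the prefix are left for the normal rule and score handling, so one file and one format can serve both spamd and the SpamAssassin internals.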
Comment 21 Daryl C. W. O'Shea 2004-11-30 21:00:43 UTC
Subject: Re:  spamd configuration from config file

Oops, I fixed the issue with requiring Getopt::ArgvFile in the second 
patch.  So it's just the issue of reloading the argument file each time 
a child is spawned.

Daryl

Comment 22 Daryl C. W. O'Shea 2004-11-30 21:12:45 UTC
Subject: Re:  spamd configuration from config file

 From quinlan@pathname.com  2004-11-30 20:59 -------
> I'd prefer to see integration of options between SpamAssassin internals and
> spamd.  Having two formats and locations doesn't make sense if we can avoid
> it.

I just used the format in the patch since you use the exact same format 
in the config file as you would on the command line (or in an 
/etc/sysconfig/spamassassin file, as included with some/many distros).

It could be implemented without the dependency by just pushing 
non-comment lines from the config file on the @ARGV array.


Daryl

Comment 23 Malte S. Stretz 2004-12-01 11:04:21 UTC
When I last thought about this problem, I had the idea that we could just use
the same config files as always, but write a pseudo-plugin for spamd, so you
could put something like this into your /etc/spamassassin/local.cf:
 
ifplugin Daemon 
  socket_path /var/run/spamd.sock 
endplugin 
 
Comment 24 Justin Mason 2004-12-01 11:28:34 UTC
Subject: Re:  spamd configuration from config file 

I'd prefer not to use the @file and -k semantics, and to avoid the
Getopt::ArgvFile dependency.

Michael's suggestion:

  Provide a '-f configfile' option to specify the config file on the
  command line.

gets my +1.

--j.

Comment 25 Justin Mason 2004-12-01 11:30:07 UTC
Subject: Re:  spamd configuration from config file 

'I just used the format in the patch since you use the exact same format 
in the config file as you would on the command line (or in an 
/etc/sysconfig/spamassassin file, as included with some/many distros).'

BTW, I strongly agree with this.

--j.

Comment 26 Daniel Quinlan 2005-03-30 01:08:45 UTC
move bug to Future milestone (previously set to Future -- I hope)
Comment 27 Fred T 2007-01-05 15:01:54 UTC
Can this possibly get added into 3.2 or 3.3?
Thanks!!!
Comment 28 Justin Mason 2007-01-15 10:16:20 UTC
I'm still not a fan of the patch (due to the additional module requirement)
Comment 29 Theo Van Dinter 2007-01-15 12:32:02 UTC
yeah, I don't like the patch either ...  it feels like something we could just
reuse M::SA::Conf and Conf::Parser for.  Do key/value, etc.