SA Bugzilla – Bug 579
RFE: user configs should be read by spamc and sent to spamd
Last modified: 2010-01-27 03:16:21 UTC
For security and ease of configuration, I think it would be better if user configuration files were read by spamc and sent to spamd instead of being read by spamd as they are now. Has anyone raised this question yet? What is the general feeling?
I would be all for this idea; having spamc send the user_prefs file to spamd would be great. The current way of having spamd find the per-user config info assumes that the machine running spamd has easy access to user information and the user's home space, and that it can easily get to that home space, pick up the config, and read it. That's not the case on large sites where many different machines comprise the mail farm. Ideally I'd like to run spamd on a machine all by itself without needing NFS to get to home spaces or MySQL, without it having to know anything about users, etc. If spamc just sent along the user_prefs file first, then spamd would be self-contained; I could use it from any machine on our network where users are getting mail without worrying about how I'm going to get those user prefs over to the spamd machine. This would be a great improvement. I would have made these changes already myself, but I'm reluctant to do local hacking as it prevents easy upgrades. :-( If I made these changes to spamd/spamc, and did it such that it was an option in the protocol and thus also backwards compatible, could it be included? Or would the people who wrote these be willing to do it? BTW, someone on the devel list wrote this:
> Every change in the user preferences will need a new revision of the spamc/spamd protocol.
But I don't think that's the case. I would just have spamc send along the whole user_prefs file as is. Then let spamd parse it normally just as if it had gone out and picked it up from the user's home directory itself. The only thing we'd be changing is how spamd got the user_prefs file. Right now it goes out and gets it itself, but why not just send it along in the first place?
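To make the "send the file as is, as an optional part of the protocol" idea concrete, here is a minimal sketch in Python. It is not the real spamc/spamd protocol: the `SKETCH/0.1` request line, the `Prefs-length` header, and both function names are assumptions made up for illustration. The only point is that the prefs travel as an opaque blob and that a request without the extra header still parses.

```python
# Hypothetical framing, NOT the actual spamc/spamd wire format.
# The prefs are shipped untouched, so changing a preference never
# requires a protocol change; only the optional length header is new.

def build_request(message: bytes, prefs_text: bytes = b"") -> bytes:
    """Client side: prepend the raw user_prefs bytes to the message."""
    lines = [b"PROCESS SKETCH/0.1"]
    if prefs_text:
        lines.append(b"Prefs-length: " + str(len(prefs_text)).encode("ascii"))
    lines.append(b"Content-length: " + str(len(message)).encode("ascii"))
    return b"\r\n".join(lines) + b"\r\n\r\n" + prefs_text + message


def split_request(raw: bytes):
    """Server side: return (prefs_text, message) from a framed request."""
    head, _, body = raw.partition(b"\r\n\r\n")
    fields = {}
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        fields[name.strip().lower()] = int(value.strip().decode("ascii"))
    # A client that sends no Prefs-length header is treated as "no prefs".
    prefs_len = fields.get(b"prefs-length", 0)
    return body[:prefs_len], body[prefs_len:]
```

The server would then hand the returned prefs text to its normal user_prefs parser, exactly as if it had read the file from a home directory.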
Discussed fairly extensively on the list. If someone implements it, feel free to add the patch here. Highly unlikely to get included in the standard distro because I think the architecture is just entirely wrong. But others might find your patch useful.
Craig -- I really disagree. I think there are no security implications here, as we read user_prefs files in a paranoid way anyway (rules cannot be defined from them etc.) Reopening for later arguing once 2.40 is released ;)
We have implemented this, as promised, and you should expect to see patches from us within a week or two.
I don't see it as a security thing Justin. I see it as an architectural issue. Clients just don't send servers config information, it's wack. The server should know when it needs to get config info, and should have some way of getting it. If it's some weird callback-to-client thing, that's fine. If it's fetch-config-by-http so that you don't have to install a few KB worth of database client libs on your mail server and prefer to install a few KB of HTTP client libs, then fine. But ultimately, anything like this should be implemented in the style of ConfSourceSQL.pm -- that way spamd can be smart about loading prefs, maybe cache them, maybe pass them over the network compressed or encrypted (easy option with some DBs), selectively fetch them, etc, etc. This whole notion of passing the config files just really seems badly wrong to me.
I'll add to this again. The more I thought about the current architecture of spamc/spamd, the more I think the current setup is wacky, and this change would make it much more normal as far as client/server relationships go. Thinking about it as sending 'config' info to the server is probably why it still sounds strange to some. But really, the user_prefs info is not telling spamd how to 'configure' itself, but simply telling it what to do. If you fill out a form on a web page, you naturally send the contents of that form to the httpd server and tell it to do something with the information you send it. That is not configuration information. You don't expect the httpd server to connect back to you somehow and get all the information it needs. If you did expect that then the server would require all sorts of things like a way to connect to you (a server on your machine), authentication information, which user you are, where your information is kept, and a secure way to get at it. Isn't that nutty? :-) But that's what spamc/spamd does. Why not just tell the server what you want it to do in the first place? Why require all that extra overhead, IO, complications, and security risks? Seems to me the initial design was thought out only on a local scale, assuming that spamd would be running on the same machine as spamc, and the rest was an afterthought. For example, I have to specify every possible other IP I want to connect to spamd in the startup, with no wildcards or hostnames or anything like that possible. What if I have lots of machines whose IPs change? (And I have this problem too.) It's also the current design that creates all the security risks. Changing as suggested here would remove them **all**. Once you ask the server to stop trying to get at user information, whether on the same machine or (worse) on some other machine, all the security problems disappear.
That is, since the server is not dependent on knowledge of a user base, it would not have to know anything about users or be in any way connected to them: no need to be root, no need to make config files readable to the spamd user (root or not), no chance of users reading or changing other users' settings, etc. Not to mention the security risks involved in letting spamd connect back to you and retrieve arbitrary user information. That would make these servers completely independent, allowing you to set up a farm of them to serve whomever you want without all the currently necessary mess of trying to figure out user config information; you could even out-source such servers since they are user-independent. And then we could also allow spamd to read user-created checks from each user_conf, removing that annoying limitation as well. It's hard to believe that larger ISPs haven't already suggested this sort of architecture. Is this really the first ticket on this? Or do they come and see this problem and immediately leave the idea behind for something that scales better and is more secure? I would like to use spamd/spamc instead of home-growing everything, but these problems are considerable drawbacks. Maybe I represent more people who didn't even bother to stick around and contribute? There's also the argument of keeping spamc lightweight. To this I say that the change would be a couple of lines of code, and the function could be invoked with a command line flag (so as to not burden those who don't want this). That's not much more, it seems, considering the considerable benefits reaped. And it would actually decrease the total IO involved. spamc would just send along everything needed at once, removing all the IO overhead involved in asking the server to connect back to find the information that spamc could have sent along in the first place. Maybe I'm missing something here, but the evidence sure seems compelling to me.
Isn't it time for this to expand its functionality for the times? please? :-)
Subject: Re: [SAdev] user configs should be read by spamc and sent to spamd
> If you fill out a form on a web page, you naturally send the contents of that form to the httpd server and tell it to do something with the information you send it. That is not configuration information. You don't expect the httpd server to connect back to you somehow and get all the information it needs.
Bad analogy. With web servers you tend to use sessions and/or user authentication. The config is still all stored on the web server, not on the client end. I'm with Craig on this one. Matt.
The analogy is quite good in this case; spamd throws everything away which is not privileged. So it's not really a "config" anymore because it doesn't change spamd's core behaviour but only the way it displays its results. If you submit your webpage to validator.w3.org, the config is _not_ saved on the server. Having a server which polls the config from the client per some Conf:: handler is IMO braindead (don't take this personally, ok?). Let's assume you've got a company with ~100 workstations, most of those roadwarriors with their own laptop connected via VPN (I administer such a company). Employees are coming and going. Now you set up a company-"public" spamd server and everybody who wants may connect to this server. Shall I tell everybody "Hey, you've got to set up your own Apache/NFS/whatever if you want to customize the way SpamAssassin works. Oh and please don't forget to update it regularly and secure it very well when you're out there."? I'm with Joshua on this one :o)
Subject: Re: user configs should be read by spamc and sent to spamd There is no perfect analogy, but the idea is the same. It still makes no sense to make the spamd server connect back to the client for information. Call it whatever you want, it still makes no sense and is a real limitation in the design, not to mention the other problems it causes as I mentioned before. How can I entice those on the other side of the fence to engage in discussion here? :-) I haven't been able to get anyone to dispute any of the arguments I've given for fixing these design problems, and it would be nice to have something to tell my boss. :-> Any help appreciated! Thanks.
I can see the point, too. IMO, user_prefs is *not* SA configuration -- it's more like browser configuration, it's just the required_hits threshold, maybe some whitelisted addrs. So it's not really configuration, just *user preferences*. I would think a good analogy is the font setting in your web browser -- this is a user preference. Or the validator.w3.org example. Or cookies -- contrary to what Matt posted, not all sites store user configs in a db; there are a few that let the browser store a few simple settings in the cookie. Basically, I don't think there's a need for spamd to require that all user prefs info is kept on the spamd server machine, or loaded from SQL or over NFS. *Let* the client send it, if that fits into the sysadmin's network design more cleanly.
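As a rough illustration of why client-supplied prefs need not be dangerous, here is a sketch of "paranoid" reading in Python. It is not SpamAssassin's actual parser: the allow-list below is an assumption chosen for the example (the thread only mentions the required_hits threshold and whitelist/blacklist addresses), and `filter_prefs` is a hypothetical name.

```python
# Illustrative allow-list only; the real set of directives a user may
# set is defined by SpamAssassin itself, not by this sketch.
ALLOWED_USER_DIRECTIVES = {"required_hits", "whitelist_from", "blacklist_from"}


def filter_prefs(text: str):
    """Return (directive, value) pairs that a user is allowed to set.

    Anything else, including attempts to define new rules, is dropped.
    """
    accepted = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # strip comments
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] in ALLOWED_USER_DIRECTIVES:
            accepted.append((parts[0], parts[1]))
    return accepted
```

With a filter of this kind in front of the parser, it matters much less whether the text came from a home directory, SQL, or the client.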
Subject: Re: [SAdev] user configs should be read by spamc and sent to spamd I'd be curious to know if these problems just plain don't exist if you use PPerl instead of spamc/spamd for persistence. Get PPerl from CPAN. Matt.
Created attachment 283: This patch implements the proposed change.
Created attachment 284: This is documentation for the patch.
The patch was written by Brian Marcotte of Panix. He has joined the mailing list in order to be available to answer questions. In addition to the brief documentation provided, he asked me to mention that the changes to the server are very small. (I count 12 lines.)
So, anyone?
So, can we get some movement on this? Is there a committer in the house?
Any reason for -j and -n?
The -n flag is for the server. This option tells the server to accept user preferences from the client. This flag isn't strictly necessary, but not including it would mean that spamd would always accept preferences from spamc. We thought some admins wouldn't like that. The -j flag is for the client. This option allows you to specify the path of the user preferences file you want sent to the server. I could have done it so that the option always sends ~/.spamassassin/user_prefs, but I figured that some people may want the option of using alternate user prefs files.
Sorry, if you meant why I chose the particular letters "n" and "j", well, "all the good ones were taken". The "n" was for "network" as in accept prefs from the network. Feel free to change them or suggest something else.
Subject: Re: [SAdev] user configs should be read by spamc and sent to spamd
> Sorry, if you meant why I chose the particular letters "n" and "j", well, "all the good ones were taken". The "n" was for "network" as in accept prefs from the network.
That was my question. I'd recommend this be a --longopt only, as it doesn't seem (to me at least) that this'll be a heavily used option. Furthermore, that's why we implemented longopts.
For some unknown reason I decided to hack on this. I didn't really like the Panix patch, since it sends the entire prefs file over the network for every single email (among other reasons). So I set about writing code so that the prefs could be cached, and so spamd could reject the offer if the length and checksum match what it has cached. In the process, I realized there's not much difference between "read from spamc and store to X" and "store to X". And the proliferation of "read from Y" was getting on my nerves as well. So why not create an interface "read/write to X", so the only difference between "read/write to a directory", "read/write to ~/.spamassassin/user_prefs", "read/write to SQL", and so on is the backing store that's being used. So, I wrote ConfSourceGeneric.pm to define an interface for reading from an arbitrary source. I also wrote ConfStoreGeneric.pm to extend that interface for writing to the source. And I wrote a number of modules implementing these interfaces:
ConfStoreDirectory.pm - Store user prefs in a directory, like the spamd -V/--virtual-config option.
ConfStoreHomedir.pm - Store user prefs in the user's home directory, like the current default behavior.
ConfStoreSQL.pm - Store user prefs in an SQL database. It improves on the current ConfSourceSQL by imposing an order on the directives (since no particular order is guaranteed by the SELECT), and by allowing saving to the DB as well as reading from it.
ConfStoreVPopmail.pm - Much like the current spamd -v/--vpopmail option. I haven't been able to really test this one, but it's basically ConfStoreHomedir that uses vpopinfo instead of getpwent so it *should* work...
ConfStoreMemory.pm - Store user prefs in memory. It even handles communication from child processes, so it should work with spamd.
ConfStoreNull.pm - Dummy module, which doesn't actually store anything.
ConfSourceSpamc.pm - Has a method to handle a simple protocol for spamc to send the user prefs to spamd, and then store these prefs into any of the ConfStoreGeneric subclasses. It implements the Source interface so these prefs can be read back easily. I also have a patch for libspamc that will support this. The idea is that spamd will recognize OFFER_PREFS like it does PROCESS or CHECK now, and hand the socket to this module to do the actual reading of the prefs (if necessary). The ConfSourceGeneric interface and the protocol are designed so that the prefs don't need to be sent across the wire every time.
ConfStoreSimple.pm - Stores a single pref set at a time. Only really useful as the store for ConfSourceSpamc, when you want the prefs sent over the wire every time.
ConfSourceAlt.pm - Reads prefs from the first of several possible sources. This way, we could do e.g. "look in $HOME first, then the virtual directory if that fails".
ConfSourceCat.pm - Reads prefs from all of several possible sources.
I've done enough unit testing on these that I doubt they have too many bugs left. I'm not sure what would be the best way to integrate these in with spamd or spamassassin, though, since most of the current config-reading methods aren't necessary anymore. Do we want to maintain backward compatibility, or do we want to eliminate the useless methods? Anyway, I'll attach the patches to this bug.
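The shape of that design can be sketched in a few lines. The following is a Python illustration only, not a port of the attached Perl modules: the class and function names are hypothetical stand-ins loosely modelled on the descriptions above, and the use of CRC-32 via `zlib` is an assumption (the comment does not say which checksum the patch uses).

```python
import zlib
from typing import Dict, Iterable, Optional


class PrefsSource:
    """Read-only interface (in the spirit of ConfSourceGeneric)."""

    def read(self, user: str) -> Optional[str]:
        raise NotImplementedError


class PrefsStore(PrefsSource):
    """Adds writing (in the spirit of ConfStoreGeneric)."""

    def write(self, user: str, text: str) -> None:
        raise NotImplementedError


class MemoryStore(PrefsStore):
    """Keeps prefs in a dict; one possible backing store among many."""

    def __init__(self) -> None:
        self._prefs: Dict[str, str] = {}

    def read(self, user: str) -> Optional[str]:
        return self._prefs.get(user)

    def write(self, user: str, text: str) -> None:
        self._prefs[user] = text


class AltSource(PrefsSource):
    """Returns prefs from the first source that has any."""

    def __init__(self, sources: Iterable[PrefsSource]) -> None:
        self.sources = list(sources)

    def read(self, user: str) -> Optional[str]:
        for source in self.sources:
            text = source.read(user)
            if text is not None:
                return text
        return None


class CatSource(AltSource):
    """Concatenates prefs from every source that has any."""

    def read(self, user: str) -> Optional[str]:
        found = [t for t in (s.read(user) for s in self.sources) if t is not None]
        return "".join(found) if found else None


def needs_upload(store: PrefsStore, user: str, length: int, checksum: int) -> bool:
    """Server-side answer to a prefs offer: accept only if the cache differs."""
    cached = store.read(user)
    if cached is None:
        return True
    data = cached.encode("utf-8")
    return len(data) != length or zlib.crc32(data) != checksum
```

Because every backend answers the same `read`/`write` calls, the code that handles a prefs offer from the client does not need to know whether the cache lives in memory, in a directory, or in SQL.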
Created attachment 366: Patch to the lib/ subdirectory
Created attachment 367: Patches to spamc
Why don't you like sending the prefs every time? Performance? I suggest that you benchmark the different systems. My guess is that the extra round-trip across the network will consume the time you save by not sending the user_prefs over. You have to read the prefs file to compute the checksum, so you don't save I/O, and you use extra CPU. However, as long as the functionality that we need is implemented, I don't particularly care whether it's your patches or Brian's that are used. And it seems that you are cleaning up the code in general and making it more extensible, which is laudable.
Subject: Re: [SAdev] user configs should be read by spamc and sent to spamd
On Tue, Oct 01, 2002 at 01:16:38PM -0700, br+spamassassin@panix.com commented:
> Why don't you like sending the prefs every time? Performance?
Bandwidth mostly. If you use a persistent cache (e.g. ConfStoreDirectory, ConfStoreSQL) behind ConfSourceSpamc, this could be a cheapo way for users to update their prefs on the server without a login. Of course, for that to work we might want authentication of some sort in spamd...
> You have to read the prefs file to compute the checksum, so you don't save I/O, and you use extra CPU.
Certainly not performance for spamc, considering I used the slower, non-table-driven crc implementation ;)
> However, as long as the functionality that we need is implemented, I don't particularly care whether it's your patches or Brian's that are used. And it seems that you are cleaning up the code in general and making it more extensible, which is laudable.
I do try. This time, it all fell out of the problem of how to cache the prefs. In memory, on disk, why not abstract it so people can choose what they like?
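For readers unfamiliar with the "table-driven" remark, the sketch below contrasts the two approaches in Python. It assumes the standard reflected CRC-32 (the variant `zlib.crc32` computes); the thread does not say which CRC variant the patch actually uses, and the function names here are made up for the example.

```python
import zlib

POLY = 0xEDB88320  # reflected form of the standard CRC-32 polynomial


def crc32_bitwise(data: bytes) -> int:
    """Non-table-driven CRC-32: eight shift/xor steps per input byte."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF


def _make_table():
    """Precompute the eight-step result for every possible byte value."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ POLY if c & 1 else c >> 1
        table.append(c)
    return table


_TABLE = _make_table()


def crc32_table(data: bytes) -> int:
    """Table-driven CRC-32: one lookup per input byte."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF
```

Both functions return the same value as `zlib.crc32` for the same input; the table version trades 256 precomputed entries for fewer operations per byte, which is the cost being joked about above.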
Subject: Re: [SAdev] user configs should be read by spamc and sent to spamd FWIW, I've checked this code into the SA3 CVS because I generally like the idea here. However I'd also like to talk to the author about further re-designs of the whole conf system. The other thing is that I didn't really know how to actually integrate these changes, so I could do with a few pointers or possibly some coding help. Thanks. (PS: See the README.txt file in SA3 CVS for details on how I'm thinking the Conf structure should probably work).
This is apparently fixed in SA3. Closing LATER. As in it will be fixed later, when SA3 is released. FIXED/LATER? What's the dif?
Subject: Re: [SAdev] user configs should be read by spamc and sent to spamd
> This is apparently fixed in SA3. Closing LATER. As in it will be fixed later, when SA3 is released. FIXED/LATER? What's the dif?
The diff is that it's not fixed in SA3 (yet?) but it might be later ;-)
reopening for 2.60, since 3.0 is far-off... I think I'll try integrating anomie's code.
taking bug
damn, don't think this will make it into 2.60. :(
This looks promising... at least as far as the Conf backend goes. Since SA3 is never going to happen, it might be worth discussing this.
Subject: Re: user configs should be read by spamc and sent to spamd
> This looks promising... at least as far as the Conf backend goes. Since SA3 is never going to happen, it might be worth discussing this.
Would be great to re-open this. It's still a major pain to set everything up so that the spamd server on one machine is able to find the user configs for each user on another machine, when the info could just be sent along with the initial connection in the first place. It would greatly simplify a spamd server setup.
lowering pri on RFEs
punting to 3.1
no reason given for reassigning to 3.1, no agreed-to plan for 3.1
move bug to Future milestone (previously set to Future -- I hope)
*** Bug 4262 has been marked as a duplicate of this bug. ***
Instead of blindly sending the contents of user_prefs to spamd, should spamc do some minimal processing beforehand? I think any very user-specific stuff that spamc can do quickly should be done by spamc. For example, could spamc check the From: header against the whitelist and blacklist? If the address is in either, there's really no need to send the mail to spamd, since the user set these values. If the From is blacklisted, score the mail +100 and send it back, instead of sending it on to spamd. Equally, for a whitelisted address, score it -100 and send it back.
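A minimal sketch of that pre-check, in Python rather than spamc's C, and under stated assumptions: `precheck` is a hypothetical name, patterns are treated as shell-style globs via `fnmatch` (which only approximates SpamAssassin's own matching), and the whitelist is consulted first, since the proposal does not say what happens when an address is on both lists.

```python
from email import message_from_string
from email.utils import parseaddr
from fnmatch import fnmatchcase
from typing import Iterable, Optional


def precheck(raw_message: str,
             whitelist: Iterable[str],
             blacklist: Iterable[str]) -> Optional[float]:
    """Return -100.0 / +100.0 for a listed sender, or None to ask the server."""
    from_header = message_from_string(raw_message).get("From", "")
    sender = parseaddr(from_header)[1].lower()
    # Assumption: whitelist wins if the address matches both lists.
    if any(fnmatchcase(sender, pattern.lower()) for pattern in whitelist):
        return -100.0
    if any(fnmatchcase(sender, pattern.lower()) for pattern in blacklist):
        return 100.0
    return None
```

A `None` result means the message is passed on for a full scan as usual.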
I think it's better to keep spamc as stupid as possible -- "fast and stupid" should be its motto. ;) We should be doing shortcircuiting like that in spamd, and in fact all the code is there to do early-exit -- it just hasn't been implemented yet...
I'm now doing this on large clusters with Perl milters passing config via M::SA::Client to spamd. It saves having to do additional SQL queries for config when the milters are already doing a config query for numerous filtering preferences.
I haven't looked at this in a while, but I wouldn't mind some sort of plugin hook in spamd that ran for headers. Then you could pass data in and let a plugin do whatever it wanted with the data. It would be trivial to add a user-config header that contained a frozen data structure or something like that, then have the plugin unfreeze it and fold it into the config object.
I don't think the client should be doing any of the parsing (which rules out freeze/unfreeze)... it should just pass the config text as it would be stored in a user_prefs file or in SQL/etc. Maintaining both a Perl and C version of the config parser would be a pain/buggy, especially seeing few of us actually like playing with C. If the headers could be compressed like we can now compress the message, that'd be great.
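One way to picture "pass the config text unparsed, but compressed" is the sketch below. Everything in it is hypothetical: spamd has no `User-Config` header, and the zlib-plus-base64 encoding is just one assumed way to make arbitrary prefs text safe to carry on a single header line. The client never interprets the text; it only wraps it.

```python
import base64
import zlib

HEADER_NAME = "User-Config"  # hypothetical header name, not part of spamd


def encode_prefs_header(prefs_text: str) -> str:
    """Client side: compress the raw prefs text and make it header-safe."""
    packed = base64.b64encode(zlib.compress(prefs_text.encode("utf-8")))
    return HEADER_NAME + ": " + packed.decode("ascii")


def decode_prefs_header(line: str) -> str:
    """Server/plugin side: recover the exact text the client sent."""
    name, _, value = line.partition(":")
    if name.strip().lower() != HEADER_NAME.lower():
        raise ValueError("not a %s header" % HEADER_NAME)
    return zlib.decompress(base64.b64decode(value.strip())).decode("utf-8")
```

The decoded text would then go through the same Perl-side parser as a user_prefs file, so only one config parser has to be maintained.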
I wasn't suggesting we offer anything that does the user config stuff in spamc. This would be strictly for folks using their own clients.
moving some 3.3.0-targeted bugs into the vague Future. feel free to retarget to 3.3.1 if you think you'll be able to work on them
reassigning, too