SA Bugzilla – Bug 3970
bad sa-learn dump (encrypted token ?)
Last modified: 2004-11-16 00:17:58 UTC
With SpamAssassin 3.0.1, "sa-learn --dump all" gives me something like:

  0.000 0 3 0 non-token data: bayes db version
  0.000 0 77 0 non-token data: nspam
  0.000 0 0 0 non-token data: nham
  0.000 0 4646 0 non-token data: ntokens
  0.000 0 1100538095 0 non-token data: oldest atime
  0.000 0 1100593149 0 non-token data: newest atime
  0.000 0 0 0 non-token data: last journal sync atime
  0.000 0 0 0 non-token data: last expiry atime
  0.000 0 0 0 non-token data: last expire atime delta
  0.000 0 0 0 non-token data: last expire reduction count
  0.500 14 0 1100591536 90311df836
  0.500 2 0 1100538095 d58eb8cdb4
  ...

The tokens look encrypted, and when I try to use --regexp, it does not find anything.

With SpamAssassin 2.64, I get "clear" tokens, and the regexp search works:

  0.000 0 2 0 non-token data: bayes db version
  0.000 0 8058 0 non-token data: nspam
  0.000 0 22695 0 non-token data: nham
  0.000 0 158084 0 non-token data: ntokens
  0.000 0 1099178165 0 non-token data: oldest atime
  0.000 0 1100598197 0 non-token data: newest atime
  0.000 0 1100597807 0 non-token data: last journal sync atime
  0.000 0 1100560100 0 non-token data: last expiry atime
  0.000 0 1382400 0 non-token data: last expire atime delta
  0.000 0 1501 0 non-token data: last expire reduction count
  0.885 3 1 1100597380 o'clock
  0.947 7 1 1100597380 bernoulli
  0.891 9 3 1100597380 anticipate
  ...
Yup. One of the changes in v3 is that the tokens are now based on sha1 hash values of the raw token value. It's mentioned in the UPGRADE document, and has been well discussed on the users list. We should probably remove --regexp as an option since it's no longer usable as originally implemented.
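To illustrate why --regexp stops matching once tokens are stored as hashes, here is a minimal Python sketch. It is not SpamAssassin code. The function name `hashed_token`, the choice to keep 5 bytes (inferred from the 10-hex-digit keys in the dump above), and which end of the digest is kept are all assumptions made for illustration.

```python
import hashlib
import re


def hashed_token(raw: str, keep_bytes: int = 5) -> str:
    """Return a fixed-length hex key derived from the SHA1 digest of a raw token.

    keep_bytes=5 and taking the tail of the digest are assumptions, chosen
    only so the keys have the same width as those in the dump.
    """
    digest = hashlib.sha1(raw.encode("utf-8")).digest()
    return digest[-keep_bytes:].hex()


raw_tokens = ["bernoulli", "anticipate", "o'clock"]
db_keys = [hashed_token(t) for t in raw_tokens]

pattern = re.compile(r"bern")

# Searching the raw tokens works as it did in 2.64.
raw_hits = [t for t in raw_tokens if pattern.search(t)]

# Searching the stored keys cannot work: the keys are hex digits only,
# and the original text is not recoverable from them.
key_hits = [k for k in db_keys if pattern.search(k)]

print(raw_hits)  # ['bernoulli']
print(key_hits)  # []
```

The hash is one-way, so a dump that only has the keys has nothing textual left for a regexp to match against.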
You can also remove the "--dump data" option too, because it is no longer useful ...
Subject: Re: bad sa-learn dump (encrypted token ?)

On Tue, Nov 16, 2004 at 02:20:17AM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> You can also remove the "--dump data" option too, because it is no longer
> useful ...

No, it's still very useful. There's a lot of information in knowing what's in your database. Knowing what the actual tokens are isn't so important, and in the face of the resource improvements it's not a bad tradeoff.
OK, we can get general information with "--dump magic", and probably some statistics from the data, but the new "--backup" option can give us the same information. Will you keep these two options with (almost) the same result? I do not contest the resource improvement at all; it is just a pity to break control commands. Is there any plan to (re)add this functionality (regex search) in a future release?
Subject: Re: bad sa-learn dump (encrypted token ?)

On Tue, Nov 16, 2004 at 08:59:34AM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> but the new "--backup" option can give us the same information. Will you
> keep these two options with (almost) the same result?

True, backup gives you most of the data, but it doesn't give you the bayes stats like --dump data does. This can be useful for folks who are interested in that sort of thing, see below.

> I do not contest the resource improvement at all; it is just a pity to break
> control commands. Is there any plan to (re)add this functionality (regex
> search) in a future release?

The conversion to binary, fixed-length token keys was a HUGE win in performance. A large effort was made to add the option (for those who didn't care about performance) to store the raw token value in the database (sorry, I don't have the bug number in front of me). The sum total of that work was that adding even the option of storing the raw token value was a performance hit, and when this was posed to the user community there was very little call for it.

However, as a sort of compromise for any future requests for this sort of data, several hooks were added to the Plugin API that allow you to get at the raw token data. This allows you to create a plugin to fetch this data. The plan, in some future version, is to expand the Plugin API to allow for something to happen in the dump (or better named) method.

This is on the todo list for some future version of SA, so I'm closing this bug as invalid.

Michael
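As a rough illustration of the compromise described above, the Python sketch below keeps a side index from hashed key to raw token, filled in wherever the raw tokens are still visible, and uses it to run a regexp search over dump rows. It does not use the SpamAssassin Plugin API. `RawTokenIndex`, `token_key`, the 5-byte truncation, and the row layout (probability, spam count, ham count, atime, token) are all assumptions made for the example.

```python
import hashlib
import re


def token_key(raw, keep_bytes=5):
    """Hypothetical key function: tail of the SHA1 digest, hex-encoded."""
    return hashlib.sha1(raw.encode("utf-8")).digest()[-keep_bytes:].hex()


class RawTokenIndex:
    """Hypothetical side index mapping hashed keys back to raw tokens."""

    def __init__(self):
        self._raw_by_key = {}

    def record(self, raw_tokens):
        # Called wherever raw tokens are available, e.g. from a hook that
        # sees the tokens of a message being learned or scanned.
        for raw in raw_tokens:
            self._raw_by_key[token_key(raw)] = raw

    def search(self, dump_rows, regexp):
        # Match the regexp against the raw token behind each hashed key.
        # Rows whose key was never recorded are skipped.
        pattern = re.compile(regexp)
        hits = []
        for prob, nspam, nham, atime, key in dump_rows:
            raw = self._raw_by_key.get(key)
            if raw is not None and pattern.search(raw):
                hits.append((prob, nspam, nham, atime, raw))
        return hits


index = RawTokenIndex()
index.record(["bernoulli", "anticipate"])

rows = [
    (0.947, 7, 1, 1100597380, token_key("bernoulli")),
    (0.891, 9, 3, 1100597380, token_key("anticipate")),
]
matches = index.search(rows, r"^bern")
print(matches)  # [(0.947, 7, 1, 1100597380, 'bernoulli')]
```

The cost of such an index is the extra storage and write work for every raw token, which is the same tradeoff that kept raw tokens out of the main database.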
Closing as invalid, this is by design.