Bug 6273 - Logging Enhancement Request
Summary: Logging Enhancement Request
Status: NEW
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: spamc/spamd (show other bugs)
Version: 3.2.5
Hardware: Other All
: P5 enhancement
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
: 6425 (view as bug list)
Depends on:
Blocks:
 
Reported: 2010-01-04 21:20 UTC by Dan Mahoney
Modified: 2010-04-28 20:30 UTC (History)
2 users (show)



Attachment Type Modified Status Actions Submitter/CLA Status

Note You need to log in before you can comment on or make changes to this bug.
Description Dan Mahoney 2010-01-04 21:20:07 UTC
Hello all:

(Note I've brought this up on the mailing list, and per a request from John Hardin am mostly reiterating it here, although I'm breaking it down to several possible solutions which may be better filed as individual enhancement requests).

The output of SA's logging is hard to parse from a syslog point of view.  Running in the most full debug mode, for a single message I get:

Jan  4 21:51:48 quark spamd[19770]: spamd: connection from prime.gushi.org [72.9.101.130] at port 55208

Jan  4 21:51:49 quark spamd[19770]: spamd: processing message <201001050246.o052kCHR084403@prime.gushi.org> for danial:58

Jan  4 21:51:50 quark spamd[19770]: FuzzyOcr: Processing Message with ID "<201001050246.o052kCHR084403@prime.gushi.org>" (VIAGRA (c) Best Supplier <rrn@daalfuzzia.net> -> rrn@daalfuzzia.net)

Jan  4 21:51:50 quark spamd[19770]: spamd: identified spam (25.8/5.0) for danial:58 in 1.3 seconds, 4239 bytes.

Jan  4 21:51:50 quark spamd[19770]: spamd: result: Y 25 - BAYES_99,DCC_CHECK,DIGEST_MULTIPLE,HTML_IMAGE_ONLY_32,HTML_IMAGE_RATIO_02,HTML_MESSAGE,KAM_BADIPHTTP,MIME_HTML_ONLY,NORMAL_HTTP_TO_IP,RAZOR2_CF_RANGE_51_100,RAZOR2_CF_RANGE_E8_51_100,RAZOR2_CHECK,RCVD_IN_SORBS_WEB,RCVD_IN_XBL,RDNS_NONE,SARE_FROM_DRUGS,SARE_UNA,URIBL_AB_SURBL,URIBL_BLACK,URIBL_SBL scantime=1.3,size=4239,user=danial,uid=58,required_score=5.0,rhost=prime.gushi.org,raddr=72.9.101.130,rport=55208,mid=<201001050246.o052kCHR084403@prime.gushi.org>,bayes=1.000000,autolearn=spam,shortcircuit=no

The problems here are:

1) Logging is not tunable, at all.  You turn on debug mode, and get more, or less, but it's not possible to configure what goes into a given logfile line.

Think of apache's "LogFormat" command, and the flexibility it gives you: several predefined standards, plus the ability to use syslog for (or create your own logfiles) any other data, in any format.  As far as I know, spamd can't log to anything other than syslog.  It's already got SQL connectivity built in, but there's no logging to that either.

At the very least, moving the logging into a tokenized format I could specify in local.cf would be a MAJOR improvement.

Other useful things to have as logging tokens that don't show up in any current core logs:  Sendmail QueueID, From-Address, To-Address, Relaying IP (SpamAssassin knows and uses all these pieces of data for various rules).

2) Correlating the above lines is hard.  Line 4 has the decimal score, but no other lines do.  Line 4 also lacks ANY useful message id (it has a user and a size, neither of these are unique, it has the PID, which doesn't really help at all).

Only line 2 has the messageid, which is a chunk of data with no arbitary format, where the sender can supply it and we'll log it (i.e. it's forgable).  Furthermore, it is confusable with email addresses, and the <> and @ make it even harder to parse or use with a tool like grep, PLUS it appears in different places on every line.  It's probably the worst possible correlator to use.  

At no point does any instance of spamd log the sendmail "queue id" (which is all one needs to grep for given any other daemon or milter.  This is easily parsed out of any message.  In order to find that line in my mail logs, I would need to find the following logfile line (on a different machine).

Jan  4 21:46:16 <mail.info> prime sm-mta[84403]: o052kCHR084403: from=<rrn@daalfuzzia.net>, size=3144, class=0, nrcpts=1, msgid=<201001050246.o052kCHR084403@prime.gushi.org>, proto=ESMTP, daemon=MTA, relay=[203.140.73.194]

And from there, the queueid (o052kCHR084403) lets me search for every other action to occur to that message.  Spamd shouldn't be different.

At the very, very least (if the sendmail queueid isn't available) can we "make up" a single unique messageid for all logfiles being logged for a given message, and use that string in all loglines relating to that message (with sendmail, this is always the bit following the pid)?  At least that way, we can correlate matching spamd log lines, even if matching MTA lines is hard.

I guess what I'm saying here is that it's hard both to correlate spamd lines against other daemons, and against other spamd loglines acting on the same message.  Both of these should be a goal in logging anything.

So, to sum it up, here's the requests, as I mention them above.  Feel free to let me know if these are too lofty, or if I should file separate enhancement requests.

1) Configurability:

1a) Implement Flatfile Logging.

1b) Implement tokenized logging to flatfiles.

1b-a) Add tokens as above.

1c) If tokenizing isn't going to be feasible, can we please add sender, recipient, messageid, decimal score to the Result Line?  Thirdparty plugins (such as FuzzyOCR, above) manage this for their own logs, why doesn't core spamd?

1d) Implement logging to SQL (with or without tokens).

2) Correlation:

2a) Log actual sendmail queuid to all loglines in the same place.  Given we may not know this on initial connect for line 1 above.

2b) Failing 2a, make up a queueid early enough in the process based on (date, sourceport, srcip, dstport, pid) and log that everywhere you'd log the above.

Finally,

3) Standardize on what format third-party plugins should write to the log with.

Let me know if there's anything I can clarify.
Comment 1 Jeff Rice 2010-03-23 17:00:02 UTC
I agree - especially on the decimal score issue.  It's inconsistent to show the score as an integer in some places and a float in others, with no way of robustly linking the tests with the float score.  Please fix!
Comment 2 Dan Mahoney 2010-04-28 20:30:22 UTC
*** Bug 6425 has been marked as a duplicate of this bug. ***