Bug 5793 - ruleQA app needs to display "reuse" status of logs/net tests
Status: NEW
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: RuleQA
Version: 3.2.4
Hardware: Other other
Importance: P5 normal
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
Depends on:
Reported: 2008-01-18 08:36 UTC by Justin Mason
Modified: 2008-03-12 08:03 UTC (History)
CC List: 0 users

Description Justin Mason 2008-01-18 08:36:08 UTC
Some log lines may contain data from reused network tests ("reuse=yes" in the
log metadata). This is pretty important info, so the rule-QA app should expose
it somehow.
Comment 1 Justin Mason 2008-02-13 07:09:40 UTC
Ran into this again. The output below is a good demo of the issue:

           SPAM%     HAM%
0.00000  22.2189   0.1386   0.994    0.87    0.00  URIBL_RHS_DOB   
0.00000   0.0000   0.0000   0.500    0.60    0.00  URIBL_RHS_DOB bb-fredt  
0.00000  36.7496   0.0550   0.999    0.92    0.00  URIBL_RHS_DOB bb-jm  
0.00000   0.0000   0.0000   0.500    0.40    0.00  URIBL_RHS_DOB bb-zmi  
0.00000   0.0000   0.2801   0.000    0.39    0.00  URIBL_RHS_DOB jm

reasons why the SPAM% column has 0 hits:

- "bb-fredt": there's no spam in that collection (we should probably have some
  way to indicate this?)

- "bb-zmi": the mails are scanned with network rules enabled, and other net
  rules are reusable, but URIBL_RHS_DOB is not one of them.  this seems to be
  because it's scanning ancient spam from 2005! (doh. I'll fix this bug)

- "jm": the mail scanned in this collection is from the spamtrap, so has no
  network rule hits marked.

I think we need to indicate 3 (perhaps 4) things here:

- the "reuse" data was usable -- the "bb-jm" case.  This should maybe just be
displayed as it currently is.

- there were no spam mails -- "bb-fredt".  Maybe grey this out further?

- there was no reuse data -- "jm".  Some other UI cue that this is the case,
another colour maybe?

- (perhaps) there was a mix of reusable and nonreusable data.  some colour
mid-way between the two?
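
A rough sketch of how those cases could be told apart for one collection's spam or ham logs. It assumes each log line's metadata is available as a comma-separated string of key=value pairs, carrying "reuse=yes" when network results were reused. The function name and the four status labels are made up for illustration and are not part of the rule-QA app.

```python
from typing import Iterable


def reuse_status(metadata_fields: Iterable[str]) -> str:
    """Classify one corpus (spam or ham) of one collection by reuse coverage.

    Each item is the metadata string of one log line.
    Assumption: it is a comma-separated list of key=value pairs.
    """
    total = 0
    reused = 0
    for meta in metadata_fields:
        total += 1
        pairs = dict(item.split("=", 1) for item in meta.split(",") if "=" in item)
        if pairs.get("reuse") == "yes":
            reused += 1

    if total == 0:
        return "no-mail"        # e.g. a collection with no spam at all
    if reused == total:
        return "reuse"          # every line carried reusable net-test data
    if reused == 0:
        return "noreuse"        # no line carried reuse data
    return "partial-reuse"      # a mix of reusable and nonreusable lines
```

Calling it once for the spam logs and once for the ham logs of each collection would give the per-column status to display.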
Comment 2 Justin Mason 2008-03-12 08:03:14 UTC
OK, I've been thinking about this a little.  I'm considering icons in the freqs
table to be the way to do it.  The "silk" icon set is quite nice, and under a
CC-by license:


So for each rule, we'd have a single status icon:

  no hits: cancel  ("x")
  low hits: weather_rain
  bad S/O: weather_clouds
  meta subrule: brick  (lego brick, used to build other rules)
  good rule:  weather_sun  or rosette  or  tick

And then in the HAM%, SPAM% columns:

  no mails in spam/ham corpus: exclamation (for that corpus)
  reuse: control_repeat_blue
  noreuse: control_repeat
  partial-reuse: both icons together
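
The mapping above could be kept as plain lookup tables, roughly like this. The icon names are the ones listed above; the status keys and the function name are hypothetical labels, and "weather_sun" is an arbitrary pick among the three candidates for a good rule.

```python
# Per-rule status icon. Keys are hypothetical status labels.
RULE_ICONS = {
    "no-hits": "cancel",
    "low-hits": "weather_rain",
    "bad-so": "weather_clouds",
    "meta-subrule": "brick",
    "good": "weather_sun",  # could equally be "rosette" or "tick"
}

# Icons shown in the SPAM% / HAM% columns. Keys are hypothetical labels.
COLUMN_ICONS = {
    "no-mail": ["exclamation"],
    "reuse": ["control_repeat_blue"],
    "noreuse": ["control_repeat"],
    "partial-reuse": ["control_repeat_blue", "control_repeat"],
}


def icons_for_row(rule_status, spam_status, ham_status):
    """Return the icon names to render for one row of the freqs table."""
    return {
        "rule": RULE_ICONS[rule_status],
        "spam": COLUMN_ICONS[spam_status],
        "ham": COLUMN_ICONS[ham_status],
    }
```

For example, icons_for_row("good", "reuse", "no-mail") would pick weather_sun for the rule, control_repeat_blue for SPAM% and exclamation for HAM%.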

I was also considering rendering some kind of pie-chart for each HAM%/SPAM%
line to indicate how much of the overall test corpus was made up of mail
from that user's collection.
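
The number behind such a pie-chart is just each collection's share of the total. A minimal sketch, assuming per-collection mail counts are already known; the function name and the counts in the usage note are invented for illustration.

```python
def corpus_shares(mail_counts):
    """Map each collection name to its fraction of the overall corpus."""
    total = sum(mail_counts.values())
    if total == 0:
        # Empty corpus: nothing to chart, report zero for every collection.
        return {name: 0.0 for name in mail_counts}
    return {name: count / total for name, count in mail_counts.items()}
```

With made-up counts of 300 mails for "bb-jm" and 100 for "jm", this gives 0.75 and 0.25.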

I'm not sure that these would be viable on the "all rules" table, since there
are over 1000 rows in the table!  but on the "detail" pages, it would work fine
I think -- and it's even possible to get a "detail" view of all rules with a
sufficiently abusive regexp search.

Another good way to display error conditions is used on
http://www.planetpython.org/ .  If you notice the right hand side list of
blogs, some are underlined with red dashes -- hover the mouse over them and
you'll get a tooltip with the error message.  If you don't hover the mouse,
the error message is hidden.
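
Something similar could be done with a plain HTML title attribute. This is only a sketch of the idea, not how that site or the rule-QA app does it; the function name and the inline style are assumptions.

```python
from html import escape


def name_cell(label, error=None):
    """Render a name; if there is an error, underline it and add a tooltip."""
    if error is None:
        return f"<span>{escape(label)}</span>"
    # The title attribute is shown by browsers as a tooltip on hover.
    return (
        '<span style="border-bottom: 1px dashed red" '
        f'title="{escape(error, quote=True)}">{escape(label)}</span>'
    )
```

For example, name_cell("bb-fredt", "no spam in this collection") would show the name underlined with red dashes and reveal the message only on hover.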