SA Bugzilla – Bug 3118
bayes: error in our calculations
Last modified: 2004-03-23 10:58:11 UTC
Henry reports:
(15:03:50) HenryCStern: but there is effectively an off-by-one error in your naive Bayes calculations
(15:04:09) HenryCStern: if you derive everything using Bayes' theorem, you'll see what I mean
(15:05:47) justinmason23: so we should multiply both $H and $S by ($nham / $totalmsgcount) and ($nspam / $totalmsgcount) respectively to get correct figures?
(15:05:57) HenryCStern: yeah he brought it up before, but we all seem to have forgotten it. so let's file a bug this time! ;)
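For anyone following along, here is a rough sketch of the adjustment proposed in the chat: weight each class's likelihood product by that class's share of the trained messages before normalising. It is written in Python for illustration only and is not the SpamAssassin code. The function names are hypothetical, and it assumes h and s are the ham-side and spam-side products (standing in for $H and $S), with nham and nspam the trained message counts.

```python
def apply_class_priors(h, s, nham, nspam):
    """Weight the ham and spam products by their class priors.

    h, s   -- ham-side and spam-side likelihood products (assumed meaning
              of $H and $S; the real code may define them differently)
    nham   -- number of ham messages trained
    nspam  -- number of spam messages trained
    """
    total = nham + nspam
    if total == 0:
        # No training data, so no prior can be estimated.
        raise ValueError("no trained messages")
    return h * (nham / total), s * (nspam / total)


def spam_posterior(h, s, nham, nspam):
    """Posterior probability of spam via Bayes' theorem, priors included."""
    h_weighted, s_weighted = apply_class_priors(h, s, nham, nspam)
    denom = h_weighted + s_weighted
    if denom == 0:
        # Both products vanished; fall back to a neutral score.
        return 0.5
    return s_weighted / denom


if __name__ == "__main__":
    # Equal evidence on both sides, but three times as much ham trained.
    print(spam_posterior(0.5, 0.5, nham=300, nspam=100))
```

With equal corpus sizes the priors cancel and the result is unchanged; they only matter when the ham and spam counts differ, which is the case the unweighted calculation gets wrong.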
testing this now.
see bug 2129 for results; basically, a win, so it's in.