SA Bugzilla – Bug 6319
bayes does not tokenize the from name
Last modified: 2022-04-22 06:46:26 UTC
Bayes doesn't tokenize the name part of the from header, e.g.:

$ cat /tmp/dummy
From: v1agra hyehdt <foo@example.com>
Subject: meds gjguhdo test krhsye
$ sa-learn --spam /tmp/dummy
$ spamassassin -D bayes < /tmp/dummy 2>&1 1>/dev/null | grep -Ei "token.*=>"
[5478] dbg: bayes: token 'meds' => 0.999854151320635
[5478] dbg: bayes: token 'H*F:U*foo' => 0.993172413793104
[5478] dbg: bayes: token 'H*F:D*example.com' => 0.993172413793104
[5478] dbg: bayes: token 'H*Ad:D*example.com' => 0.993172413793104
[5478] dbg: bayes: token 'test' => 0.011685356810132
[5478] dbg: bayes: token 'krhsye' => 0.986543689320388
[5478] dbg: bayes: token 'gjguhdo' => 0.986543689320388
Interesting. This is relevant to bug 6315, and I have made this a blocker for that bug.

Also note that the subject is read in as if it were part of the body. I was under a different impression: I thought we generated either two tokens (one as if in the body and one subject-specific token) or just a subject-specific token. That would be a separate bug.
Name is now tokenized along with other fixes

Sending        trunk/lib/Mail/SpamAssassin/Plugin/Bayes.pm
Transmitting file data .done
Committing transaction...
Committed revision 1900138.
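
For readers who want to see what tokenizing the display name could look like, here is a minimal sketch in Python. It is not the code committed to Bayes.pm. The function name from_header_tokens and the "H*F:N*" prefix for name words are hypothetical and chosen only for illustration. The "H*F:U*" and "H*F:D*" prefixes are copied from the debug output in the report.

```python
from email.utils import parseaddr


def from_header_tokens(header_value):
    """Split a From header value into Bayes-style tokens.

    Hypothetical illustration only: the "H*F:N*" prefix for display-name
    words is an assumption, not SpamAssassin's confirmed token format.
    """
    # parseaddr returns (display name, address); either may be empty.
    name, addr = parseaddr(header_value)
    tokens = []

    # One token per whitespace-separated word of the display name.
    # This is the part the report says was missing.
    for word in name.split():
        tokens.append("H*F:N*" + word.lower())

    # User and domain tokens, using the prefixes seen in the debug output.
    if "@" in addr:
        user, domain = addr.rsplit("@", 1)
        tokens.append("H*F:U*" + user)
        tokens.append("H*F:D*" + domain)

    return tokens


if __name__ == "__main__":
    for token in from_header_tokens("v1agra hyehdt <foo@example.com>"):
        print(token)
```

With the From header from the report, this sketch yields two name tokens (for "v1agra" and "hyehdt") in addition to the user and domain tokens that were already present in the original debug output.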