SA Bugzilla – Bug 4540
UUE in text body poisons bayes, patch for new TFLAG nobayeslearn
Last modified: 2019-06-24 16:12:02 UTC
SpamAssassin incorrectly processes mail with UUE code in the text body (see the badmail.txt attachment for an example): the UUE code is analyzed as normal text, and the Bayes database learns garbage. I wrote a patch with a new TFLAG, "nobayeslearn", that disables Bayes autolearning when the test matches. It helps in these cases.
Created attachment 3081: Sample badly processed mail
This is a sample mail with a UUE-encoded attachment in the text body. Such mails are usually generated by robots or simple programs. SpamAssassin tries to process the UUE body as plain text, so the Bayes database is trained with garbage.
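To illustrate the failure mode described above, here is a minimal Python sketch of detecting a uuencoded block in a plain-text body and dropping it before the text reaches a tokenizer. It is not SpamAssassin's parser and not the attached patch; the function name `strip_uuencoded_blocks` and the regular expression are assumptions made for this example. It relies only on the conventional uuencode framing: a `begin <mode> <filename>` line and a terminating `end` line.

```python
import re
from typing import Tuple

# Conventional uuencode header: "begin <octal mode> <filename>".
# (Assumed pattern for this sketch, not taken from SpamAssassin.)
UU_BEGIN = re.compile(r"^begin [0-7]{3,4} \S.*$")


def strip_uuencoded_blocks(body: str) -> Tuple[str, int]:
    """Return (body without uuencoded blocks, number of blocks removed)."""
    lines = body.splitlines()
    kept = []
    removed = 0
    i = 0
    while i < len(lines):
        if UU_BEGIN.match(lines[i]):
            # Search forward for the terminating "end" line.
            j = i + 1
            while j < len(lines) and lines[j].strip() != "end":
                j += 1
            if j < len(lines):
                # Complete block found: skip it entirely.
                removed += 1
                i = j + 1
                continue
            # No "end" line: treat the "begin" line as ordinary text.
        kept.append(lines[i])
        i += 1
    return "\n".join(kept), removed
```

A caller could tokenize only the first element of the returned pair and use the second as a hint that the message carried inline UUE data.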
Created attachment 3082: A patch which adds new TFLAGS nobayeslearn
This patch resolves the problem. It also adds a new feature: a new TFLAGS value, nobayeslearn, which tells SpamAssassin not to train the Bayes database when the rule matches.
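The idea behind the flag can be sketched in a few lines of Python. This is only an illustration of the gating logic described in this comment, not the contents of the attached patch, which is not reproduced in this thread. The `Rule` class and the `may_autolearn` function are hypothetical names, and the rule name `UUE_IN_BODY` with a zero score is used purely as example data.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical stand-in for a rule with its score and tflags."""
    name: str
    score: float
    tflags: frozenset = frozenset()


def may_autolearn(hit_rules) -> bool:
    """Allow Bayes autolearning only if no rule that hit is flagged
    'nobayeslearn' (assumed semantics, as described in this comment)."""
    return not any("nobayeslearn" in rule.tflags for rule in hit_rules)


# Example data: the flag is independent of the rule's score.
uue_rule = Rule("UUE_IN_BODY", 0.0, frozenset({"nobayeslearn"}))
other_rule = Rule("SOME_OTHER_RULE", 1.5)
```

With this shape, `may_autolearn([other_rule])` allows learning, while any hit list containing `uue_rule` suppresses it without needing a strongly positive or negative score.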
I agree that this is a bad bug if bayes is learning garbage, but isn't this proposed solution extremely bad?
+# UUE-encoded attachment in text body
+score UUE_IN_BODY -100
+
Huh!?
yeah, giving negative scores for such structural details is a bad idea -- we tried it in 2.2x (?), and it was exploited by spammers. I'm not keen on adding this patch; garbage in the bayes DB will be expired quickly anyway.
Better to keep the database clean in the first place if possible, no?
Closing old stale bugs. Parser is quite a bit wiser these days.