Bug 4540 - UUE in text body poisons bayes, patch for new TFLAG nobayeslearn
Summary: UUE in text body poisons bayes, patch for new TFLAG nobayeslearn
Status: RESOLVED WONTFIX
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Learner
Version: 3.0.4
Hardware: All All
Importance: P5 normal
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-08-17 01:59 UTC by afk
Modified: 2019-06-24 16:12 UTC



Attachment                                  Type       Status  Submitter/CLA Status
Sample badly processed mail                 text/mail  None    afk [NoCLA]
A patch which add new TFLAGS nobayeslearn   patch      None    afk [NoCLA]

Description afk 2005-08-17 01:59:55 UTC
SpamAssassin incorrectly processes mail with UUE code in the text body (see the
badmail.txt attachment for an example): the UUE code is analyzed as normal text
and the Bayes database learns garbage.

I wrote a patch with a new TFLAG, "nobayeslearn", which disables Bayes
autolearning when a test carrying the flag matches. It helps in these cases.
Comment 1 afk 2005-08-17 02:16:40 UTC
Created attachment 3081
Sample badly processed mail

This is a sample mail with a UUE-encoded attachment in the text body.
Such mail is usually generated by robots or simple programs.
SpamAssassin tries to process the UUE body as plain text, so the Bayes database
is trained with garbage.
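For illustration only, here is a minimal Python sketch of the kind of pre-filtering that would keep a UUE payload away from the Bayes tokenizer. It is not SpamAssassin code: the function name strip_uue_blocks is hypothetical, and it assumes the usual uuencode framing of a "begin <mode> <name>" line closed by an "end" line.

```python
import re

# Assumed framing: "begin <octal mode> <filename>" ... "end"
BEGIN_RE = re.compile(r"^begin [0-7]{3,4} \S.*$")


def strip_uue_blocks(body: str) -> str:
    """Return body with complete uuencoded blocks removed (hypothetical helper)."""
    kept = []
    pending = []      # lines of a block whose "end" has not been seen yet
    in_block = False
    for line in body.splitlines():
        if not in_block and BEGIN_RE.match(line):
            in_block = True
            pending = [line]
            continue
        if in_block:
            if line.strip() == "end":
                # Complete block: drop everything buffered so far.
                pending = []
                in_block = False
            else:
                pending.append(line)
            continue
        kept.append(line)
    if in_block:
        # No closing "end": treat the buffered lines as ordinary text.
        kept.extend(pending)
    return "\n".join(kept)
```

Only the text returned by such a filter would then be tokenized for learning. Buffering the block until "end" is seen avoids discarding ordinary text that merely happens to start with a "begin" line.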
Comment 2 afk 2005-08-17 02:20:35 UTC
Created attachment 3082
A patch which add new TFLAGS nobayeslearn

This is a patch which resolves this problem. It also adds a new feature: a new
TFLAGS value, nobayeslearn, which tells SpamAssassin not to train the Bayes
database if a rule carrying the flag matches.
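The behaviour described for the flag can be sketched roughly as follows. This is a Python illustration, not the attached patch: the Rule class, the may_autolearn function and the rule name HTML_MESSAGE used in the example are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    # Hypothetical stand-in for a rule definition and its tflags.
    name: str
    tflags: set = field(default_factory=set)


def may_autolearn(hit_rules):
    """Return False if any rule that matched carries the nobayeslearn flag."""
    return not any("nobayeslearn" in rule.tflags for rule in hit_rules)


hits = [Rule("UUE_IN_BODY", {"nobayeslearn"}), Rule("HTML_MESSAGE")]
print(may_autolearn(hits))  # autolearning is skipped for this message
```

Under this reading the flag acts as a veto on learning only and is independent of the score given to the rule.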
Comment 3 Warren Togami 2005-08-18 02:31:08 UTC
I agree that this is a bad bug if bayes is learning garbage, but isn't this
proposed solution extremely bad?

+# UUE-encoded attachment in text body
+score UUE_IN_BODY -100
+

Huh!?
Comment 4 Justin Mason 2005-08-18 09:41:45 UTC
yeah, giving negative scores for such structural details is a bad idea -- we
tried it in 2.2x (?), and it was exploited by spammers.

I'm not keen on adding this patch; garbage in the bayes DB will be expired
quickly anyway.
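To make the concern about the -100 score concrete, here is a small Python sketch of the arithmetic. It assumes the default spam threshold of 5.0; the rule names SPAMMY_A and SPAMMY_B and their scores are invented for the example.

```python
REQUIRED_SCORE = 5.0  # assumed threshold (the default required_score)


def is_spam(rule_scores):
    """Classify a message from the summed scores of the rules it hit."""
    return sum(rule_scores.values()) >= REQUIRED_SCORE


# Hypothetical spam message hitting two ordinary rules.
plain_spam = {"SPAMMY_A": 10.0, "SPAMMY_B": 8.0}

# The same message after the sender pastes a UUE block into the body.
spam_with_uue = dict(plain_spam, UUE_IN_BODY=-100.0)

print(is_spam(plain_spam), is_spam(spam_with_uue))
```

With these numbers the total drops from 18.0 to -82.0, so any sender who embeds a UUE block gets the message through regardless of what else it hits.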
Comment 5 John Hardin 2006-10-12 08:07:15 UTC
Better to keep the database clean in the first place if possible, no?
Comment 6 Henrik Krohns 2019-06-24 16:12:02 UTC
Closing old stale bugs. Parser is quite a bit wiser these days.