Bug 7904 - Missing types in maybe_body_only()
Summary: Missing types in maybe_body_only()
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Learner
Version: unspecified
Hardware: PC Linux
Importance: P2 normal
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
Depends on:
Reported: 2021-05-10 12:49 UTC by Bert Van de Poel
Modified: 2022-04-03 23:04 UTC

Description Bert Van de Poel 2021-05-10 12:49:43 UTC
I recently noticed that SpamAssassin was not autolearning spam that scored high enough and was not overly similar to previous messages. From the debug logs, I figured out that these messages weren't meeting the minimum requirement of 3 or more points of spam score from body tests. Bizarrely, when I went through the list of tests that hit, in some cases I did find enough body tests to meet it.

After I asked on the mailing list whether this might be a bug, RW confirmed that some rule types are missing from the maybe_body_only() function, so it does not count highly relevant body tests such as DCC, Razor and Pyzor (which carry large scores and which we consider very relevant). See the quote from https://www.mail-archive.com/users@spamassassin.apache.org/msg108260.html below:

"One thing that does look wrong is that maybe_body_only() looks

(($type == $TYPE_BODY_TESTS) || ($type == $TYPE_BODY_EVALS)
    || ($type == $TYPE_URI_TESTS) || ($type == $TYPE_URI_EVALS))

so it's missing any rawbody and full rules. 

Specifically Pyzor, Razor2 and DCC are full eval rules."

I would therefore consider this a bug that should be fixed, and perhaps one that distros should consider backporting into existing long-term releases (such as Ubuntu LTS and RHEL).
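
To make the effect of the quoted check concrete, here is a minimal sketch in Python rather than SpamAssassin's Perl. It is not the actual maybe_body_only() code: the RuleType names only loosely mirror the constants in the quote, and the rule names and scores are made up for illustration. The only details taken from this report are the four types in the quoted check and the 3-point body minimum.

```python
from enum import Enum, auto


class RuleType(Enum):
    # Illustrative stand-ins for SpamAssassin's rule type constants.
    # Names and values are assumptions, not copied from the SpamAssassin source.
    HEAD_TESTS = auto()
    BODY_TESTS = auto()
    BODY_EVALS = auto()
    URI_TESTS = auto()
    URI_EVALS = auto()
    RAWBODY_TESTS = auto()
    FULL_EVALS = auto()


# The four types that the quoted check treats as "body".
QUOTED_BODY_TYPES = {
    RuleType.BODY_TESTS,
    RuleType.BODY_EVALS,
    RuleType.URI_TESTS,
    RuleType.URI_EVALS,
}

# Minimum body points mentioned in the description above.
MIN_BODY_POINTS = 3.0


def counts_as_body(rule_type):
    """Return True if the quoted check would count this rule type as body."""
    return rule_type in QUOTED_BODY_TYPES


def body_points(hits):
    """Sum the scores of all hits whose type counts as body."""
    return sum(score for _name, rule_type, score in hits if counts_as_body(rule_type))


# Hypothetical rule hits: (name, type, score). Names and scores are invented.
hits = [
    ("EXAMPLE_BODY_RULE", RuleType.BODY_TESTS, 1.5),
    ("EXAMPLE_FULL_EVAL_RULE", RuleType.FULL_EVALS, 3.0),
    ("EXAMPLE_RAWBODY_RULE", RuleType.RAWBODY_TESTS, 1.0),
]

total = body_points(hits)
print(total, total >= MIN_BODY_POINTS)
```

In this sketch only the 1.5 points from the plain body rule are counted, so the message stays below the 3-point minimum even though a full eval rule hit with a large score.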
Comment 1 Henrik Krohns 2022-04-03 09:03:17 UTC
Rawbody already added in Bug 7905 and co.

As we now have autolearn_header and autolearn_body from Bug 7907, they can simply be used to force any rule (including full) to be counted for specific points. This is probably the best solution, since we can't really know what a specific "full" rule is checking, so it should only be manually enabled.

I've added autolearn_body to dcc/pyzor/razor rules:

Sending        trunk/rules/25_dcc.cf
Sending        trunk/rules/25_pyzor.cf
Sending        trunk/rules/25_razor2.cf
Transmitting file data ...done
Committing transaction...
Committed revision 1899529.
Comment 2 Bert Van de Poel 2022-04-03 23:04:21 UTC
Thanks a ton for looking at these issues. Since SA has such a huge pile of bug reports that never seem to get triaged or closed, I had sort of given up on these issues. Super happy you had a look at them. I really appreciate it. Too bad we will probably have to wait quite a while for them to land on Debian and Ubuntu LTS releases, but still really looking forward to it!