Bug 6584 - RTF parser for rendered body
Status: NEW
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Libraries
Version: SVN Trunk (Latest Devel Version)
Hardware: All All
Importance: P2 enhancement
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
Depends on:
Reported: 2011-05-09 06:57 UTC by Henrik Krohns
Modified: 2011-05-09 06:57 UTC

Description Henrik Krohns 2011-05-09 06:57:16 UTC
While working on bug 6582, I found many messages with text/rtf and text/richtext attachments. These are (or were) also popular in spam.

Such RTF usually includes _lots_ of formatting and may have large images or similar data embedded. There is no point in scanning that redundant data in the rendered body array (rawbody can still be used if desired).

Basic tag stripping could be done with just a few regexes; there are plenty of examples and CPAN modules around.
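
To illustrate how little is needed for basic stripping, here is a rough sketch. It is written in Python purely for illustration (SpamAssassin itself is Perl), and it is not a proposed patch. The names rtf_to_text, TOKEN and SKIP_DESTINATIONS are made up for this example, and the list of skipped destinations is an assumption about which RTF groups never hold visible text. It uses one tokenizing regex plus a small stack so that nested groups such as the font table are dropped whole. \uN Unicode escapes and code pages other than cp1252 are deliberately not handled.

```python
import re

# Hypothetical names throughout; a sketch, not SpamAssassin code.
# Destinations assumed to carry no visible body text.
SKIP_DESTINATIONS = {
    "fonttbl", "colortbl", "stylesheet", "info", "pict",
    "object", "header", "footer",
}

TOKEN = re.compile(
    r"\\([a-z]+)(-?\d+)? ?"    # control word, optional number, optional delimiter space
    r"|\\'([0-9a-fA-F]{2})"    # hex-escaped byte, e.g. \'e9
    r"|\\([^a-z])"             # control symbol, e.g. \{ \} \\ \*
    r"|([{}])"                 # group open / close
    r"|[\r\n]+"                # raw line breaks are not text in RTF
    r"|(.)",                   # anything else is literal text
    re.DOTALL,
)


def rtf_to_text(rtf: str) -> str:
    """Return the visible text of an RTF string, dropping formatting and embedded data."""
    out = []
    stack = []        # skip state saved at each "{"
    skipping = False  # True while inside a group we want to drop
    for m in TOKEN.finditer(rtf):
        word, _param, hexbyte, symbol, brace, char = m.groups()
        if brace == "{":
            stack.append(skipping)
        elif brace == "}":
            if stack:
                skipping = stack.pop()
        elif word:
            if word in SKIP_DESTINATIONS:
                skipping = True
            elif not skipping and word in ("par", "line"):
                out.append("\n")
            elif not skipping and word == "tab":
                out.append("\t")
        elif symbol:
            if symbol == "*":
                # "{\*\foo ...}" marks an ignorable destination
                skipping = True
            elif not skipping and symbol in "{}\\":
                out.append(symbol)
        elif hexbyte:
            if not skipping:
                # Assumption: the document uses the Windows-1252 code page
                out.append(bytes.fromhex(hexbyte).decode("cp1252", errors="replace"))
        elif char and not skipping:
            out.append(char)
    return "".join(out)
```

The rendered body would then get only the output of something like rtf_to_text, while the untouched RTF stays available to rawbody rules.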