SA Bugzilla – Bug 5127
mimeheader rules cannot match newlines
Last modified: 2006-10-18 03:56:09 UTC
try adding these tests to t/mimeheader.t: mimeheader MATCH_NL Content-Type =~ /msword;\n\tname/ mimeheader MATCH_NL_RAW Content-Type:raw =~ /msword;\n\tname/ they'll never match, because the Message/Node.pm code modifies the header strings -- even for ":raw" -- to sanitise whitespace.
Created attachment 3714 [details] patch to add tests to mimeheader.t here's a patch that adds those 2 failing test cases to the 't' script, in TDD style ;)
Created attachment 3715 [details] fix draft well, that seemed easy! Theo, what have I broken? ;)
(In reply to comment #2) > Created an attachment (id=3715) [edit] > fix draft > > well, that seemed easy! Theo, what have I broken? ;) :) Well, I would have to think and prod about in the code for a bit to answer that. Just remember that the header and raw_header functions are for *all* headers, both the message headers and the MIME part headers. In a quick run through, I don't like the Message.pm changes -- the leading and trailing whitespace is generally irrelevant (the type and amount is ignorable, according to the RFC). As for raw() including all whitespace... I'm not necessarily opposed, though header vs raw_header was modeled after body vs rawbody, and it's all about encoding. Also, your Message::Node change breaks the current 3.x described API: "Will have a newline at the end and will be unfolded."
'As for raw() including all whitespace... I'm not necessarily opposed, though header vs raw_header was modeled after body vs rawbody, and it's all about encoding.' Well, in the body-vs-rawbody case, there's also 'full', which offers the pristine string. I suppose I could use that here, but it really would be quite messy, and in my opinion this behaviour (ie. not allowing matching against the folding whitespace) is counter-intuitive and a bug anyway! (by the way, this is cropping up because a certain newline-whitespace sequence in a folded MIME part header is a spam sign, and I want to match it. I think this is likely to be quite a common situation.) 'Also, your Message::Node change breaks the current 3.x described API: "Will have a newline at the end and will be unfolded."' Argh. OK, I've got a new patch that fixes that, by going direct to the raw_header() API instead from MIMEHeader.pm. I've also decided that you're generally right ;) It only preserves this whitespace for the raw_header hash; the header(), get_header() and {header} data is all back to what it was in SVN trunk, with folding newline/whitespace sanitized.
Created attachment 3716 [details] fix r2 a new rev. BTW, note that the s/// statements have simply moved from Message.pm to Node.pm; that's the only place that data is used, since the Message.pm code immediately calls the Node.pm code.
Created attachment 3717 [details] fix r3 oops! found a bug in that last one; the decoding algorithm wasn't coming out as trunk-compatible. this one is...
Related to the leading whitespace in :raw headers, I opened bug 5116 after Alex Broens asked jm and I about matching whitespace in Date headers as often seen in spam.
ok, taking silence as assent, I've checked this in (with a fix in the test script and a line or two of POD documentation). : jm 292...; svn commit -m "bug 5127: allow mimeheader :raw rules to match newlines and folded-header whitespace in MIME header strings" lib/Mail/SpamAssassin/Message.pm lib/Mail/SpamAssassin/Message/Node.pm lib/Mail/SpamAssassin/Plugin/MIMEHeader.pm t/mimeheader.t Sending lib/Mail/SpamAssassin/Message/Node.pm Sending lib/Mail/SpamAssassin/Message.pm Sending lib/Mail/SpamAssassin/Plugin/MIMEHeader.pm Sending t/mimeheader.t Transmitting file data .... Committed revision 465206.