SA Bugzilla – Bug 1544
Subject header rewritten incorrectly
Last modified: 2003-03-13 10:34:37 UTC
With Subject header rewriting enabled, an incomming spam with the subject header in all caps, eg: SUBJECT: blah blah When rewritten, results in a new subject header being added, eg: Subject: ***SPAM*** blah blah but the original subject header is not removed, so two (different) subject headers remain in the email causing undefined display behaviour in email clients. According to Theo Van Dinter this is only occuring with report_safe 0.
Created attachment 673 [details] suggested patch
Subject: Re: [SAdev] New: Subject header rewritten incorrectly On Sun, Feb 23, 2003 at 04:30:07PM -0800, bugzilla-daemon@hughes-family.org wrote: > According to Theo Van Dinter this is only occuring with report_safe 0. Yeah, we don't rewrite the subject with 'report_safe 1' ... :) Anyway, the bug is because NoMailAudit::replace_header() only deletes the header case sensitive as passed to it (ie: Subject), and not all headers case insensitive that are found. I have a suggested patch done up, but due to issues with the bugzilla machine I can't actually send it in yet... :(
Subject: Re: [SAdev] Subject header rewritten incorrectly On Sun, Feb 23, 2003 at 06:32:34PM -0800, bugzilla-daemon@hughes-family.org wrote: > Yeah, we don't rewrite the subject with 'report_safe 1' ... :) err... replace, not rewrite. (we do rewrite the subject, but report_safe has a new header section so there's no replacing going on...)
I've applied this to 2.60 already. sending to malte for 2.50
ISSUE: Do we have an example message that triggers this bug? I am unable to reproduce it. Also, should this: if ( defined $header_name_only ) { be written as: if (defined $header_name_only && $header_name_only) { <grumble>Can you not put spaces after the parens? It's that way everywhere else in the tree (well, 98% of the time).</grumble> You can also use a substr test with index instead of the !/^\Q$dhdr\E:/ thing. When occuring in common code (more common than here), it can be a real boost and it's a bit less spooky.
ISSUE: this fix puts the Subject at the bottom of the message which seems bad to me. If report_safe is turned off, then I think people are aiming for minimal modification of the message. Is it possible to rewrite the header in place even if the Subject case changes? Ignore my previous about reproducing it. I had rewrite_subject set to 0. Duh.
Subject: Re: [SAdev] Subject header rewritten incorrectly On Tue, Feb 25, 2003 at 05:49:04AM -0800, bugzilla-daemon@hughes-family.org wrote: > ISSUE: this fix puts the Subject at the bottom of the message which seems > bad to me. If report_safe is turned off, then I think people are aiming for > minimal modification of the message. Is it possible to rewrite the header > in place even if the Subject case changes? Well, older versions have the same behavior (new header at the end), so as far as the bug fix is concerned, there's no change. We just make sure to get rid of the old header. As for modification of the message, I'm not sure what the issue is with moving the Subject around. Sure, it means the header is modified, but 1) if this is an issue use report_safe, and 2) there's no specification that says the subject can't be the last line -- as long as it's valid the end user shouldn't notice. For 2.60, we should be able to record the location in header_order before we delete the original one. We can then move it around after the header is actually replaced. But that's an enhancement request. ;)
Created attachment 678 [details] sample message
Subject: Re: [SAdev] Subject header rewritten incorrectly On Tue, Feb 25, 2003 at 05:43:06AM -0800, bugzilla-daemon@hughes-family.org wrote: > ISSUE: Do we have an example message that triggers this bug? I am unable to > reproduce it. I just took a generic piece of spam and made the Subject header name uppercase. I'll attach the one I've been using. > Also, should this: > > if ( defined $header_name_only ) { > > be written as: > > if (defined $header_name_only && $header_name_only) { It doesn't really matter for 2.5x (nothing else calls the sub with that parameter, so just defined is enough), but the change should be made for completeness. I've already added it to 2.60. > <grumble>Can you not put spaces after the parens? It's that way everywhere > else in the tree (well, 98% of the time).</grumble> Do you mean inside the parens? "after the parens" is in the rest of the code too. Either way, why does this matter, as long as we don't start doing: if ( some_crap() ) { } eek! When you have multiple developers working on the same code, you tend to see their individual styles come out. I'd personally like to put all the code through "perltidy -i=2". Clean everything up in one shot. :) > You can also use a substr test with index instead of the !/^\Q$dhdr\E:/ thing. > When occuring in common code (more common than here), it can be a real boost > and it's a bit less spooky. Well, index is inefficient for this, but I don't know, I find a substr version more spooky than the RE: @{$self->{header_order}} = grep($dhdr eq substr($_,0,length $dhdr), @{$self->{header_order}}); Feel free to change the code if you feel it's a big enough issue. :)
Please use my correct mail addres
Subject: Re: [SAdev] Subject header rewritten incorrectly felicity@kluge.net writes: > Well, older versions have the same behavior (new header at the end), > so as far as the bug fix is concerned, there's no change. We just make > sure to get rid of the old header. Ah, okay. Consider that issue resolved. > As for modification of the message, I'm not sure what the issue is with > moving the Subject around. Sure, it means the header is modified, but > 1) if this is an issue use report_safe, and 2) there's no specification > that says the subject can't be the last line -- as long as it's valid > the end user shouldn't notice. RFC 2822 says headers "SHOULD NOT be reordered" although it's not an absolute requirement like almost everything else in that RFC. I don't remember anything that would prohibit fixing the case of a header name, though. It is important to note that the header fields are not guaranteed to be in a particular order. They may appear in any order, and they have been known to be reordered occasionally when transported over the Internet. However, for the purposes of this standard, header fields SHOULD NOT be reordered when a message is transported or transformed. More importantly, the trace header fields and resent header fields MUST NOT be reordered, and SHOULD be kept in blocks prepended to the message. See sections 3.6.6 and 3.6.7 for more information. We should fix 2.60 to follow the RFC.
Subject: Re: [SAdev] Subject header rewritten incorrectly On Tue, Feb 25, 2003 at 07:06:14AM -0800, bugzilla-daemon@hughes-family.org wrote: > AssignedTo|mss@msquadrat.de |spamassassin- > | |contrib@msquadrat.de > > > > ------- Additional Comments From spamassassin-contrib@msquadrat.de 2003-02-25 07:06 ------- > Please use my correct mail addres sorry, I can never remember your address, and a grep of the spamassassin list has a number of different addresses, so I just picked one.
Subject: Re: [SAdev] Subject header rewritten incorrectly > Do you mean inside the parens? "after the parens" is in the rest of > the code too. Either way, why does this matter, as long as we don't > start doing: > > if ( some_crap() ) > { > > } > > eek! Yes, I meant inside the parens just your example there that and in your proposed patch. Inside the parens is only used about 4% of the time in our code (it has grown). > When you have multiple developers working on the same code, you tend to > see their individual styles come out. I'd personally like to put all > the code through "perltidy -i=2". Clean everything up in one shot. :) Is perltidy really capable of sanely indenting complicated perl code? I use Emacs to do my editing and I often have to nudge it to get good formatting. > Well, index is inefficient for this, but I don't know, I find a substr > version more spooky than the RE: Sorry, rindex: @{$self->{header_order}} = grep(!rindex($_,$dhdr,0), @{$self->{header_order}});
Subject: Re: Subject header rewritten incorrectly Well, if you're already arguing about style: On Tuesday 25 February 2003 19:38 CET Dan wrote: >[...] > Sorry, rindex: > > @{$self->{header_order}} = grep(!rindex($_,$dhdr,0), @{$self->{header_order}}); Better: @{$self->{header_order}} = grep { !rindex ($_, $dhdr) } @{$self->{header_order}}; The block expression makes clear that the rindex is applied on the grepped array. The space after the function name seems to be the style used in SA (I don't like it though). And spaces after the commata between the arguments increase readability a lot :o) Oh, and the zero for the last argument doesn't make any sense because rindex will look for $dhdr in $_ at or *before* position zero. Cheers, Malte
Subject: Re: [SAdev] Subject header rewritten incorrectly On Tue, Feb 25, 2003 at 10:38:50AM -0800, bugzilla-daemon@hughes-family.org wrote: > Yes, I meant inside the parens just your example there that and in your > proposed patch. Inside the parens is only used about 4% of the time in > our code (it has grown). heh heh. my sphere of influence is growing! ;) but seriously ... I don't think it really matters if there are spaces or not. It's not a large enough difference so people have a problem with the different code, and I've been coding with spaces within the parens for so many years I don't think about it anymore. My fingers just knock off the "if ( " by impulse. > Is perltidy really capable of sanely indenting complicated perl code? I > use Emacs to do my editing and I often have to nudge it to get good > formatting. So far it's worked reliably for me. I don't actually configure it beyond "-i=2" (indent 2 spaces) though. > Sorry, rindex: > > @{$self->{header_order}} = grep(!rindex($_,$dhdr,0), @{$self->{header_order}}); You know, I just noticed we're both wrong: @{$self->{header_order}} = grep($_ ne $dhdr, @{$self->{header_order}}); wow, that was simple. ;)
Subject: Re: [SAdev] Subject header rewritten incorrectly On Tue, Feb 25, 2003 at 11:05:11AM -0800, bugzilla-daemon@hughes-family.org wrote: > You know, I just noticed we're both wrong: > > @{$self->{header_order}} = grep($_ ne $dhdr, @{$self->{header_order}}); huh. you know, I just noticed that I was wrong ... ;) I thought header_order was just a list of headers in the order they should come out in, I didn't realize they had a count on them too. Well, of the suggestions listed then, index is better than rindex, and I'd search for "$dhdr:".
Subject: Re: [SAdev] Subject header rewritten incorrectly Malte writes: > The block expression makes clear that the rindex is applied on the > grepped array. The space after the function name seems to be the style > used in SA (I don't like it though). And spaces after the commata > between the arguments increase readability a lot :o) I don't like it either. I think the consensus is much less clear about spaces after function names. I think that might be Justin's style. I'm sure I have a quirk or two myself, but nobody has complained yet. :-) > Oh, and the zero for the last argument doesn't make any sense because > rindex will look for $dhdr in $_ at or *before* position zero. Actually, I did it that way on purpose. I *only* wanted to look at offset 0. Nothing is less than 0, so it reproduces exactly the expression in Theo's patch: !/^\Q$dhdr\E:/ but, it's faster. Your version allows it to look at any offset, so the "^" part of the RE is lost. Theo writes: > You know, I just noticed we're both wrong: > > @{$self->{header_order}} = grep($_ ne $dhdr, @{$self->{header_order}}); > > wow, that was simple. ;) That's not quite what your original code did, though. There was no $ in your RE. If there was a "$", then "ne" is clearly the way to go. :-) Incidentally, I left out a ":" in the rindex expression, though, so it should have read (also changed to != 0 since I think that's clearer) like this: @{$self->{header_order}} = grep(rindex($_,"$dhdr:",0) != 0, @{$self->{header_order}}); Daniel
Created attachment 681 [details] latest suggested patch to fix this
ISSUE: looks good except you want: rindex($_,"$dhdr:",0) != 0 instead of index($_,"$dhdr:") != 0 Otherwise, a $dhdr of "Subject" will match "Foo-Subject".
Oops, correction, with the last patch, a $dhdr of "Subject" will match "Foo-Subject", but the result won't be zero so it WILL work, it is just inefficient as Theo correctly stated earlier (because it will search the entire string). The rindex form does, however, solve Theo's concern about the inefficiency of index, though. OKAY: will actually work (if theoretically slower)
Okay, now I'm going to seem indecisive. ISSUE: doesn't work because it will remove "later in the header" matches. The rindex form is actually necessary.
Subject: Re: [SAdev] Subject header rewritten incorrectly On Tue, Feb 25, 2003 at 01:34:51PM -0800, bugzilla-daemon@hughes-family.org wrote: > Okay, now I'm going to seem indecisive. Hehe. You'd think this would be simple being a simple line of code. How many times have we changed it? ;) > ISSUE: doesn't work because it will remove "later in the header" matches. index() should work, but it's a little less efficient than the rindex() with position 0 (index() is actually better if you don't put the position in there). "Foo-Subject:" won't be caught. :) > The rindex form is actually necessary. To hopefully stop the discussion, I've put in the rindex() with position 0 into HEAD. I'll update the patch again, and hopefully that'll be it. :) Gah!
Created attachment 682 [details] ok, new version of the patch with rindex() and position 0 instead of index()
OKAY: no comment :-)
commited to 2.5 branch. closing bug.