SA Bugzilla – Bug 3816
[review] Add an X-header to know when added a Subject header
Last modified: 2004-11-16 07:50:53 UTC
This comes from thinking about bug 3605. The objection to adding a Subject header when none exists seems to be that there is no way to tell that the header did not previously exist; thus there is no way to safely strip the added Subject header back out when recreating the pristine mail.

The obvious solution would seem to be to record when the header was added because it did not previously exist. I can see two potential, fairly trivial, and completely reversible ways to do this.

1. Add an X-Spam-Added-Subject-Header: Yes
2. Since there was no Subject header, there was no subject header text. Create a Subject header that includes unique text such as "Subject Header added by SpamAssassin", in addition to whatever the 'rewrite' rules require for the Subject header.

With either method, SA can now reasonably unequivocally determine that it added a subject header where none had previously existed, and remove it when rebuilding the pristine message.

(For the objection that either of these headers can be faked in the original mail, and the result would then not be identical to the original mail, I submit that X-Spam-Status and the like get stripped from the original mail anyway, and can not be put back when rebuilding the pristine mail. Thus, even now it is impossible to accurately rebuild the original message if it contains certain SA-generated text/headers. This would be in no way different.)
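As a rough illustration of option 1, here is a minimal Python sketch (not SpamAssassin code, which is Perl). The function names tag_subject and remove_markup are hypothetical, the tag text is simply the one used in this discussion, and the marker header name is the one proposed above.

```python
from email import message_from_string

TAG = "*****SPAM*****"
MARKER = "X-Spam-Added-Subject-Header"  # header name proposed in this report


def tag_subject(msg):
    """Prefix the Subject with TAG; create the header plus a marker if absent."""
    if msg["Subject"] is None:
        # No Subject at all: add one and record that we did so.
        msg["Subject"] = TAG
        msg[MARKER] = "Yes"
    else:
        msg.replace_header("Subject", f"{TAG} {msg['Subject']}")
    return msg


def remove_markup(msg):
    """Undo tag_subject, dropping the Subject entirely if we created it."""
    if msg[MARKER] is not None:
        del msg[MARKER]
        del msg["Subject"]
    elif msg["Subject"] is not None and msg["Subject"].startswith(TAG + " "):
        msg.replace_header("Subject", msg["Subject"][len(TAG) + 1:])
    return msg
```

A message with no Subject comes back from remove_markup with no Subject; a message with "Subject: foo" comes back with "Subject: foo". As noted above, a forged marker header in the incoming mail would defeat this.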
Subject: Re: New: Add an X-header to know when added a Subject header

I'm not really in favor of adding a new X-Spam header. I'd rather just add a header in a way we could recognize it as an empty header, like perhaps the difference between:

  missing Subject -> "Subject: *****SPAM*****-(no subject)"
  "Subject: foo" -> "Subject: *****SPAM***** foo"
Subject: Re: Add an X-header to know when added a Subject header

> I'm not really in favor of adding a new X-Spam header. I'd rather just
> add a header in a way we could recognize it as an empty header, like
> perhaps the difference between:

Fine by me. I mostly wanted to bring up the idea that there are several possible and simple ways around the problem of "how do we know we have to remove the subject line?"

I do consider not adding the subject header when it wasn't there to be a moderately severe drawback for some people. Subject-less spam is becoming more common, and this is the traditional place to put spam tagging. If spammers can get around SA completely (for some significant amount of users) simply by leaving the subject off, I'm sure they will do that.

> missing Subject -> "Subject: *****SPAM*****-(no subject)"
> "Subject: foo" -> "Subject: *****SPAM***** foo"

Note that the dash (which may have been accidental) in the first format would be required. I receive about 10 spams a day with a subject of "(no subject)". I would personally prefer that SA specifically stated in the text that it had added the subject.

Loren
Subject: Re: Add an X-header to know when added a Subject header

On Thu, Sep 23, 2004 at 09:52:42PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> more common, and this is the traditional place to put spam tagging. If
> spammers can get around SA completely (for some significant amount of users)
> simply by leaving the subject off, I'm sure they will do that.

A correction: the spam isn't getting around SA at all. the spam is "getting around" the user's simplistic MUA filter. the problem here is 100% *not* a SpamAssassin bug (rewrite_header is doing exactly what it says it's going to do, it rewrites the header -- if the header doesn't exist, there's nothing to rewrite.)

I'm not generally against the idea of adding the header, as long as remove_markup can remove it reliably, and we don't need to kluge in some way to make it work. We spent enough time 2.6->3.0 removing kluges that I'd rather not add new ones in.
Subject: Re: Add an X-header to know when added a Subject header Ha! It was intentional, but I should have mentioned that. :-) The '-' instead of ' ' makes it 100% clear that SA added the header. (The only way you'd see "Subject: *****SPAM***** (no subject)" would be if some spammer sent the spam with a tag already in it and hey, that's fine with me!)
Subject: Re: Add an X-header to know when added a Subject header Also, I definitely agree with Theo about "no" on kludges. I think my solution would work fine, you just remove with a slightly looser regular expression.
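For what it's worth, the "slightly looser regular expression" idea could look something like the following Python sketch. This is only an illustration of the dash convention proposed earlier in this thread, not SpamAssassin's actual (Perl) code; tag and untag are hypothetical helper names, and None stands for "no Subject header at all".

```python
import re

TAG = "*****SPAM*****"
ADDED = f"{TAG}-(no subject)"  # the dash marks a Subject created by the tagger

# Either the dash form (header was added) or "TAG <original subject>".
_PATTERN = re.compile(
    re.escape(TAG) + r"(?:(?P<added>-\(no subject\))| (?P<orig>.*))",
    re.DOTALL,
)


def tag(subject):
    """Return the tagged Subject text; subject is None when the header is missing."""
    return ADDED if subject is None else f"{TAG} {subject}"


def untag(subject):
    """Return the original Subject text, or None if the header was added by tag()."""
    m = _PATTERN.fullmatch(subject)
    if m is None:
        return subject  # not tagged; leave untouched
    if m.group("added"):
        return None  # header did not exist originally
    return m.group("orig")
```

A spam that really arrived with "Subject: (no subject)" is tagged with a space rather than a dash, so it still restores to "(no subject)" and is not mistaken for an added header.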
Subject: Re: Add an X-header to know when added a Subject header

On Thu, Sep 23, 2004 at 10:20:19PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> Also, I definitely agree with Theo about "no" on kludges. I think my
> solution would work fine, you just remove with a slightly looser regular
> expression.

In 3.0 it actually would be more difficult since we now take the original header and put it in X-Spam-Prev-header with the original one. Then we don't need to do RE and just move the values around. FYI
Subject: Re: Add an X-header to know when added a Subject header

> A correction: the spam isn't getting around SA at all. the spam is "getting
> around" the user's simplistic MUA filter.

Correction accepted. Though as far as the overall food chain of spam removal is concerned it is a problem for many users.

> the problem here is 100% *not* a
> SpamAssassin bug (rewrite_header is doing exactly what it says it's going to
> do, it rewrites the header -- if the header doesn't exist, there's nothing to
> rewrite.)

Never said it was an *SA* problem, I tagged it as an Enhancement. SA seems to me the most reasonable place to "fix" this overall mail-food-chain problem though.
Subject: Re: Add an X-header to know when added a Subject header

> Also, I definitely agree with Theo about "no" on kludges. I think my
> solution would work fine, you just remove with a slightly looser regular
> expression.

I'll agree with that. I'm generally against kludges myself, but in this case I question whether what we are talking about is a kludge.

To be pedantic about the definition of "rewrite_header" would indeed mean that it is a kludge. But it seems clear from observation that what most *users* feel it should mean is "reliably install this markup so I can see it is spam". This clearly isn't the original developer's intent. But standing about 30 feet back on the fence, I can't say I consider the viewpoint to be at all unreasonable.

As a maker of a number of commercial products myself, I can say with confidence that the user's viewpoint about how a tool should work or be used will not always agree with the developer's ideas. I've found over the years that it is usually better to at least permit, if not necessarily internally agree with, the user's interpretation of how things should be used. About the only exception is when the user's desires are in complete conflict with the internal architecture.

I don't think that really pertains here; this is more a matter of semantics: how strictly or loosely do we interpret the meaning of the word "rewrite". My personal belief *in this case* would be that we accept that it means "darnit, do whatever you need to do to install this here markup".
Subject: Re: Add an X-header to know when added a Subject header

> In 3.0 it actually would be more difficult since we now take the original
> header and put it in X-Spam-Prev-header with the original one. Then we don't
> need to do RE and just move the values around.

Eh? I'm not sure I followed that. Are you saying 3.0 makes an X-Spam-Prev-Subject: header with the original subject before installing header markup?

If so, then all this "how to tag" stuff seems moot. There ain't gonna be an X-Spam-Prev-Subject header if there was no previous Subject. That seems like a pretty good clue about what the result should be after remove_markup.
Subject: Re: Add an X-header to know when added a Subject header

On Thu, Sep 23, 2004 at 10:42:34PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> Eh? I'm not sure I followed that. Are you saying 3.0 makes an
> X-Spam-Prev-Subject: header with the original subject before installing
> header markup?

Right.

> If so, then all this "how to tag" stuff seems moot. There ain't gonna be an
> X-Spam-Prev-Subject header if there was no previous Subject. That seems
> like a pretty good clue about what the result should be after remove_markup.

Well.. Not really. There doesn't need to be a Prev version, per backward compatibility. So the lack of the header doesn't tell you anything in and of itself. Quinlan's thing would work for now, but relies on the backward compatibility code to work (fine for now, but bad juju in general). Hence my kluge comment -- the requested behavior doesn't really fit in with the 3.0 way of doing things without doing something special just for this situation.
How about we say that restoring marked mail to its pristine state works in every case except the backward compatibility case of marking the Subject header and not adding X-Spam-Prev-Subject. If someone really wants to use the compatibility mode then they will have to live with creating an empty Subject header if they want to recreate pristine mail from one marked as spam that had no Subject. In terms of usability that is much better than giving a user some spam that does not have Subject: *****SPAM***** in it. The failure case this way is more rare and benign in its effects.

If there is an X-Spam-Prev-Subject header it would be easy to give it a special value to indicate that there was no Subject header. Since it will not look like the tail of the Subject header it will be obvious that it did not come from the Subject header.
+1 on Sidney's suggestion
As a user who was about to report this "bug"... I gotta say that as far as the end user is concerned, just add the damn subject line so I can trash the emails by looking for ****SPAM**** :-)

Getting back to the original email of either a blank Subject: line or no Subject: line at all seems rather picayune to me. In either case, there is no useful subject, and in either case, it's almost-for-sure JUNK that should just be thrown away anyway. Does anybody really care whether it was no subject line at all versus a blank one? It's trash either way.

Now all I gotta do is get my web host to upgrade SA and call it done. :-) Actually, I'll just modify my PHP IMAP script that does my custom filtering (post-SA) to catch this one and trash anything with no subject line, but I'll prod him to upgrade anyway, to benefit the other users.

Bottom Line: Most mail clients don't have filters flexible enough to parse the raw spam score. Many webhosts allow end users to configure SA to rewrite the Subject based on the score, to allow the end user to decide what is an acceptable level of junk to wade through without losing real email, and giving that flexibility to the end user is, in general, a Good Thing (tm). Thus, the end user is relying on the behaviour that junk email gets the ****SPAM**** marker in the subject line, based on the SA score, in order to configure their filters to throw away the trash.

PS I'm looking for a good de-biff function so I can detect intentional mis-spellings of junk words... Considering Soundex and similar functions, as well as simple substitution ciphers comparing to a dictionary of bad words, but open to other suggestions or refinements.
Isn't "Missing Subject" a rule that gets hit? If so, that should be enough to tell you that the current header was added by SA.
*** Bug 3605 has been marked as a duplicate of this bug. ***
Created attachment 2509
implementation of fix

OK, here's the fix, as per Sidney's comment that we all seemed to agree on at last ;)

quick notes:

- it uses a magic string, "(nonexistent)", in the X-Spam-Prev-Subject header to indicate that the header didn't exist previously
- it will *not* remove the Subject header if the X-Spam-Prev-Subject contains "(nonexistent)" but the Subject does not; so avoiding whatever the forging issue was
- it documents the behaviour in Mail::SpamAssassin::Conf
- there's a t script to test it
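For readers without the attachment, the behaviour in the notes above can be approximated by the following Python sketch. It is not the attached patch (SpamAssassin is Perl). tag_subject and remove_markup are hypothetical names, and two details are assumptions of the sketch rather than facts from this bug: that the added Subject itself carries the "(nonexistent)" text after the tag, and that X-Spam-Prev-Subject is always dropped on removal.

```python
from email import message_from_string

TAG = "*****SPAM*****"
PREV = "X-Spam-Prev-Subject"
NONEXISTENT = "(nonexistent)"  # magic string from the notes above


def tag_subject(msg):
    """Tag the Subject, saving the original (or the magic string) in PREV."""
    original = msg["Subject"]
    if original is None:
        msg[PREV] = NONEXISTENT
        # Assumption: the added Subject also contains the magic string.
        msg["Subject"] = f"{TAG} {NONEXISTENT}"
    else:
        msg[PREV] = original
        msg.replace_header("Subject", f"{TAG} {original}")
    return msg


def remove_markup(msg):
    """Restore the pristine Subject state from PREV, if PREV is present."""
    prev = msg[PREV]
    if prev is None:
        return msg  # backward-compatibility case: nothing recorded
    subject = msg["Subject"] or ""
    if prev == NONEXISTENT:
        # Only drop the Subject if it also carries the magic string;
        # otherwise PREV may have been forged, so leave Subject alone.
        if NONEXISTENT in subject:
            del msg["Subject"]
    elif msg["Subject"] is None:
        msg["Subject"] = prev
    else:
        msg.replace_header("Subject", prev)
    del msg[PREV]  # assumption: the saved header is always removed
    return msg
```

The forgery guard is the second branch: a mail arriving with a real Subject and a faked "X-Spam-Prev-Subject: (nonexistent)" keeps its Subject after remove_markup.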
+1
applied! r76069