Bug 3816 - [review] Add an X-header to know when added a Subject header
Summary: [review] Add an X-header to know when added a Subject header
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: spamassassin
Version: SVN Trunk (Latest Devel Version)
Hardware: Other other
Importance: P3 enhancement
Target Milestone: 3.0.2
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Duplicates: 3605
Depends on:
Blocks:
 
Reported: 2004-09-23 19:34 UTC by Loren Wilton
Modified: 2004-11-16 07:50 UTC



Attachments:
  implementation of fix (patch), status: None, submitter: Justin Mason [HasCLA]

Description Loren Wilton 2004-09-23 19:34:18 UTC
This comes from thinking about bug 3605.

The objection to adding a Subject header when none exists seems to be that 
there is no way to tell that the header did not previously exist; thus there is 
no way to safely strip the added Subject header back out when recreating the 
pristine mail.

The obvious solution would seem to be to know when the header was added when it 
did not previously exist.

I can see two potential fairly trivial and completely reversible ways to do 
this.

1. Add an X-Spam-Added-Subject-Header: Yes

2. Since there was no Subject header, there was no subject header text.  Create 
a Subject header that includes unique text such as "Subject Header added by 
SpamAssassin", in addition to whatever the 'rewrite' rules require for the 
Subject header.

With either method, SA can now reasonably unequivocally determine that it added 
a subject header where none had previously existed, and remove it when 
rebuilding the pristine message.

(For the objection that either of these headers can be faked in the original 
mail, and the result would then not be identical to the original mail, I submit 
that X-Spam-Status and the like get stripped from the original mail anyway, and 
can not be put back when rebuilding the pristine mail.  Thus, even now it is 
impossible to accurately rebuild the original message if it contains certain 
SA-generated text/headers.  This would be in no way different.)
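
Option 1 above could look roughly like the following. This is only a sketch in Python using the standard email.message.Message class, not SpamAssassin's actual Perl code. The function names mark_up and remove_markup are hypothetical, the tag text is assumed, and the marker header name is the one proposed above.

```python
from email.message import Message

MARKER = "X-Spam-Added-Subject-Header"  # header name proposed in this report
TAG = "*****SPAM*****"                  # assumed tag text


def mark_up(msg: Message) -> None:
    """Tag the Subject; record a marker header if the Subject had to be created."""
    original = msg["Subject"]           # None when the header is missing
    del msg["Subject"]                  # no error if the header is absent
    if original is None:
        msg["Subject"] = TAG
        msg[MARKER] = "Yes"
    else:
        msg["Subject"] = f"{TAG} {original}"


def remove_markup(msg: Message) -> None:
    """Undo mark_up, dropping the Subject entirely if the marker says it was added."""
    subject = msg["Subject"]
    added = msg[MARKER] == "Yes"
    del msg[MARKER]
    if subject is None:
        return
    del msg["Subject"]
    if added:
        return                          # Subject did not exist originally
    prefix = TAG + " "
    msg["Subject"] = subject[len(prefix):] if subject.startswith(prefix) else subject
```

As noted above, a message that arrives with a forged marker header would not round-trip exactly; the sketch makes no attempt to handle that case.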
Comment 1 Daniel Quinlan 2004-09-23 20:38:50 UTC
Subject: Re:  New: Add an X-header to know when added a Subject header

I'm not really in favor of adding a new X-Spam header.  I'd rather just
add a header in a way we could recognize it as an empty header, like
perhaps the difference between:

missing Subject  ->  "Subject: *****SPAM*****-(no subject)"
"Subject: foo"   ->  "Subject: *****SPAM***** foo"

Comment 2 Loren Wilton 2004-09-23 21:52:41 UTC
Subject: Re:  Add an X-header to know when added a Subject header

> I'm not really in favor of adding a new X-Spam header.  I'd rather just
> add a header in a way we could recognize it as an empty header, like
> perhaps the difference between:

Fine by me.  I mostly wanted to bring up the idea that there are several
possible and simple ways around the problem of "how do we know we have to
remove the subject line?"

I do consider not adding the subject header when it wasn't there to be a
moderately severe drawback for some people.  Subject-less spam is becoming
more common, and this is the traditional place to put spam tagging.  If
spammers can get around SA completely (for some significant amount of users)
simply by leaving the subject off, I'm sure they will do that.


> missing Subject  ->  "Subject: *****SPAM*****-(no subject)"
> "Subject: foo"   ->  "Subject: *****SPAM***** foo"

Note that the dash (which may have been accidental) in the first format would be
required.  I receive about 10 spams a day with a subject of "(no subject)".
I would personally prefer that SA specifically stated in the text that it
had added the subject.

        Loren

Comment 3 Theo Van Dinter 2004-09-23 22:06:23 UTC
Subject: Re:  Add an X-header to know when added a Subject header

On Thu, Sep 23, 2004 at 09:52:42PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> more common, and this is the traditional place to put spam tagging.  If
> spammers can get around SA completely (for some significant amount of users)
> simply by leaving the subject off, I'm sure they will do that.

A correction: the spam isn't getting around SA at all.  the spam is "getting
around" the user's simplistic MUA filter.  the problem here is 100% *not* a
SpamAssassin bug (rewrite_header is doing exactly what it says it's going to
do, it rewrites the header -- if the header doesn't exist, there's nothing to
rewrite.)

I'm not generally against the idea of adding the header, as long as
remove_markup can remove it reliably, and we don't need to kluge in some
way to make it work.  We spent enough time 2.6->3.0 removing kluges that
I'd rather not add new ones in.

Comment 4 Daniel Quinlan 2004-09-23 22:18:34 UTC
Subject: Re:  Add an X-header to know when added a Subject header

Ha!  It was intentional, but I should have mentioned that.  :-)

The '-' instead of ' ' makes it 100% clear that SA added the header.
(The only way you'd see "Subject: *****SPAM***** (no subject)" would be
if some spammer sent the spam with a tag already in it and hey, that's
fine with me!)

Comment 5 Daniel Quinlan 2004-09-23 22:20:18 UTC
Subject: Re:  Add an X-header to know when added a Subject header

Also, I definitely agree with Theo about "no" on kludges.  I think my
solution would work fine, you just remove with a slightly looser regular
expression.
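
For illustration, the dash-versus-space scheme from comments 1 and 4 can be sketched as below. This is Python rather than SpamAssassin's Perl, and the names tag_subject and untag_subject are hypothetical; only the two Subject formats come from the comments above.

```python
import re
from typing import Optional

TAG = "*****SPAM*****"
PLACEHOLDER = "(no subject)"

# "-" joins the tag to the placeholder only when the header had to be created.
ADDED = re.compile(re.escape(TAG) + "-" + re.escape(PLACEHOLDER))
# " " joins the tag to whatever Subject text the message already had.
REWRITTEN = re.compile(re.escape(TAG) + " (.*)", re.DOTALL)


def tag_subject(subject: Optional[str]) -> str:
    """Return the tagged Subject value; None means the header was missing."""
    if subject is None:
        return f"{TAG}-{PLACEHOLDER}"
    return f"{TAG} {subject}"


def untag_subject(tagged: str) -> Optional[str]:
    """Return the original Subject value, or None if the header was added."""
    if ADDED.fullmatch(tagged):
        return None
    match = REWRITTEN.fullmatch(tagged)
    return match.group(1) if match else tagged
```

A message that really did arrive with the subject "(no subject)" is tagged with a space, so it is restored as text rather than being mistaken for an added header.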

Comment 6 Theo Van Dinter 2004-09-23 22:28:04 UTC
Subject: Re:  Add an X-header to know when added a Subject header

On Thu, Sep 23, 2004 at 10:20:19PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> Also, I definitely agree with Theo about "no" on kludges.  I think my
> solution would work fine, you just remove with a slightly looser regular
> expression.

In 3.0 it actually would be more difficult since we now take the original
header and put it in X-Spam-Prev-header with the original one.  Then we don't
need to do RE and just move the values around.

FYI

Comment 7 Loren Wilton 2004-09-23 22:29:14 UTC
Subject: Re:  Add an X-header to know when added a Subject header

> A correction: the spam isn't getting around SA at all.  the spam is "getting
> around" the user's simplistic MUA filter.

Correction accepted.

Though as far as the overall food chain of spam removal is concerned it is a
problem for many users.


> the problem here is 100% *not* a
> SpamAssassin bug (rewrite_header is doing exactly what it says it's going to
> do, it rewrites the header -- if the header doesn't exist, there's nothing to
> rewrite.)

Never said it was an *SA* problem; I tagged it as an Enhancement.
SA seems to me the most reasonable place to "fix" this overall
mail-food-chain problem though.

Comment 8 Loren Wilton 2004-09-23 22:39:08 UTC
Subject: Re:  Add an X-header to know when added a Subject header

> Also, I definitely agree with Theo about "no" on kludges.  I think my
> solution would work fine, you just remove with a slightly looser regular
> expression.

I'll agree with that.  I'm generally against kludges myself, but in this
case I question  whether what we are talking about is a kludge.

To be pedantic about the definition of "rewrite_header" would indeed mean
that it is a kludge.  But it seems clear from observation that what most
*users* feel it should mean is "reliably install this markup so I can see it
is spam".  This clearly isn't the original developer's intent.  But standing
about 30 feet back on the fence, I can't say I consider the viewpoint to be
at all unreasonable.

As a maker of a number of commercial products myself, I can say with
confidence that the user's viewpoint about how a tool should work or be used
will not always agree with the developer's ideas.  I've found over the years
that it is usually better to at least permit, if not necessarily internally
agree with, the user's interpretation of how things should be used.  About
the only exception is when the user's desires are in complete conflict with
the internal architecture.  I don't think that really pertains here; this is
more a matter of semantics: how strictly or loosely do we interpret the
meaning of the word "rewrite".  My personal belief *in this case* would be
that we accept that it means "darnit, do whatever you need to do to install
this here markup".

Comment 9 Loren Wilton 2004-09-23 22:42:33 UTC
Subject: Re:  Add an X-header to know when added a Subject header

> In 3.0 it actually would be more difficult since we now take the original
> header and put it in X-Spam-Prev-header with the original one.  Then we don't
> need to do RE and just move the values around.

Eh?  I'm not sure I followed that.  Are you saying 3.0 makes an
X-Spam-Prev-Subject: header with the original subject before installing
header markup?

If so, then all this "how to tag" stuff seems moot.  There ain't gonna be an
X-Spam-Prev-Subject header if there was no previous Subject.  That seems
like a pretty good clue about what the result should be after remove_markup.

Comment 10 Theo Van Dinter 2004-09-23 22:59:02 UTC
Subject: Re:  Add an X-header to know when added a Subject header

On Thu, Sep 23, 2004 at 10:42:34PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> Eh?  I'm not sure I followed that.  Are you saying 3.0 makes an
> X-Spam-Prev-Subject: header with the original subject before installing
> header markup?

Right.

> If so, then all this "how to tag" stuff seems moot.  There ain't gonna be an
> X-Spam-Prev-Subject header if there was no previous Subject.  That seems
> like a pretty good clue about what the result should be after remove_markup.

Well..  Not really.  There doesn't need to be a Prev version, per backward
compatibility.  So the lack of the header doesn't tell you anything in and of
itself.

Quinlan's thing would work for now, but relies on the backward
compatibility code to work (fine for now, but bad juju in general).

Hence my kluge comment -- the requested behavior doesn't really fit in with
the 3.0 way of doing things without doing something special just for this
situation.

Comment 11 Sidney Markowitz 2004-09-24 00:05:19 UTC
How about we say that restoring marked mail to its pristine state works in every
case except the backward compatibility case of marking the Subject header and
not adding X-Spam-Prev-Subject. If someone really wants to use the compatibility
mode then they will have to live with creating an empty Subject header if they
want to recreate pristine mail from one marked as spam that had no Subject.

In terms of usability that is much better than giving a user some spam that does
not have Subject: *****SPAM***** in it. The failure case this way is more rare
and benign in its effects.

If there is an X-Spam-Prev-Subject header it would be easy to give it a special
value to indicate that there was no Subject header. Since it will not look like
the tail of the Subject header it will be obvious that it did not come from the
Subject header.
Comment 12 Justin Mason 2004-09-28 14:37:18 UTC
+1 on Sidney's suggestion
Comment 13 Richard Lynch 2004-10-19 12:53:43 UTC
As a user who was about to report this "bug"...

I gotta say that as far as the end user is concerned, just add the damn subject
line so I can trash the emails by looking for ****SPAM**** :-)

Getting back to the original email of either a blank Subject: line or no
Subject: line at all seems rather picayune to me.  In either case, there is no
useful subject, and in either case, it's almost-for-sure JUNK that should just
be thrown away anyway.

Does anybody really care whether it was no subject line at all versus a blank
one?  It's trash either way.

Now all I gotta do is get my web host to upgrade SA and call it done. :-)

Actually, I'll just modify my PHP IMAP script that does my custom filtering
(post-SA) to catch this one and trash anything with no subject line, but I'll
prod him to upgrade anyway, to benefit the other users.

Bottom Line:

Most mail clients don't have filters flexible enough to parse the raw spam score.

Many webhosts allow end users to configure SA to rewrite the Subject based on
the score, to allow the end user to decide what is an acceptable level of junk
to wade through without losing real email, and giving that flexibility to the
end user is, in general, a Good Thing (tm).

Thus, the end user is relying on the behaviour that junk email gets the
****SPAM**** marker in the subject line, based on the SA score, in order to
configure their filters to throw away the trash.

PS  I'm looking for a good de-biff function so I can detect intentional
mis-spellings of junk words...  Considering Soundex and similar functions, as
well as simple substitution ciphers comparing to a dictionary of bad words, but
open to other suggestions or refinements.
Comment 14 Jason J Ellingson 2004-10-26 15:21:51 UTC
Isn't "Missing Subject" a rule that gets hit?  If so, that should be enough to 
tell you that the current header was added by SA.
Comment 15 Daniel Quinlan 2004-10-26 15:24:17 UTC
*** Bug 3605 has been marked as a duplicate of this bug. ***
Comment 16 Justin Mason 2004-11-05 16:28:07 UTC
Created attachment 2509 [details]
implementation of fix

OK, here's the fix, as per Sidney's comment that we all seemed to agree on at
last ;)

quick notes:

  - it uses a magic string, "(nonexistent)", in the X-Spam-Prev-Subject header
    to indicate that the header didn't exist previously
  - it will *not* remove the Subject header if the X-Spam-Prev-Subject contains
    "(nonexistent)" but the Subject does not; so avoiding whatever the forging
    issue was
  - it documents the behaviour in Mail::SpamAssassin::Conf
  - there's a t script to test it
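
The attachment itself is not reproduced in this report, so the following is only a sketch of the behaviour the notes describe, written in Python with the standard email.message.Message class instead of SpamAssassin's Perl. The function names are hypothetical. The shape of the added Subject (tag followed by the magic string) is an assumption, chosen so that the second note can be read literally: the Subject is removed only when both headers contain "(nonexistent)".

```python
from email.message import Message

TAG = "*****SPAM*****"
PREV = "X-Spam-Prev-Subject"
MAGIC = "(nonexistent)"


def mark_up(msg: Message) -> None:
    """Tag the Subject and remember the original value in X-Spam-Prev-Subject."""
    original = msg["Subject"]              # None when the header is missing
    del msg["Subject"]
    if original is None:
        msg[PREV] = MAGIC
        msg["Subject"] = f"{TAG} {MAGIC}"  # assumed format for an added Subject
    else:
        msg[PREV] = original
        msg["Subject"] = f"{TAG} {original}"


def remove_markup(msg: Message) -> None:
    """Restore the Subject from X-Spam-Prev-Subject where that is safe."""
    prev = msg[PREV]
    if prev is None:
        return                             # nothing recorded, nothing to restore
    subject = msg["Subject"] or ""
    del msg[PREV]
    if prev == MAGIC:
        if MAGIC in subject:
            del msg["Subject"]             # header was added by the tagger
        return                             # otherwise leave the Subject alone
    del msg["Subject"]
    msg["Subject"] = prev
```

In this sketch a message whose real Subject was exactly "(nonexistent)" cannot be told apart from one that had no Subject at all.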
Comment 17 Michael Parker 2004-11-16 12:28:01 UTC
+1
Comment 18 Theo Van Dinter 2004-11-16 12:57:23 UTC
+1
Comment 19 Justin Mason 2004-11-16 16:50:53 UTC
applied! r76069