Bug 3605 - rewrite_header will not rewrite a missing header
Summary: rewrite_header will not rewrite a missing header
Status: RESOLVED DUPLICATE of bug 3816
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: spamassassin (show other bugs)
Version: SVN Trunk (Latest Devel Version)
Hardware: Other other
: P3 normal
Target Milestone: 3.0.2
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
: 3618 3809 3867 4018 (view as bug list)
Depends on:
Blocks:
 
Reported: 2004-07-13 14:10 UTC by Dallas Engelken
Modified: 2004-12-03 07:39 UTC (History)
4 users (show)



Attachment Type Modified Status Actions Submitter/CLA Status
patch to rewrite_no_report_safe patch None Dallas Engelken [HasCLA]
try #2 patch None Dallas Engelken [HasCLA]

Note You need to log in before you can comment on or make changes to this bug.
Description Dallas Engelken 2004-07-13 14:10:03 UTC
rewrite_header Subject *****SPAM*****

will not rewrite the supplied header if the email does not contain the header 
specified.

Is this how it should be handled??  Should a missing Subject: header be added 
by the MTA or SA?

WITHOUT Subject:

[root@mailgw include]# echo -e "From:\n\n<a 
href='http://1.1.1.1/remove.cgi'>remove</A>\n" | spamc
From:
X-Spam-Flag: YES
X-Spam-Score: 7.2

WITH BLANK Subject:

[root@mailgw include]# echo -e "Subject:\nFrom:\n\n<a 
href='http://1.1.1.1/remove.cgi'>remove</A>\n" | spamc
Subject: *****SPAM*****
From:
X-Spam-Prev-Subject:
X-Spam-Flag: YES
X-Spam-Score: 7.2

WITH Subject:

[root@mailgw include]# echo -e "Subject: test\nFrom:\n\n<a 
href='http://1.1.1.1/remove.cgi'>remove</A>\n" | spamc
Subject: *****SPAM***** test
From:
X-Spam-Prev-Subject: test
X-Spam-Flag: YES
X-Spam-Score: 6.6

Dallas
Comment 1 Theo Van Dinter 2004-07-13 14:52:26 UTC
Subject: Re:  New: rewrite_header will not rewrite a missing header

On Tue, Jul 13, 2004 at 02:10:04PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> rewrite_header Subject *****SPAM*****
> 
> will not rewrite the supplied header if the email does not contain the header 
> specified.

Well, actually it's added if report_safe > 0 (original is encapsulated,
so not an issue), and it's not if report_safe 0 (safest option since
it'll be difficult to remove the markup and get the original message).

It does this by design. :)

> Is this how it should be handled??  Should a missing Subject: header be added 
> by the MTA or SA?

the mta shouldn't add one (it's not required), and sa works as above.


barring any discussion, this ticket can be closed.

Comment 2 Dallas Engelken 2004-07-15 06:03:52 UTC
well many people use the subject header rewriting for creating inbox rules.   i
think adding Subject: header when hits > reqd and !defined get_header('Subject')
should be required no matter what report_safe is.

when hits < reqd, i could care less, but it sucks getting mail that comes in
scoring over your required hits but there was no subject: header to begin with,
so the message sits in my inbox like a false-negative..  and since this is a
microsoft MUA i use here, i cant create any better rule to look at
X-Spam-Status/Flag.


Comment 3 Dallas Engelken 2004-07-15 07:10:45 UTC
Created attachment 2132 [details]
patch to rewrite_no_report_safe

if no subject header exists, and the message is classified as spam, this patch
will add a blank Subject: header so "rewrite_header Subject" will allow proper
tagging of spam.
Comment 4 Theo Van Dinter 2004-07-15 11:15:36 UTC
Subject: Re:  rewrite_header will not rewrite a missing header

On Thu, Jul 15, 2004 at 07:10:46AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> if no subject header exists, and the message is classified as spam, this patch
> will add a blank Subject: header so "rewrite_header Subject" will allow proper
> tagging of spam.

just to note, I'm -1 on this behavior.  imnsho, the fact that your
MUA can't filter on headers other than Subject isn't reason for us to
modify the message and add "standard" headers that weren't there in the
first place.  it's "rewrite_header", not "rewrite_or_add_header".

Comment 5 Justin Mason 2004-07-15 11:41:14 UTC
Subject: Re:  rewrite_header will not rewrite a missing header 

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


>> if no subject header exists, and the message is classified as spam,
>> this patch will add a blank Subject: header so "rewrite_header Subject"
>> will allow proper tagging of spam.
>
>just to note, I'm -1 on this behavior.  imnsho, the fact that your MUA
>can't filter on headers other than Subject isn't reason for us to modify
>the message and add "standard" headers that weren't there in the first
>place.  it's "rewrite_header", not "rewrite_or_add_header".

Well, I think I'm +1 -- IMO, if "rewrite_header Subject" is specified,
but the Subject header isn't present, it's very clear what the user
intends to happen, and we should do it.   "rewrite" is just a shorter
word than "rewrite_or_add".

However I'm -1 on the side-effect of adding a Subject: header for
spam if one doesn't already exist, unless "rewrite_header Subject"
is in the config.

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFA9s+zQTcbUG5Y7woRAmBnAKCbhl5WLrYvHrMbtOrJJSB93lm5SwCeIdOX
y3E9T/bV6pT7y/Iqtqdrfyk=
=bNIx
-----END PGP SIGNATURE-----

Comment 6 Theo Van Dinter 2004-07-15 12:29:22 UTC
ok, so I've been chatting with a coworker about this...

my issue is that we currently have no way to add in the header, and then mark that it should be 
removed when stripping our markup.  ie: we use "X-Spam-Prev-Header:" to specify the original header 
content, and upon stripping the markup, we simply delete "Header" and rename "X-Spam-Prev-Header" 
appropriately.

so then I thought about just adding "X-Spam-Create-Header", or something, and when we see that 
during the strip process, just delete it and "Header".  but then I figured that was too easy to fake in a 
message.

but in conversing more about it, we can't protect the faking thing anyway using 'Prev' and such, so why 
do I worry about 'Create'?

So here's my goal: I want "spamassassin < msg | spamassassin -d" and "cat msg" to basically return the 
same message.  I don't want headers added, I don't want headers removed, but I'll be ok with some 
headers shuffled in order.  If someone wants to do up a patch to add "X-Spam-Create-Header", then 
deal with it appropriately on remove_spamassassin_markup() calls, so that you can do "rewrite_header 
Subject foo" and have it work, fine, I can go with that.  I'm, however, -1 on doing it without having the 
logic in there to strip the added header back out.
Comment 7 Dallas Engelken 2004-07-15 12:31:22 UTC
Created attachment 2133 [details]
try #2

> However I'm -1 on the side-effect of adding a Subject: header for
> spam if one doesn't already exist, unless "rewrite_header Subject"
> is in the config.

i agree with that.  if i had a client that was not dependent on the Subject
header for filtering, i'd like to be able to filter messages that did not have
a subject: header at all.  the attached diff just adds a check to see if
"rewrite_header Subject" exists before adding a Subject header.

d
Comment 8 Arjen de Korte 2004-07-27 08:39:40 UTC
*** Bug 3618 has been marked as a duplicate of this bug. ***
Comment 9 Michael Parker 2004-09-23 07:09:22 UTC
*** Bug 3809 has been marked as a duplicate of this bug. ***
Comment 10 Michael Bell 2004-09-29 12:04:16 UTC
Weighing in with my tiny noggin:

I agree with Dallas and Justin. Several of my customers have mentioned this and 
say (though who knows) this is new behavior with SA 3. It causes them much 
sadness that they cannot filter/rearrange such messages easily. 

I do understand your concern about not being able strip the added subject out 
in these cases though, as it becomes confusing as to the learning process etc, 
as to whether that subject originally existed or not. Since a subject header 
string can consist of more than constants, it's not just a matter of removing 
those. (and that would be kludgy anyway)

The added X-header to serve as a status flag would work, but indeed be all too 
easily forged.


Comment 11 Theo Van Dinter 2004-10-04 13:38:51 UTC
*** Bug 3867 has been marked as a duplicate of this bug. ***
Comment 12 Daryl C. W. O'Shea 2004-10-24 03:15:12 UTC
From Theo Van Dinter  2004-07-15 12:29
> ie: we use "X-Spam-Prev-Header:" to specify the original header content, and
> upon stripping the markup, we simply delete "Header" and rename
> "X-Spam-Prev-Header" appropriately.
>
> so then I thought about just adding "X-Spam-Create-Header", or something, and
> when we see that during the strip process, just delete it and "Header".  but
> then I figured that was too easy to fake in a message.
>
> but in conversing more about it, we can't protect the faking thing anyway
> using 'Prev' and such, so why do I worry about 'Create'?

As long as the header is uniqued to something the sender doesn't know about it
can't be faked.

X-Spam-_HOSTNAME_-Created-Header
X-Spam-_HOSTNAME_-Prev-Header

To determine if they were added by the recipients invocation of SpamAssassin you
would compare the headers to the hostname found in X-Spam-Checker-Version.

Of course that only works if the X-Spam-Check-Version header exists, which it
may not when called from certain milters, etc.  In that case we're out of luck
anyway, since we don't know what is going to be added.

If the SpamAssassin call, via a milter, sees the received header added by the
server that it is running on, we can use the digits from that received header's
date (with or without the time zone) joined with a dot or dash to a count of the
number of headers currently in the message and use that as the unique token in
place of _HOSTNAME_ above.  If we can't see the local system's received header
we can't do any of the following.

eg. currently 15 received headers are in the message and the LOCAL SERVER's
received header date is Sun, 24 Oct 2004 05:56:24 -0400

X-Spam-242004055624-15-Created-Header
X-Spam-242004055624-15-Prev-Header

To prevent forgery by adding these headers based on an earlier received header
we add one additional uniqued header to messages that already have these faked
unique headers.  We used the same scheme to generate the unique number.

eg. Message comes in with the following header (which match the date in the 14th
header):

X-Spam-242004055021-14-Created-Header

we add:

X-Spam-242004055624-15-Unique-Header
       dddddddddddd-dd


To remove markup later we do the following:

1. Scan the message for (many) X-Spam-dddddddddddd-dd-Unique-Header.
   Sort them in descending order by the number of headers (last -dd).
   The "unique-header" with the highest -dd value that is valid (against that
   received headers date wins.  From now on we only pay attention to
   "unique-headers" with the same -dd value.

2. Remove this "unique-header" from the message.  We added it.  Leave the rest.

3. Look for other "unique-headers" matching the -dd value.
   Reverse whatever action that header signifies.  Remove it, we added it.

In the easier case where there were no existing fake "unique-headers" when we
first scanned the message, we can assume that all "unique-headers" are valid and
act on the accordingly, then remove them.  There will be no header as seen in #1
above since we don't bother adding it since it wouldn't be useful.
Comment 13 Daryl C. W. O'Shea 2004-10-24 05:03:07 UTC
It just occured to me for about the third time that if there isn't an
X-Spam-Checker-Version header than there likely isn't going to be any other SA
headers, at least for a default config.

So the easy _HOSTNAME_ way should do it...

X-Spam-_HOSTNAME_-Created-Header
X-Spam-_HOSTNAME_-Prev-Header

In the probably rare event that X-Spam-Checker-Version is missing but there is
other SA markup the above two headers can be dealt with appropriately under the
assumption that they aren't forged.

If need be the _HOSTNAME_ could be checked against the hostname of the system
spamassassin -d is run on, against internal_networks, or a hostname could be
accepted after on the comand line with -d, such as spamassassin -d hostname
Comment 14 Troy Engel 2004-10-26 09:14:13 UTC
Why not just add a new config variable to control the addition of the Subject:
header when one is missing?

  rewrite_header      Subject [*SPAM*]
  rewrite_header_add  Subject 1
  report_safe         0

That way the site admin can control the behaviour of a missing header. Having
just upgraded our company from 2.64 to 3.01 last night, I'm already facing this
new behaviour if missing subjects and having to explain it. Highly undesireable
- if I set that config 'rewrite_header Subject', by god it better get added. Do
not try to out-think your site admins, sometimes they know what they want and
what their internal users need. ;)

Theo - the argument that the user should get a better MUA does not hold water -
it's the job of programmers (sic) to make the lives of the end user easier, not
the end user pleasing the programmer because of their ideals. At the end of the
day, we're trying to catch spam - both technically and from a user perspective.
The user won't be giving up their beloved Outlook Expresss (or whatever) to
please SpamAssassin, that's for sure. We must adapt to the needs.
Comment 15 Theo Van Dinter 2004-10-26 10:07:22 UTC
Subject: Re:  rewrite_header will not rewrite a missing header

On Tue, Oct 26, 2004 at 09:14:14AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> Theo - the argument that the user should get a better MUA does not hold water -

If you read the discussion comments, you'll notice that I never said they
should change MUAs.  What I said (comment #4) was that we shouldn't add
previously non-existing message headers just because the end user's MUA
can only filter on "Subject".


As far as SA is concerned, I'm somewhere between:

a) we should not go adding standard headers (such as Subject) if they don't
   already exist.  Can we even do this with things like DomainKeys?

and

b) we should rewrite all of the headers -- add the Subject if necessary, fix
   non-RFC compliant Content-Type headers, etc, etc.  basically make the
   output from SA less likely to cause issues "downstream".

Comment 16 Justin Mason 2004-10-26 11:36:48 UTC
I suggest we fix this for 3.0.2.   It's becoming a very big FAQ, and, while it
makes for some elegance in the code, it's entirely nonintuitive behaviour.

also: the paranoia about a spammer adding their own X-Spam-Created-Header:
header seems a bit OTT.  what's the danger here?

   - spammer sends spam
   - SA marks it as spam, message gets filed to user's "spam" folder
   - user eventually, for some reason, decides to "spamassassin -d" the message
   - output may be != original spam

1. that doesn't help the spammer get past SA in any way.
2. we can avoid this by removing X-Spam-Created-Header headers at rewrite time,
same as we should be doing with X-Spam-Prev headers.
3. we're *not* doing that with X-Spam-Prev headers *anyway*, so it appears it
really hasn't been a big deal so far and we shouldn't be worrying so much about
it in this case ;)


oh, and on the a/b thing:  I'm for (b) -- 'we should rewrite all of the headers
-- add the Subject if necessary, fix non-RFC compliant Content-Type headers,
etc, etc.  basically make the output from SA less likely to cause issues
"downstream".'   I wouldn't want to see a deliberately corrupted message cause
MUA crashes, compromises, exploits etc. when we might be able to easily avoid it.

in fact, earlier versions of SA (iirc) used to defang "Return-Receipt-To" and
similar headers, but 3.0.0 doesn't any more.... which probably is not a good thing.
Comment 17 Theo Van Dinter 2004-10-26 12:06:56 UTC
Subject: Re:  rewrite_header will not rewrite a missing header

On Tue, Oct 26, 2004 at 11:36:49AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> I suggest we fix this for 3.0.2.   It's becoming a very big FAQ, and, while it
> makes for some elegance in the code, it's entirely nonintuitive behaviour.

I don't really care as long as we consciously make the decision that "cat msg |
spamassassin | spamassassin -d > msg2" will result in msg2 being different
than msg, and that behavior is ok.

If we want to go that way, just add the Subject header.  It'll be the same as
the rest of option (b) which I mentioned earlier.

The main problem with (b), of course, is that "mass-check" will get
different results than "spamassassin".  For instance MISSING_SUBJECT
will trigger on the original, but not post rewrite.

Comment 18 Justin Mason 2004-10-26 12:30:52 UTC
'I don't really care as long as we consciously make the decision that "cat msg |
spamassassin | spamassassin -d > msg2" will result in msg2 being different
than msg, and that behavior is ok.'

wouldn't the X-Spam-Created-Header idea avoid that though?
Comment 19 Theo Van Dinter 2004-10-26 12:45:47 UTC
Subject: Re:  rewrite_header will not rewrite a missing header

On Tue, Oct 26, 2004 at 12:30:53PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> wouldn't the X-Spam-Created-Header idea avoid that though?

This gets back to the a/b thing.  If we're going to do b, there's no way we're
going to be able to get the original back.

I'd really like to avoid adding another kluge into the code.

Comment 20 Justin Mason 2004-10-26 12:58:10 UTC
y'see, IMO in this case it's just a kluge when viewed in terms of "code
elegance".    it's an extremely nonintuitive misfeature as it stands, and that
needs to be fixed.   brokenness > elegance ;)
Comment 21 Theo Van Dinter 2004-10-26 13:21:26 UTC
Subject: Re:  rewrite_header will not rewrite a missing header

On Tue, Oct 26, 2004 at 12:58:11PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> y'see, IMO in this case it's just a kluge when viewed in terms of "code
> elegance".    it's an extremely nonintuitive misfeature as it stands, and that
> needs to be fixed.   brokenness > elegance ;)

I guess it depends if you think the current behavior is actually broken
or not.  As far as I'm concerned, the code works 100% as it's designed
to do -- there's nothing broken there.  The issue at hand is how to go
about adding (back) in a feature.  That has everything to do with not
kluging up the code.

Just so we're clear here: I'm not against making the change, I'm against
making the change in the wrong way.

Comment 22 Justin Mason 2004-10-26 13:33:35 UTC
ok, maybe the code *does* work as it's designed to do.  in that case, the design
is broken and needs fixing. ;)
Comment 23 Michael Parker 2004-10-26 13:41:18 UTC
Subject: Re:  rewrite_header will not rewrite a missing header

I think we should add a config option for this, I don't care what the
default is, but I'd rather not add subject headers.

Michael
Comment 24 Justin Mason 2004-10-26 14:37:06 UTC
I'd prefer to just change the semantics of rewrite_header from:

  "rewrite an existing header"

to

  "rewrite an existing header, or add one if one does not exist".

if necessary, we can add a new rewrite_header-like setting that uses the
"rewrite only existing header" semantics.  I think this would conform more
closely with backwards-compatible behaviour and with the Principle of Least
Surprise -- http://en.wikipedia.org/wiki/Principle_of_least_astonishment ,
http://www.faqs.org/docs/artu/ch11s01.html .
Comment 25 Sidney Markowitz 2004-10-26 15:00:34 UTC
What about the discussion in bug 3816 ?

I'm confused having the discussion of this issue in two places.

Over there, didn't we end up with the only problem being one with backward
compatibility in the minor case in which someone wanted to rewrite the Subject
but chooses not to have a prev header added? We can solve that by saying we are
not backward compatible in that one case.

The result is that if someone configures rewrite_header Subject ***SPAM*** they
are able to filter spam based on the Subject header, and if someone wants to do
that and get reasonable results out of mass-check they make sure that they do
make use of the prev header option.

See Comment 11 in bug 3816.
Comment 26 Justin Mason 2004-10-26 15:15:13 UTC
agh, same issue discussed in two places!  what a mess!

yes. let's go with what was agreed in 3816, then; I'm happy with that.  and
let's close this bug as a DUP.
Comment 27 Daniel Quinlan 2004-10-26 15:24:16 UTC

*** This bug has been marked as a duplicate of 3816 ***
Comment 28 Daniel Quinlan 2004-12-03 16:39:50 UTC
*** Bug 4018 has been marked as a duplicate of this bug. ***