Bug 3075 - fix remove_spamassassin_markup problem and add test
Summary: fix remove_spamassassin_markup problem and add test
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Libraries (show other bugs)
Version: SVN Trunk (Latest Devel Version)
Hardware: Other other
: P5 normal
Target Milestone: 3.0.0
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-02-22 18:01 UTC by Daniel Quinlan
Modified: 2004-02-22 11:45 UTC (History)
0 users



Attachment Type Modified Status Actions Submitter/CLA Status

Note You need to log in before you can comment on or make changes to this bug.
Description Daniel Quinlan 2004-02-22 18:01:32 UTC
Gary Funck wrote:

Was reading over remove_spamassassin_markup to see what it does. Was
wondering about this
section of code:

  foreach my $header ( keys  %{$self->{conf}->{rewrite_header}} ) {
    dbg ("Removing markup in $header");
    if ($header eq 'Subject') {
      my $tag = $self->{conf}->{rewrite_header}->{'Subject'};
      $tag = quotemeta($tag);
      $tag =~ s/_HITS_/\\d{2}\\.\\d{2}/g;
      $tag =~ s/_SCORE_/\\d{2}\\.\\d{2}/g;
      $tag =~ s/_REQD_/\\d{2}\\.\\d{2}/g;
      1 while $hdrs =~ s/^Subject: ${tag} /Subject: /gm;
    } else {
      $hdrs =~ s/^(${header}: .*?)\t\([^)]\)$/$1/gm;
    }
  }

In particular, this line:
      $hdrs =~ s/^(${header}: .*?)\t\([^)]\)$/$1/gm;
It looks like it is processing the headers designated for
rewrite and removing <tab>(...) at the end of the line,
where ... means anything inside the parens?  But I think the
pattern needs iteration also: [^)]*?  perhaps?

----------------------------------------

Theo Van Dinter replied:

hrm.  yeah, I would think + would be required there (I think you're
guaranteed to have at least 1 char, although I suppose you could use
_STARS_ which could be 0...)  Ok, so "*", yeah.

Sounds like we need a regression test for rewriting a non-Subject line
(iirc, from and to are the only other ones allowed)...
Comment 1 Duncan Findlay 2004-02-22 20:43:01 UTC
Hmmm... I think I wrote that line (or at least committed it). You're right. Fixed.
Comment 2 Duncan Findlay 2004-02-22 20:45:25 UTC
Fixed other than the regression test. I don't know if we need it, but it won't
hurt. AFAICT it never worked -- I guess I didn't actually test that part of it :-(
Comment 3 Theo Van Dinter 2004-02-22 20:49:38 UTC
Subject: Re:  fix remove_spamassassin_markup problem and add test

On Sun, Feb 22, 2004 at 08:45:26PM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> Fixed other than the regression test. I don't know if we need it, but it won't
> hurt. AFAICT it never worked -- I guess I didn't actually test that part of it :-(

Hence why a regression test would be good for it. ;)

Comment 4 Gary Funck 2004-02-23 15:14:50 UTC
Subject: RE:  fix remove_spamassassin_markup problem and add test


An additional note/suggestion: have SA *always* rename the original
headers into X-Spam-Orig-Subject, etc. It does this already with
Content-Type headers. Anyway, by doing this, SA is no longer subject
to the vagueries of the current configuration set up (which headers
are rewritten, which tags are referecned), and a simple-minded procmail
script (which I happen to have), or some such, can simply do header renames
to get back the originals -- always. Thus, the process of de-marking-up
becomes: (1) unwrap the embedded original if present, or (2) rename headers.
Well, there is that other report format which is not encoded as a
a multipart part attachment -- that one can generally be found and
recovered,
but it'd be nice (ie, general and uniform in approach) if only the
'safe_copy' form was used when a report is to be added to the message.


Comment 5 Justin Mason 2004-02-23 15:38:48 UTC
Subject: Re:  fix remove_spamassassin_markup problem and add test 

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


bugzilla-daemon@bugzilla.spamassassin.org writes:
> An additional note/suggestion: have SA *always* rename the original
> headers into X-Spam-Orig-Subject, etc. It does this already with
> Content-Type headers. Anyway, by doing this, SA is no longer subject
> to the vagueries of the current configuration set up (which headers
> are rewritten, which tags are referecned), and a simple-minded procmail
> script (which I happen to have), or some such, can simply do header renames
> to get back the originals -- always. Thus, the process of de-marking-up
> becomes: (1) unwrap the embedded original if present, or (2) rename headers.
> Well, there is that other report format which is not encoded as a
> a multipart part attachment -- that one can generally be found and
> recovered,
> but it'd be nice (ie, general and uniform in approach) if only the
> 'safe_copy' form was used when a report is to be added to the message.

Isn't this already the case?  If not, +1, it should be.

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFAOo8SQTcbUG5Y7woRAvUPAKDHaQK11SCbRM1Lt/w+tZaHiohuJwCgm4yY
ytjB5/axaVuYJr8VaCmjGBs=
=JANx
-----END PGP SIGNATURE-----

Comment 6 Daniel Quinlan 2004-02-23 15:49:35 UTC
Subject: Re:  fix remove_spamassassin_markup problem and add test

> An additional note/suggestion: have SA *always* rename the original
> headers into X-Spam-Orig-Subject, etc. It does this already with
> Content-Type headers. Anyway, by doing this, SA is no longer subject
> to the vagueries of the current configuration set up (which headers
> are rewritten, which tags are referecned), and a simple-minded
> procmail script (which I happen to have), or some such, can simply do
> header renames to get back the originals -- always. Thus, the process
> of de-marking-up becomes: (1) unwrap the embedded original if present,
> or (2) rename headers.  Well, there is that other report format which
> is not encoded as a a multipart part attachment -- that one can
> generally be found and recovered, but it'd be nice (ie, general and
> uniform in approach) if only the 'safe_copy' form was used when a
> report is to be added to the message.

> Isn't this already the case?  If not, +1, it should be.

-1

This is why we have report_safe, all of the pristine headers are
included in the original message attachment.  If you don't use
report_safe, then just avoid changing the Subject.

X-Spam-Orig headers do not guarantee an original message at all: header
order can change, header spacing can change, you have to worry about
incoming messages that may already have X-Spam headers, etc.

Daniel

Comment 7 Theo Van Dinter 2004-02-23 15:50:49 UTC
Subject: Re:  fix remove_spamassassin_markup problem and add test

On Mon, Feb 23, 2004 at 03:38:49PM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> > headers into X-Spam-Orig-Subject, etc. It does this already with
> > Content-Type headers. Anyway, by doing this, SA is no longer subject

Actually we haven't done X-Spam-Prev-Content-Type, etc, since 2.4x.

> Isn't this already the case?  If not, +1, it should be.

I can see X-Spam-Prev-Subject, etc, being used though.  +1 to that as well. :)

Comment 8 Duncan Findlay 2004-02-23 16:08:20 UTC
Subject: Re:  fix remove_spamassassin_markup problem and add test

-0.314159

That's an ugly hack.