Bug 5250 - 8-bit messages corrupted by rewrite_mail()
Summary: 8-bit messages corrupted by rewrite_mail()
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: spamassassin (show other bugs)
Version: SVN Trunk (Latest Devel Version)
Hardware: PC Windows 2000
: P5 major
Target Milestone: 3.2.0
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
: 4363 (view as bug list)
Depends on: 4363
Blocks:
  Show dependency tree
 
Reported: 2006-12-18 22:54 UTC by Nikolay V. Buslaev
Modified: 2007-01-23 04:56 UTC (History)
0 users



Attachment Type Modified Status Actions Submitter/CLA Status
correct this bug patch None Nikolay V. Buslaev [NoCLA]
sample message, with incorrect behaviour application/octet-stream None Nikolay V. Buslaev [NoCLA]

Note You need to log in before you can comment on or make changes to this bug.
Description Nikolay V. Buslaev 2006-12-18 22:54:13 UTC
patch #3204 for bug ยน4363 work incorrect with 8-bit messages (MS-TNEF for 
example).
When You change \r?\n in full message - you can replace single \n to \r\n in 
message body.
Comment 1 Nikolay V. Buslaev 2006-12-18 22:55:28 UTC
Created attachment 3793 [details]
correct this bug
Comment 2 Theo Van Dinter 2006-12-30 19:21:22 UTC
Can you provide more information about what the issue is, a sample message
showing the problem, etc?

Thanks.
Comment 3 Nikolay V. Buslaev 2006-12-31 22:42:10 UTC
Created attachment 3804 [details]
sample message, with incorrect behaviour

sample message, with incorrect behaviour
for examine it - run spamassassin on win32 platform with active perl
Comment 4 Nikolay V. Buslaev 2007-01-15 22:49:13 UTC
how long time need for include this fix?
Comment 5 Justin Mason 2007-01-16 02:57:10 UTC
*** Bug 4363 has been marked as a duplicate of this bug. ***
Comment 6 Justin Mason 2007-01-16 03:11:17 UTC
-1 on that patch...

+		my $msghead = $`;
+		my $msgbody = $';

we cannot use $` and $' in SA -- use of them imposes a speed penalty on the
entire perl interpreter.

however, you're right -- we need to fix a bug here.

The test message contains 8-bit data using Content-Transfer-Encoding: 8bit .
This is a valid MIME message (arbitrary 8-bit data is fine according to rfcs
2045 and 3030).  This is damaged by SpamAssassin due to the s/\r?\n/\r\n/
substitution.

I think your patch is on the right track (replacing \r?\n in the headers only,
rather than in both hdrs+body) -- but it needs to be redone without $` and $'.
Comment 7 Justin Mason 2007-01-22 14:06:21 UTC
ok, fixed... I used bits of your patch ;) just without the $' and $` usage.
Also, report_safe rewriting needed to use the right line endings -- so
implemented that, too.

Committed revision 498825.
Comment 8 Theo Van Dinter 2007-01-22 14:30:23 UTC
I'm not so thrilled with the submitted patch...

There is no guaranteed blank line between the header and the body, so the regexp
isn't necessarily going to do the right thing.

Since the lower down rewrite functions are supposed to be private
(rewrite_mail() is the standard API), we could just change them to return the
header/body separately and then do the appropriate thing.

There's also the question of what to do with report_safe body content since we
do want, imo, the line endings changed there.  Perhaps we'd want it to return
the components, set the appropriate areas to be line ending friendly, then
reassemble...
Comment 9 Justin Mason 2007-01-23 02:10:39 UTC
(In reply to comment #8)
> I'm not so thrilled with the submitted patch...
> 
> There is no guaranteed blank line between the header and the body, so the regexp
> isn't necessarily going to do the right thing.

ok, good point.  I'll look into that.

> Since the lower down rewrite functions are supposed to be private
> (rewrite_mail() is the standard API), we could just change them to return the
> header/body separately and then do the appropriate thing.

true.  I wanted to avoid peppering that code with s// statements, and just have
"one big fixup" before return -- but maybe it would be better to do some
peppering instead.

> There's also the question of what to do with report_safe body content since we
> do want, imo, the line endings changed there.  Perhaps we'd want it to return
> the components, set the appropriate areas to be line ending friendly, then
> reassemble...

that should be fixed already in the checked-in code; it fixes line endings
before inserting the (untouched) original-message message/rfc822 part.
Comment 10 Justin Mason 2007-01-23 04:56:01 UTC
ok, how's the latest checkin look?  I think it addresses your concerns...

: jm 146...; svn commit -m "bug 5250: previous fix didn't deal with messages
with no header/body separator; also, this way is more efficient, by pushing the
header-line-ending encoding nearer to point of insertion in the
rewrite_report_safe() and rewrite_no_report_safe() methods"
lib/Mail/SpamAssassin/PerMsgStatus.pm
Sending        lib/Mail/SpamAssassin/PerMsgStatus.pm
Transmitting file data .
Committed revision 499009.