Bug 7834 - get_envelope_from may return junk with multiple return-path headers
Status: NEW
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Libraries
Version: 3.4 SVN branch
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
Reported: 2020-07-06 02:19 UTC by Rob Mosher
Modified: 2020-07-09 10:58 UTC

Description Rob Mosher 2020-07-06 02:19:15 UTC
I ran into an issue with multiple Return-Path headers.  When a message contains more than one, get_envelope_from may return a mashed-up concatenation of their values instead of a single address.

I think a simple fix may be changing this line:

https://github.com/apache/spamassassin/blob/3.4/lib/Mail/SpamAssassin/PerMsgStatus.pm#L3047
  $envf =~ s/>*\s*\z//s;        # remove >, whitespace, newlines

To this:

  $envf =~ s/[>\015\012].*//s;        # remove > and trailing data

This simply discards everything from the first > (or CR / LF) onward, so only the first address survives.
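
A minimal sketch of how the two substitutions differ. It assumes the value being cleaned holds both Return-Path values, one per line; the input strings and the helpers clean_current and clean_proposed are hypothetical, and the leading-< removal is an assumption about the surrounding code rather than a quote from it.

```perl
use strict;
use warnings;

# Assumption: with two Return-Path headers, the looked-up value contains
# both addresses separated by newlines.
my $one_header  = "<user\@domain.com>\n";
my $two_headers = "<user\@domain.com>\n<user\@domain.com>\n";

# Hypothetical helper using the substitution currently in the code.
sub clean_current {
  my ($envf) = @_;
  $envf =~ s/^<*//s;            # assumed: strip the leading <
  $envf =~ s/>*\s*\z//s;        # remove >, whitespace, newlines (end of string only)
  return $envf;
}

# Hypothetical helper using the proposed substitution.
sub clean_proposed {
  my ($envf) = @_;
  $envf =~ s/^<*//s;            # assumed: strip the leading <
  $envf =~ s/[>\015\012].*//s;  # remove > and trailing data
  return $envf;
}

# Make embedded newlines visible when printing.
sub show {
  my ($s) = @_;
  $s =~ s/\n/\\n/g;
  return $s;
}

for my $input ($one_header, $two_headers) {
  printf "input=[%s] current=[%s] proposed=[%s]\n",
    show($input), show(clean_current($input)), show(clean_proposed($input));
}
```

Because the current pattern is anchored with \z, it only trims the tail of the whole string and leaves the first > and the second < in the middle. The proposed pattern cuts at the first > or line break instead.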

This is the behavior I was seeing without the patch:

    $sender = $scanner->get("EnvelopeFrom:addr");
    dbg("spf: pms:from:addr " . $sender);
    dbg("spf: pms:from:raw " . $scanner->get("EnvelopeFrom:raw"));
    dbg("spf: pms:from " . $scanner->get("EnvelopeFrom"));
    dbg("spf: pms:get:rp " . $scanner->get("Return-Path"));

$ grep Return-Path: mail2.txt
Return-Path: <user@domain.com>
Return-Path: <user@domain.com>

dbg: spf: pms:from:addr user@domain.com> <user@domain.com
dbg: spf: pms:from:raw user@domain.com>
dbg: spf: pms:from user@domain.com>
dbg: spf: pms:get:rp <user@domain.com>

But with only one Return-Path:
dbg: spf: pms:from:addr user@domain.com
dbg: spf: pms:from:raw user@domain.com
dbg: spf: pms:from user@domain.com
dbg: spf: pms:get:rp <user@domain.com>
Comment 1 Henrik Krohns 2020-07-09 08:34:20 UTC
I can't quickly reproduce it.

Please post exact version you are using, and a complete test email file.
Comment 2 Rob Mosher 2020-07-09 09:36:26 UTC
This was on 3.4.4, but I believe I saw this on 3.2 as well.

$ spamassassin -V
SpamAssassin version 3.4.4
  running on Perl version 5.26.1

I've sanitized this message a bit, but left most of the important bits in place.  Apparently something with qmail/vmailmgr is causing the return-paths to be written twice up top.  This trips up the parser as indicated.

$ cat file | spamassassin -x -t -D 2>&1 | grep 'pms:'
Jul  9 05:14:29.398 [25536] dbg: spf: pms:from:addr user@example.com> <user@example.com
Jul  9 05:14:29.398 [25536] dbg: spf: pms:from:raw user@example.com>
Jul  9 05:14:29.398 [25536] dbg: spf: pms:from user@example.com>
Jul  9 05:14:29.398 [25536] dbg: spf: pms:get:rp <user@example.com>

Return-Path: <user@example.com>
Delivered-To: vmailacct-rmosher@example.com
Return-Path: <user@example.com>
Delivered-To: vmailacct-rmosher@example.com
Received: (qmail 7982 invoked from network); 5 Jul 2020 20:47:56 -0000
Received: from mail.example.com (1.1.1.1)
  by mail2.example.com with SMTP; 5 Jul 2020 20:47:56 -0000
Received: from mail.example.com (mail.example.com [2.2.2.2])
  by example.com
  for <rmosher@example.com>; Sun, 5 Jul 2020 13:47:45 -0700
From: Sending User <user@example.com>
Subject: Quick test mail
Date: Sun, 5 Jul 2020 13:47:43 -0700
To: Rob Mosher <rmosher@example.com>



With just one return path (removed top two lines), it works fine...

$ cat file | spamassassin -x -t -D 2>&1 | grep 'pms:'
Jul  9 05:16:01.288 [25777] dbg: spf: pms:from:addr user@example.com
Jul  9 05:16:01.288 [25777] dbg: spf: pms:from:raw user@example.com
Jul  9 05:16:01.288 [25777] dbg: spf: pms:from user@example.com
Jul  9 05:16:01.288 [25777] dbg: spf: pms:get:rp <user@example.com>


However, in some cases like mailing lists or forwarders, there may be another Return-Path. In this case I'm seeing empty data returned.

Return-Path: <mailing-list-admin@example.com>
Delivered-To: vmailacct-rmosher@example.com
Received: (qmail 7982 invoked from network); 5 Jul 2020 20:47:56 -0000
Received: from mail.example.com (1.1.1.1)
  by mail2.example.com with SMTP; 5 Jul 2020 20:47:56 -0000
Return-Path: <user@example.com>
Received: from mail.example.com (mail.example.com [2.2.2.2])
        by example.com
        for <rmosher@example.com>; Sun, 5 Jul 2020 13:47:45 -0700
From: Sending User <user@example.com>
Subject: Quick test mail
Date: Sun, 5 Jul 2020 13:47:43 -0700
To: Rob Mosher <rmosher@example.com>


$ cat file | spamassassin -x -t -D 2>&1 | grep 'pms:'
Jul  9 05:17:11.015 [25866] dbg: spf: pms:from:addr
Jul  9 05:17:11.015 [25866] dbg: spf: pms:from:raw
Jul  9 05:17:11.016 [25866] dbg: spf: pms:from
Jul  9 05:17:11.016 [25866] dbg: spf: pms:get:rp <mailing-list-admin@example.com>


Specifying 'envelope_sender_header Return-Path' in config seems to fix both of these cases as that portion of code is never reached, but the bug is present when that is not specified.

Changing the regex as indicated in the original bug report fixes the issue for the first case.
  $envf =~ s/[>\015\012].*//s;        # remove > and trailing data

The second case with empty data appears related to this logic, which is never accessed if envelope_sender_header is set.

    if ($self->get("ALL") =~ /^Received:.*?^Return-Path:/smi) {
      dbg("message: Return-Path header found after 1 or more Received lines, cannot trust envelope-from");
    } else {
      goto ok;
    }
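
A small sketch of why only the forwarder-style message hits this branch. It assumes the "ALL" pseudo-header is a single string with one header per line; the helper return_path_after_received and the abbreviated header blocks are hypothetical, trimmed from the samples above.

```perl
use strict;
use warnings;

# Hypothetical helper: true when a Return-Path header appears somewhere
# after a Received header, using the same pattern as the quoted check.
sub return_path_after_received {
  my ($all_headers) = @_;
  return $all_headers =~ /^Received:.*?^Return-Path:/smi ? 1 : 0;
}

# First case: both Return-Path headers sit above every Received header.
my $duplicated_on_top = join "\n",
  'Return-Path: <user@example.com>',
  'Delivered-To: vmailacct-rmosher@example.com',
  'Return-Path: <user@example.com>',
  'Delivered-To: vmailacct-rmosher@example.com',
  'Received: (qmail 7982 invoked from network); 5 Jul 2020 20:47:56 -0000',
  'From: Sending User <user@example.com>',
  '';

# Second case: a Return-Path header follows a Received header.
my $forwarded = join "\n",
  'Return-Path: <mailing-list-admin@example.com>',
  'Delivered-To: vmailacct-rmosher@example.com',
  'Received: (qmail 7982 invoked from network); 5 Jul 2020 20:47:56 -0000',
  'Return-Path: <user@example.com>',
  'From: Sending User <user@example.com>',
  '';

printf "duplicated on top: %d\n", return_path_after_received($duplicated_on_top);
printf "forwarded:         %d\n", return_path_after_received($forwarded);
```

With /m the ^ anchors match at each line start and with /s the .*? may span lines, so the pattern matches whenever any Return-Path line comes after any Received line. That is the case only for the second sample.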
Comment 3 Henrik Krohns 2020-07-09 10:51:28 UTC
At least the first case is already fixed in trunk, backporting changes:

Sending        spamassassin-3.4/lib/Mail/SpamAssassin/Message/Metadata/Received.pm
Sending        spamassassin-3.4/lib/Mail/SpamAssassin/PerMsgStatus.pm
Transmitting file data ..done
Committing transaction...
Committed revision 1879700.

Need to look at the forwarder stuff..
Comment 4 Henrik Krohns 2020-07-09 10:58:36 UTC
(In reply to Henrik Krohns from comment #3)
>
> Need to look at the forwarder stuff..

There are probably discussions in old bugs, but what should we do about this? I think it's by design that EnvelopeFrom isn't used in forwarder situations. Should we trust the next Return-Path if it's inside trusted networks, or what?