Bug 8138 - URIDetail ends spamassassin run on specific anchor text
Summary: URIDetail ends spamassassin run on specific anchor text
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Plugins
Version: 4.0.0
Hardware: All All
Importance: P2 normal
Target Milestone: 4.0.1
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-06-07 16:36 UTC by Wolfgang Breyha
Modified: 2023-06-08 07:14 UTC

Attachment Type Modified Status Actions Submitter/CLA Status
SPAM EML causing the bug message/rfc822 None Wolfgang Breyha [NoCLA]

Description Wolfgang Breyha 2023-06-07 16:36:37 UTC
Created attachment 5890 [details]
SPAM EML causing the bug

I tried to catch some spam using URIDetail with a rule like:
uri_detail  __ZID_DHL_FAKELAUF  text =~ /confirm/ domain !~ /dhl/

This is already a simplified version.

If I run 
spamassassin -D < <attached.eml> 2>&1
I see that the debug output ends exactly with
... dbg: uri: running __ZID_DHL_FAKELAUF
followed by the mail itself, with no result at all.

The URI in question contains:
<a href="httxx://bad.tld/co">&bull; Click here to confirm sending the shipment</a>

If I remove the "&bull;" from the anchor text, the rule works as expected and spamassassin is able to finish the run.

I tried to add some debug output, but it seems that merely accessing the variable $text in the for() loop at URIDetail.pm:206 ends the run.
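
The steps above could be scripted roughly as follows. This is only a minimal sketch, not the attached sample: the message headers, the addresses, and the bad.example domain are made up for illustration, and whether such a stripped-down mail triggers the same behaviour is untested. The -D option and the rule are taken from this report; -L, -t and --cf are standard options of the spamassassin command-line tool.

```shell
#!/bin/sh
# Hypothetical reproduction sketch; file names and mail contents are
# assumptions, not the attachment from this report.
set -eu

workdir=$(mktemp -d)
msg="$workdir/repro.eml"

# Minimal HTML mail whose anchor text starts with the &bull; entity.
cat > "$msg" <<'EOF'
From: sender@example.com
To: rcpt@example.com
Subject: shipment
MIME-Version: 1.0
Content-Type: text/html; charset=UTF-8

<html><body>
<a href="http://bad.example/co">&bull; Click here to confirm sending the shipment</a>
</body></html>
EOF

# The simplified rule from the report, passed as an extra config line.
rule='uri_detail  __ZID_DHL_FAKELAUF  text =~ /confirm/ domain !~ /dhl/'

if command -v spamassassin >/dev/null 2>&1; then
    # -D: debug output, -L: local tests only, -t: test mode.
    # Show only the URIDetail debug lines; grep finding nothing is not an error.
    spamassassin -D -L -t --cf="$rule" < "$msg" 2>&1 | grep 'uri: ' || true
else
    echo "spamassassin not found; sample written to $msg"
fi
```

On an affected version, the expectation would be that the output stops after the "uri: running __ZID_DHL_FAKELAUF" line; with the "&bull;" removed from the sample, the run should complete normally.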
Comment 1 Henrik Krohns 2023-06-08 07:14:47 UTC
It seems Perl died in this call when $match contained UTF-8:

dbg("uri: text matched: '%s' %s /%s/", $match,$op,$patt);

Logger.pm called sprintf with raw @args containing UTF-8; $message itself was decoded, but the args were not. Fixed:

Sending        trunk/lib/Mail/SpamAssassin/Logger.pm
Transmitting file data .done
Committing transaction...
Committed revision 1910293.