Bug 7278 - redirector_pattern - reverse order so hardcoded check done last
Summary: redirector_pattern - reverse order so hardcoded check done last
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Libraries (show other bugs)
Version: 3.4.1
Hardware: PC All
: P2 normal
Target Milestone: 3.4.2
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-12-18 21:23 UTC by Paul Stead
Modified: 2015-12-21 16:32 UTC (History)
2 users (show)



Attachment Type Modified Status Actions Submitter/CLA Status
Diff file patch None Paul Stead [HasCLA]
Example 1 message/rfc822 None Paul Stead [HasCLA]
Example 2 message/rfc822 None Paul Stead [HasCLA]
Proposed patch: try redirector_patterns before falling back to a hard-coded default patch None Mark Martinec [HasCLA]

Note You need to log in before you can comment on or make changes to this bug.
Description Paul Stead 2015-12-18 21:23:27 UTC
Created attachment 5360 [details]
Diff file

redirector_pattern m{^https?://www\.google\.com/url\?q=([^&]+)}i 

Attached example p1 - detects "baddomain.com" and adds to URI list. Attached example p2 does not detect the domain.

---8<---

On 18/12/15 14:23, Mark Martinec wrote:

Mail::SpamAssassin::Util::uri_list_canonicalize()
near line 1346:

       # deal with http redirectors.  strip off one level of redirector
       # and add back to the array.  the foreach loop will go over those
       # and deal appropriately.
       # bug 3308: redirectors like yahoo only need one '/' ... <grrr>
       if ($rest =~ m{(https?:/{0,2}.+)$}i) {
         push(@uris, $1);
       }

       # resort to redirector pattern matching if the generic https?
check
       # doesn't result in a match -- bug 4176
       else {
        foreach (@{$redirector_patterns}) {

Note that it tries the hard-coded check first, and skips
evaluating redirector patterns when the hard-coded match
was successful.

So your redirector pattern was not tried at all, and at the same time
the hard-coded check obtained an invalid URL: from an URL
   "/url?q=http://baddomain.com&t=1"
it collected as "http://baddomain.com&t=1" instead of
"http://baddomain.com"
The URI syntax after a '?' should treat "&t=1" as a second argument
and not glue it with the first argument "http://baddomain.com".

So the resulting invalid URL "http://baddomain.com&t=1"
does not match any URI rules for baddomain.com .


Please try the attached patch (to 3.4.1 or trunk), it reverses
the order of checks: tries redirector_patterns first, and only
falls back to a sloppy hard-coded check as a last resort.
(and that hard-coded check would better be fixed too)

---8<---

Attached diff seems to fix issue
Comment 1 Paul Stead 2015-12-18 21:23:53 UTC
Created attachment 5361 [details]
Example 1
Comment 2 Paul Stead 2015-12-18 21:24:06 UTC
Created attachment 5362 [details]
Example 2
Comment 3 Mark Martinec 2015-12-19 00:52:10 UTC
Created attachment 5363 [details]
Proposed patch: try redirector_patterns before falling back to a hard-coded default

The patch reverses the order of searching for an URL target
withing a redirector URL: try redirector_patterns before
falling back to a hard-coded default.

3.4:
  Sending lib/Mail/SpamAssassin/Util.pm
Committed revision 1720872.

trunk:
  Sending lib/Mail/SpamAssassin/Util.pm
Committed revision 1720873.


The sloppy parsing of URI query parameters in a fallback check
is still unchanged, and should eventually be improved.
Comment 4 Kevin A. McGrail 2015-12-21 16:32:40 UTC
looks like a great logic fix to me.