SA Bugzilla – Bug 7278
redirector_pattern - reverse order so hardcoded check done last
Last modified: 2015-12-21 16:32:40 UTC
Created attachment 5360 [details] Diff file redirector_pattern m{^https?://www\.google\.com/url\?q=([^&]+)}i Attached example p1 - detects "baddomain.com" and adds to URI list. Attached example p2 does not detect the domain. ---8<--- On 18/12/15 14:23, Mark Martinec wrote: Mail::SpamAssassin::Util::uri_list_canonicalize() near line 1346: # deal with http redirectors. strip off one level of redirector # and add back to the array. the foreach loop will go over those # and deal appropriately. # bug 3308: redirectors like yahoo only need one '/' ... <grrr> if ($rest =~ m{(https?:/{0,2}.+)$}i) { push(@uris, $1); } # resort to redirector pattern matching if the generic https? check # doesn't result in a match -- bug 4176 else { foreach (@{$redirector_patterns}) { Note that it tries the hard-coded check first, and skips evaluating redirector patterns when the hard-coded match was successful. So your redirector pattern was not tried at all, and at the same time the hard-coded check obtained an invalid URL: from an URL "/url?q=http://baddomain.com&t=1" it collected as "http://baddomain.com&t=1" instead of "http://baddomain.com" The URI syntax after a '?' should treat "&t=1" as a second argument and not glue it with the first argument "http://baddomain.com". So the resulting invalid URL "http://baddomain.com&t=1" does not match any URI rules for baddomain.com . Please try the attached patch (to 3.4.1 or trunk), it reverses the order of checks: tries redirector_patterns first, and only falls back to a sloppy hard-coded check as a last resort. (and that hard-coded check would better be fixed too) ---8<--- Attached diff seems to fix issue
Created attachment 5361 [details] Example 1
Created attachment 5362 [details] Example 2
Created attachment 5363 [details] Proposed patch: try redirector_patterns before falling back to a hard-coded default The patch reverses the order of searching for an URL target withing a redirector URL: try redirector_patterns before falling back to a hard-coded default. 3.4: Sending lib/Mail/SpamAssassin/Util.pm Committed revision 1720872. trunk: Sending lib/Mail/SpamAssassin/Util.pm Committed revision 1720873. The sloppy parsing of URI query parameters in a fallback check is still unchanged, and should eventually be improved.
looks like a great logic fix to me.