Bug 8105 - DecodeShortURLs fails to follow redirect chains with differing query parameters
Status: NEW
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Plugins
Version: 4.0.0
Hardware: PC Linux
Importance: P2 normal
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-01-13 15:01 UTC by Christer Mjellem Strand
Modified: 2023-01-20 13:47 UTC

Description Christer Mjellem Strand 2023-01-13 15:01:53 UTC
When DecodeShortURLs receives a Location header that differs from the requested URL only[1] in its query parameters, it does not appear to follow it, even if the source/target domain is defined as a url_shortener. I'm guessing it compares URLs to prevent an infinite loop, but strips away things like the query string before comparing. Sometimes a good idea, other times not.

Consider this example, pulled from a spam message:

https://rb.gy/qcigz2

As of writing, this returns a location header of:

http://rb.gy/qcigz2?rb.routing.mode=proxy&rb.routing.signature=974538

Same URL, different protocol[1], different (added) query parameters.

If it were to follow this redirect, it would get this location header back:

https://rebrandly.info/stop

Which, in turn, could be used in URL_SHORTENER_DISABLED[2].
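
To illustrate the guess above, here is a minimal Python sketch of loop detection over the rb.gy example. It is not the plugin's actual code: the function names, the mocked LOCATIONS table (no network requests are made), and both comparison keys are assumptions made purely for illustration. The query-stripping key also ignores the scheme, so it does not model the protocol-only case from [1].

```python
from urllib.parse import urlsplit

START = "https://rb.gy/qcigz2"
PROXY = "http://rb.gy/qcigz2?rb.routing.mode=proxy&rb.routing.signature=974538"
FINAL = "https://rebrandly.info/stop"

# Mocked Location headers taken from the example in this report.
LOCATIONS = {START: PROXY, PROXY: FINAL}


def strip_query(url):
    """Hypothetical loop-detection key: host and path only."""
    parts = urlsplit(url)
    return (parts.netloc.lower(), parts.path)


def full_url(url):
    """Hypothetical loop-detection key: the complete URL."""
    return url


def follow_chain(start, locations, key, max_hops=10):
    """Follow mocked redirects until a repeat, a dead end, or max_hops."""
    seen = {key(start)}
    chain = [start]
    current = start
    for _ in range(max_hops):
        target = locations.get(current)
        if target is None or key(target) in seen:
            break  # no further redirect, or the key says "already visited"
        seen.add(key(target))
        chain.append(target)
        current = target
    return chain


# With the query stripped, the first hop looks like a loop and is dropped.
print(follow_chain(START, LOCATIONS, strip_query))
# With the full URL as key, the chain reaches the final target.
print(follow_chain(START, LOCATIONS, full_url))
```

Under these assumptions the first call stops at the starting URL, while the second walks all three URLs and ends at https://rebrandly.info/stop.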


[1] I say "only" even though in this example the protocol also differs. If *only* the protocol differs, however, it does seem to follow the chain.
Consider this:

http://www.shorturl.at/BLOY9

Which redirects to https before redirecting to target. Assuming www.shorturl.at is defined as a url_shortener[3], it *will* follow the redirect chain to the same URL with https[4], and then further.

[2] I guess this (using the final target of the chain in URL_SHORTENER_DISABLED) is another suggestion for improvement, which could be added, but I don't wanna file individual reports for all these tiny things.

[3] Which it should be, because even though the shorturl.at service provides shortened URLs as shorturl.at - which is in the list - it first redirects every URL to www.shorturl.at (and https) - which isn't. But that's (also?) a different bug, and I'm already working on a larger list update, pulled from numerous sources.

[4] Incidentally, this will also trigger short_url_chained (and in turn an assigned score), which perhaps it shouldn't when redirecting to the same (base) domain. Might be worthy of another report, I suppose.
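
As a rough sketch of the idea in [4], a hop could be counted as chained only when the host actually changes. The Python below is an assumption-laden illustration, not the plugin's logic: base_host and is_cross_host_hop are hypothetical names, and stripping a leading "www." is a crude stand-in for a proper registered-domain lookup.

```python
from urllib.parse import urlsplit


def base_host(url):
    """Lower-cased host with a leading 'www.' removed (crude approximation)."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_cross_host_hop(src, dst):
    """True when a redirect leaves the (approximate) base domain."""
    return base_host(src) != base_host(dst)


# http -> https on the same host: not a chain hop under this rule.
print(is_cross_host_hop("http://www.shorturl.at/BLOY9",
                        "https://www.shorturl.at/BLOY9"))
# One shortener handing off to a different domain: a chain hop.
print(is_cross_host_hop("https://rb.gy/qcigz2",
                        "https://rebrandly.info/stop"))
```

With a rule like this, the http-to-https hop on www.shorturl.at would not count towards short_url_chained, while a hop from rb.gy to rebrandly.info still would.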
Comment 1 Christer Mjellem Strand 2023-01-20 13:47:21 UTC
I'm not sure I can reproduce the original issue in this report anymore - it may have been related to caching, or something else. I've seen occurrences of it working in logs since reporting.

I'll leave the report open, however, as it includes various other minor tidbits which may be worth addressing.