SA Bugzilla – Full Text Bug Listing
Summary: | google translate redirector_pattern is incomplete | ||
---|---|---|---|
Product: | Spamassassin | Reporter: | Chris Myers <cmbso0376> |
Component: | Rules | Assignee: | SpamAssassin Developer Mailing List <dev> |
Status: | NEW --- | ||
Severity: | enhancement | CC: | cmbso0376, fgnip39, jhardin |
Priority: | P2 | ||
Version: | 3.3.2 | ||
Target Milestone: | Undefined | ||
Hardware: | PC | ||
OS: | Linux | ||
Whiteboard: |
Description
Chris Myers
2013-09-06 01:38:31 UTC
(In reply to Chris Myers from comment #0)
> redirector_pattern
> m'^http:/*(?:\w+\.)?google(?:\.\w{2,3}){1,2}/translate(_[ct])?\?.
> *?(?<=[?&])u=(.*?)(?:$|[&\#])'i

ITYM: m'^https?:/*
  -----------^^

The redirector_pattern in the report began life as a cut-and-paste from my updates_spamassassin_org/72_active.cf file. It really says just http:// rather than https?:// (which I agree is an improvement). My change to the pattern is actually changing .../translate\? to /translate(_[ct])?.

errr actually I meant to say "/translate(_[ct])\?" with the backslash. :-(

(In reply to Chris Myers from comment #2)
> The redirector_pattern in the report began life as a cut-and-paste from my
> updates_spamassassin_org/72_active.cf file. It really says just http://
> rather than https?:// (which I agree is an improvement).

Indeed? I didn't actually check the current sources - if so, that's a hole.

> My change to the
> pattern is actually changing .../translate\? to /translate(_[ct])?.

...or /translate(?:_[ct])?\? :)

Can you provide a pointer to a spec from Google that documents the possible formats? Or was this just from observation?

> Indeed? I didn't actually check the current sources - if so, that's a hole.

Yup.

Agreed that getting rid of the unneeded backreference is probably a beneficial thing. I don't live and breathe Perl REs.

I've seen /translate_c and /translate_t referred to by users on the Internet (such as http://googlesystem.blogspot.com/2008/03/useful-google-translate-addresses.html) but didn't find any actual Google doc -- it may be an internal thing rather than part of the public API.

This particular bug report is driven by an actual spam message that referenced a URL beginning with:

http://translate.google.co.ke/tran%73%6C%61te_c?hl=<omitted>
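
For anyone who wants to check the variants discussed above, here is a minimal sketch that compares the http-only /translate\? pattern with the https?:// plus /translate(?:_[ct])?\? form. It uses Python's re module as a stand-in for Perl, since the constructs involved behave the same way in both. The names OLD_PATTERN, NEW_PATTERN and extract_target, the example.com target, and the query strings are all made up for illustration (the real spam URL's query string was omitted). The percent-decoding step is likewise an assumption about what has to happen before the pattern can match the obfuscated path; it is not a statement about how SpamAssassin handles such URLs internally.

```python
import re
from urllib.parse import unquote

# Pattern as described for 72_active.cf in this thread: http only, /translate? only.
OLD_PATTERN = re.compile(
    r"^http:/*(?:\w+\.)?google(?:\.\w{2,3}){1,2}/translate\?"
    r".*?(?<=[?&])u=(.*?)(?:$|[&#])",
    re.IGNORECASE,
)

# Variant combining both suggestions: https?:// and a non-capturing (?:_[ct])? group.
NEW_PATTERN = re.compile(
    r"^https?:/*(?:\w+\.)?google(?:\.\w{2,3}){1,2}/translate(?:_[ct])?\?"
    r".*?(?<=[?&])u=(.*?)(?:$|[&#])",
    re.IGNORECASE,
)


def extract_target(pattern, url):
    """Return the redirect target captured by group 1, or None if no match."""
    match = pattern.search(url)
    return match.group(1) if match else None


# Hypothetical URLs modelled on the one in the report.
plain = "http://translate.google.com/translate?u=http://example.com/"
variant = "http://translate.google.co.ke/translate_c?hl=en&u=http://example.com/page&sl=auto"
secure = "https://translate.google.com/translate_t?u=http://example.com/"
encoded = "http://translate.google.co.ke/tran%73%6C%61te_c?hl=en&u=http://example.com/page"

if __name__ == "__main__":
    for url in (plain, variant, secure, encoded, unquote(encoded)):
        print(url)
        print("  old:", extract_target(OLD_PATTERN, url))
        print("  new:", extract_target(NEW_PATTERN, url))
```

Using (?:_[ct])? rather than (_[ct])? keeps the redirect target in group 1, which is the group this sketch reads.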