|Summary:||MergeSlashes option breaks protocol specifier in URIs|
|Product:||Apache httpd-2||Reporter:||Thomas Jarosch <thomas.jarosch>|
|Component:||Core||Assignee:||Apache HTTPD Bugs Mailing List <bugs>|
Description Thomas Jarosch 2019-05-15 13:55:10 UTC
Hello everyone, we use mod_proxy as a forward proxy for outgoing web traffic. Version 2.4.39 introduced the new MergeSlashes option, which defaults to ON. This breaks the protocol specifier of URIs, specifically in the field ap_filter_t->r->uri as used by mod_proxy. Here's a logged URI with MergeSlashes ON: http:/2016.eicar.org/download/eicar.com -> the second slash in the URI after "http:/" got eaten. Turning MergeSlashes OFF fixes the issue. I guess this is an unwanted side effect of the new feature :) Best regards, Thomas Jarosch
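To illustrate the symptom described above, here is a minimal sketch of a naive slash-merging routine. It is not the httpd implementation; the function name merge_slashes_naive is made up for this example. It only shows why collapsing every run of '/' also damages the "//" that follows the scheme.

```c
#include <string.h>

/* Collapse every run of consecutive '/' into a single '/', in place.
 * Hypothetical helper for illustration only, not httpd code. */
static void merge_slashes_naive(char *s)
{
    char *dst = s;
    const char *src = s;

    while (*src != '\0') {
        *dst++ = *src;              /* copy the current character */
        if (*src == '/') {
            while (*src == '/') {   /* skip the rest of the slash run */
                src++;
            }
        } else {
            src++;
        }
    }
    *dst = '\0';
}
```

Applied to "http://2016.eicar.org/download/eicar.com", this routine yields "http:/2016.eicar.org/download/eicar.com", which matches the logged URI in the report.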
Comment 1 Eric Covener 2019-05-15 14:32:27 UTC
Thanks for the report. Basic FWD proxy seems to work for me without any change. It's interesting that you mentioned a filter-related pointer and "logging". Can you elaborate a bit on the symptom/config/logs? Is it only an issue with mod_proxy_html? Is the right URL forwarded, or does it blow up immediately?
Comment 2 Thomas Jarosch 2019-05-15 15:05:25 UTC
The log output is created in a custom output filter chained after mod_proxy. Sorry I didn't give more configuration details, here's the proxy config:

  <Proxy *>
    ProxyAddHeaders Off
    SetOutputFilter fsav
  </Proxy>

The filter code is like this:

  apr_status_t fsav_filter(ap_filter_t *f, apr_bucket_brigade *buckets);
  ap_register_output_filter_protocol("fsav", fsav_filter, NULL, AP_FTYPE_CONTENT_SET, 0);

I've just added this debug logger at the start of the filter function:

  ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, f->r, "[%d] Debug issue #63437: %s", getpid(), f->r->uri);

Output:

  [Wed May 15 17:02:44.468188 2019] [fsav:error] [pid 19206:tid 3062889280] [client 127.0.0.1:44788]  Debug issue #63437: http:/eicar.org/

-> the URI is already broken.
Comment 3 Joe Orton 2019-09-13 14:44:01 UTC
What's the proxy configuration here?
Comment 4 Thomas Jarosch 2019-09-13 14:48:53 UTC
> What's the proxy configuration here? The exact configuration options are in comment #2, the proxy itself is used for outgoing Internet traffic. We use mod_proxy to apply additional filtering for outgoing requests.
Comment 5 Joe Orton 2019-09-13 15:11:46 UTC
That's not a proxy configuration. Do you have a forward proxy (ProxyRequests on) or a reverse proxy (ProxyPass ...)?
Comment 6 Thomas Jarosch 2019-09-13 15:19:18 UTC
Ah sorry, nothing special going on (I hope):

  ProxyRequests On
  ProxyVia Block
  ProxyBadHeader Ignore

and obviously related to this ticket:

  MergeSlashes Off
Comment 7 Ruediger Pluem 2019-09-13 15:32:00 UTC
r->uri looks like you describe, but r->uri is not used by mod_proxy_http when used as a forward proxy. Have a look at r->filename and you will see the correct URL after the prefix 'proxy:'. This URL is actually used by mod_proxy_http.
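As a sketch of the suggestion above, a filter could take the target URL from r->filename instead of r->uri by stripping the 'proxy:' prefix. The helper name proxy_target_url is hypothetical, and the code deliberately works on a plain C string so that it does not depend on the httpd headers.

```c
#include <stddef.h>
#include <string.h>

/* Return the URL that follows the "proxy:" prefix, or NULL when the
 * string does not carry that prefix (for example, a non-proxied request).
 * Hypothetical helper for illustration; not part of the httpd API. */
static const char *proxy_target_url(const char *filename)
{
    static const char prefix[] = "proxy:";
    const size_t len = sizeof(prefix) - 1;

    if (filename == NULL || strncmp(filename, prefix, len) != 0) {
        return NULL;
    }
    return filename + len;
}
```

In a filter this might be called as proxy_target_url(f->r->filename), falling back to f->r->uri when it returns NULL. Whether that fallback is appropriate depends on the module.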
Comment 8 Thomas Jarosch 2019-10-02 13:32:51 UTC
I looked through our module's git history: we have been using r->uri since 2003 without any issue in this "proxy filter" module chained after mod_proxy_http. The forward proxy runs in its own instance these days, so it was easy for us to set MergeSlashes to OFF without any bad side effects. So we have a working workaround. The question is: Is it worth fixing this so that the r->uri field is a proper URI again? Could the MergeSlashes code be changed to skip any protocol identifier like http:// ftp:// xyz:// at the beginning of a URI? Or would this be only half a solution, since further slashes in URLs would still get merged? I'm just afraid that other modules might also be affected. A few other places in the proxy code make use of r->uri, so they would need to be inspected:
* mod_proxy_balancer.c
* error logging in mod_proxy_http.c
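The idea of skipping the protocol identifier could be sketched as follows. This is only an illustration of the proposal, not a patch against httpd; the names skip_scheme and merge_slashes_keep_scheme are made up for this example. The scheme check assumes the RFC 3986 shape: a letter followed by letters, digits, '+', '-' or '.'.

```c
#include <ctype.h>
#include <string.h>

/* Return a pointer just past a leading "scheme://", or s itself when the
 * string does not start with one. Hypothetical helper. */
static char *skip_scheme(char *s)
{
    char *p = s;

    if (!isalpha((unsigned char)*p)) {
        return s;
    }
    while (isalnum((unsigned char)*p) || *p == '+' || *p == '-' || *p == '.') {
        p++;
    }
    if (p[0] == ':' && p[1] == '/' && p[2] == '/') {
        return p + 3;
    }
    return s;
}

/* Merge runs of '/' in place, but leave a leading "scheme://" untouched. */
static void merge_slashes_keep_scheme(char *uri)
{
    char *dst = skip_scheme(uri);
    const char *src = dst;

    while (*src != '\0') {
        *dst++ = *src;
        if (*src == '/') {
            while (*src == '/') {   /* skip the rest of the slash run */
                src++;
            }
        } else {
            src++;
        }
    }
    *dst = '\0';
}
```

With this sketch, "http://eicar.org//a///b" becomes "http://eicar.org/a/b". As noted in the comment above, slashes after the scheme are still merged, so a URL embedded later in the path would still be altered.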