Bug 63437

Summary: MergeSlashes option breaks protocol specifier in URIs
Product: Apache httpd-2 Reporter: Thomas Jarosch <thomas.jarosch>
Component: Core Assignee: Apache HTTPD Bugs Mailing List <bugs>
Status: NEW ---    
Severity: normal CC: thomas.jarosch
Priority: P2    
Version: 2.4.39   
Target Milestone: ---   
Hardware: PC   
OS: Linux   

Description Thomas Jarosch 2019-05-15 13:55:10 UTC
Hello together,

We use mod_proxy as a forward proxy for outgoing web traffic. Version 2.4.39 introduced the new MergeSlashes option, which defaults to ON.

This breaks the protocol specifier of URIs, especially in the data structure
ap_filter_t->r->uri as used by mod_proxy.

Here's a logged URI with MergeSlashes ON:


-> the second slash in the URI after "http:/" got eaten.

Turning MergeSlashes OFF fixes the issue. I guess this is an unwanted side effect of the new feature :)
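
For illustration, the symptom can be reproduced outside httpd with a naive slash merger. This is only a sketch: merge_slashes_naive is a hypothetical helper and not the actual httpd code. It assumes the merge simply collapses every run of '/' into one, regardless of where the run appears in the string.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not the httpd implementation): collapse every run
 * of consecutive '/' characters into a single '/', in place.
 * Returns the length of the resulting string. */
static size_t merge_slashes_naive(char *s)
{
    assert(s != NULL);

    char *out = s;
    char prev = '\0';

    for (const char *in = s; *in != '\0'; ++in) {
        if (*in == '/' && prev == '/') {
            continue;           /* drop the repeated slash */
        }
        prev = *in;
        *out++ = *in;
    }
    *out = '\0';
    return (size_t)(out - s);
}
```

Applied to an absolute URI such as "http://eicar.org/", a merger like this also collapses the "//" that follows the scheme, which matches the "http:/" form seen in the log.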

Best regards,
Thomas Jarosch
Comment 1 Eric Covener 2019-05-15 14:32:27 UTC
Thanks for the report. A basic forward proxy seems to work for me without any change.

It's interesting that you mentioned a filter-related pointer and "logging". Can you elaborate a bit on the symptom/config/logs? Is it only an issue with mod_proxy_html? Is the right URL forwarded, or does it blow up immediately?
Comment 2 Thomas Jarosch 2019-05-15 15:05:25 UTC
The log output is created in a custom output filter chained after mod_proxy. Sorry I didn't give more configuration details, here's the proxy config:

<Proxy *>
   ProxyAddHeaders Off
   SetOutputFilter fsav
</Proxy>

The filter code is like this:

apr_status_t fsav_filter(ap_filter_t *f, apr_bucket_brigade *buckets);

ap_register_output_filter_protocol("fsav", fsav_filter, NULL, AP_FTYPE_CONTENT_SET, 0);

I've just added this debug logger at the start of the filter function:

    ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, f->r, "[%d] Debug issue #63437: %s", getpid(), f->r->uri);

[Wed May 15 17:02:44.468188 2019] [fsav:error] [pid 19206:tid 3062889280] [client] [19206] Debug issue #63437: http:/eicar.org/

-> the URI is already broken.
Comment 3 Joe Orton 2019-09-13 14:44:01 UTC
What's the proxy configuration here?
Comment 4 Thomas Jarosch 2019-09-13 14:48:53 UTC
> What's the proxy configuration here?

The exact configuration options are in comment #2, the proxy itself is used for outgoing Internet traffic.

We use mod_proxy to apply additional filtering for outgoing requests.
Comment 5 Joe Orton 2019-09-13 15:11:46 UTC
That's not a proxy configuration.  Do you have a forward proxy (ProxyRequests on) or a reverse proxy (ProxyPass ...)?
Comment 6 Thomas Jarosch 2019-09-13 15:19:18 UTC
Ah sorry, nothing special going on (I hope):

ProxyRequests On
ProxyVia Block
ProxyBadHeader Ignore

and obviously related to this ticket:

MergeSlashes Off
Comment 7 Ruediger Pluem 2019-09-13 15:32:00 UTC
r->uri looks like you describe, but r->uri is not used by mod_proxy_http when used as a forward proxy. Have a look at r->filename and you will see the correct URL after the prefix 'proxy:'. This URL is what mod_proxy_http actually uses.
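
A filter that wants the unmerged URL could take it from r->filename instead of r->uri. The following is a minimal sketch, assuming only what is stated above, namely that r->filename holds the URL after the prefix 'proxy:'. The helper name proxy_target_url is hypothetical and is not part of the httpd API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the URL that follows the
 * "proxy:" prefix in a request's filename field, or NULL if the string
 * is NULL or does not start with that prefix. No copy is made; the
 * returned pointer aliases the input string. */
static const char *proxy_target_url(const char *filename)
{
    static const char prefix[] = "proxy:";
    const size_t prefix_len = sizeof(prefix) - 1;

    if (filename == NULL || strncmp(filename, prefix, prefix_len) != 0) {
        return NULL;
    }
    return filename + prefix_len;
}
```

Inside an output filter this would be called as proxy_target_url(f->r->filename), falling back to f->r->uri when it returns NULL.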
Comment 8 Thomas Jarosch 2019-10-02 13:32:51 UTC
I looked through our module's git history: we have been using r->uri since 2003 without any issue in this "proxy filter" module chained after mod_proxy_http.

The forward proxy runs in its own instance these days, so it was easy for us to set MergeSlashes to OFF without any bad side effects. So we have a working workaround.

The question is: Is it worth fixing this so that the r->uri field holds a proper URI again?

Could the MergeSlashes code be changed to skip any protocol identifier like http:// ftp:// xyz:// at the beginning of a URI? Or would this be only half a solution, since further slashes in URLs would still get merged?
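
To make the idea concrete, here is a rough sketch of a merger that leaves a leading "scheme://" untouched. Both function names are hypothetical and this is not a patch against httpd. It assumes a scheme is a letter followed by letters, digits, '+', '-' or '.', as in RFC 3986.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: length of a leading "scheme://" prefix,
 * or 0 if the string does not start with one. */
static size_t scheme_prefix_len(const char *s)
{
    size_t i = 0;

    if (!isalpha((unsigned char)s[0])) {
        return 0;
    }
    while (isalnum((unsigned char)s[i]) ||
           s[i] == '+' || s[i] == '-' || s[i] == '.') {
        ++i;
    }
    if (s[i] == ':' && s[i + 1] == '/' && s[i + 2] == '/') {
        return i + 3;
    }
    return 0;
}

/* Hypothetical helper: merge runs of '/' in place, but only in the part
 * of the string that follows a leading "scheme://" (if there is one). */
static void merge_slashes_keep_scheme(char *s)
{
    char *out = s + scheme_prefix_len(s);
    char prev = '\0';

    for (const char *in = out; *in != '\0'; ++in) {
        if (*in == '/' && prev == '/') {
            continue;           /* drop the repeated slash */
        }
        prev = *in;
        *out++ = *in;
    }
    *out = '\0';
}
```

With this, "http://eicar.org//a///b" becomes "http://eicar.org/a/b": the scheme survives, but repeated slashes later in the URL are still merged, which is the "half of a solution" concern.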

I'm just afraid that other modules might also be affected. A few other places in the proxy code make use of r->uri, so they would need to be inspected:

* mod_proxy_balancer.c
* error logging in mod_proxy_http.c