Bug 63437 - MergeSlashes option breaks protocol specifier in URIs
Summary: MergeSlashes option breaks protocol specifier in URIs
Status: NEW
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: Core (show other bugs)
Version: 2.4.39
Hardware: PC Linux
: P2 normal (vote)
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-05-15 13:55 UTC by Thomas Jarosch
Modified: 2020-03-25 11:38 UTC (History)
1 user (show)



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Thomas Jarosch 2019-05-15 13:55:10 UTC
Hello together,

we use mod_proxy as a forward proxy for outgoing web traffic. Version 2.4.39 introduced the new MergeSlashes option which defaults to ON.

This breaks the protocol specifier of URIs, especially in the data structure
apfilter_t->r->uri as used by mod_proxy.

Here's a logged URI with MergeSlashes ON:

http:/2016.eicar.org/download/eicar.com

-> the second slash in the URI after "http:/" got eaten.

Turning MergeSlashes OFF fixes the issue. I guess this is an unwanted side effect of the new feature :)

Best regards,
Thomas Jarosch
Comment 1 Eric Covener 2019-05-15 14:32:27 UTC
Thanks for the report, basic FWD proxy seems to work for me without any change.

It's interesting that you mentioned a filter related pointer and "logging". Can you elaborate a bit on the symptom/config/logs?  Is it only an issue with mod_proxy_html? Is the right URL forwarded does it blow up immediately?
Comment 2 Thomas Jarosch 2019-05-15 15:05:25 UTC
The log output is created in a custom output filter chained after mod_proxy. Sorry I didn't give more configuration details, here's the proxy config:

<Proxy *>
   ProxyAddHeaders Off
   SetOutputFilter fsav
</Proxy>


The filter code is like this:

apr_status_t fsav_filter(ap_filter_t *f, apr_bucket_brigade *buckets);

ap_register_output_filter_protocol("fsav", fsav_filter, NULL, AP_FTYPE_CONTENT_SET, 0);


I've just added this debug logger at the start of the filter function:

    ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, f->r, "[%d] Debug issue #63437: %s", getpid(), f->r->uri);


Output:
[Wed May 15 17:02:44.468188 2019] [fsav:error] [pid 19206:tid 3062889280] [client 127.0.0.1:44788] [19206] Debug issue #63437: http:/eicar.org/

-> the URI is already broken.
Comment 3 Joe Orton 2019-09-13 14:44:01 UTC
What's the proxy configuration here?
Comment 4 Thomas Jarosch 2019-09-13 14:48:53 UTC
> What's the proxy configuration here?

The exact configuration options are in comment #2, the proxy itself is used for outgoing Internet traffic.

We use mod_proxy to apply additional filtering for outgoing requests.
Comment 5 Joe Orton 2019-09-13 15:11:46 UTC
That's not a proxy configuration.  Do you have a forward proxy (ProxyRequests on) or a reverse proxy (ProxyPass ...)?
Comment 6 Thomas Jarosch 2019-09-13 15:19:18 UTC
Ah sorry, nothing special going on (I hope):

ProxyRequests On
ProxyVia Block
ProxyBadHeader Ignore

and obviously related to this ticket:

MergeSlashes Off
Comment 7 Ruediger Pluem 2019-09-13 15:32:00 UTC
r->uri looks like you describe, but r->uri is not used by mod_proxy_http when used as a forward proxy. Have a look at r->filename and you see the correct the correct URL after the prefix 'proxy:'. This URL is actually used by mod_proxy_http.
Comment 8 Thomas Jarosch 2019-10-02 13:32:51 UTC
I looked around our module's git history and we're using r->uri since 2003 without any issue in this "proxy filter" module chained after mod_proxy_http.

The forward proxy is running in an own instance these days, so it was easy for us to configure MergeSlashes to OFF without any bad side effects. So we have a working workaround.

The question is: Is it worth fixing to turn the r->uri field into a proper URI again?

Could the MergeSlashes code be changed into skipping any protocol identifier like http:// ftp:// xyz:// in the beginning of an URI? Or would this just be half of a solution since further slashes in URLs still get merged?

I'm just afraid that other modules might also be affected. A few other places in the proxy code make use of r->uri, so they would need to be inspected:

* mod_proxy_balancer.c
* error logging in mod_proxy_http.c
Comment 9 chris@familie-hilgers.com 2020-03-24 16:53:33 UTC
Here is a reproducer to show the impact of the fix on a forward Proxy using ProxyRequests On


quoting from: https://httpd.apache.org/security/vulnerabilities_24.html
Apache httpd URL normalization inconsistincy (CVE-2019-0220)

When the _path_ component of a request URL contains multiple consecutive slashes ('/'), directives such as LocationMatch and RewriteRule must account for duplicates in regular expressions while other aspects of the servers processing will implicitly collapse them.

The path component is defined here: https://tools.ietf.org/html/rfc1738
3.3. HTTP

   The HTTP URL scheme is used to designate Internet resources
   accessible using HTTP (HyperText Transfer Protocol).

   The HTTP protocol is specified elsewhere. This specification only
   describes the syntax of HTTP URLs.

   An HTTP URL takes the form:

      http://<host>:<port>/<path>?<searchpart>  <<<<------

   where <host> and <port> are as described in Section 3.1. If :<port>
   is omitted, the port defaults to 80.  No user name or password is
   allowed.  <path> is an HTTP selector, and <searchpart> is a query
   string. The <path> is optional, as is the <searchpart> and its
   preceding "?". If neither <path> nor <searchpart> is present, the "/"
   may also be omitted.

   Within the <path> and <searchpart> components, "/", ";", "?" are
   reserved.  The "/" character may be used within HTTP to designate a
   hierarchical structure.


The CVE Fix does not only merge slashes in the path part but does it on http://, too.

Her a reproducer based on a fresh Debian 10 installation.

install debian 10
a2enmod proxy rewrite proxy_http ssl
cd /etc/apache2/sites-available

add to 000-default.conf:
...

        LogLevel rewrite:trace4
        ProxyRequests On
        ProxyVia On
        SSLProxyEngine on

        #w/a2
        #MergeSlashes off
        <Proxy *>
                RewriteEngine On
                #w/a1
                #RewriteCond %{REQUEST_URI} http:/httpd.apache.org/(.*)
                RewriteCond %{REQUEST_URI} http://httpd.apache.org/(.*)
                RewriteRule .*             https://httpd.apache.org/%1  [P]
        </Proxy>


</VirtualHost>


apachectl configtest
apachectl graceful

curl -x localhost:80  http://httpd.apache.org/weg/

Test 1:
with  RewriteCond %{REQUEST_URI} http://httpd.apache.org/(.*)

results in: /var/log/apache2/error.log:
[Tue Mar 24 17:25:43.905466 2020] [rewrite:trace4] [pid 2995:tid 139950494312192] mod_rewrite.c(483): [client ::1:36732] ::1 - - [httpd.apache.org/sid#7f48cb4a1d20][rid#7f48c83180a0/initial] [perdir */] RewriteCond: input='http:/httpd.apache.org/weg/' pattern='http://httpd.apache.org/(.*)' => not-matched

input='http:/httpd.apache.org/weg/  <<-- http:/


Test 2: w/a1 or w/a2 activated MergeSlashes ON or http:/httpd....
[Tue Mar 24 17:26:47.457483 2020] [rewrite:trace4] [pid 3069:tid 139978218665728] mod_rewrite.c(483): [client ::1:36736] ::1 - - [httpd.apache.org/sid#7f4f3ca36d20][rid#7f4f3c0ac0a0/initial] [perdir */] RewriteCond: input='http://httpd.apache.org/weg/' pattern='http://httpd.apache.org/(.*)' => matched

input='http://httpd.apache.org/weg/' <<-- http://


workaround3 would be replace REQUEST_URI with REQUEST_FILENAME

To me this is either an incomplete fix as REQUEST_FILENAME is not affected. If the CVE does
indicate REQUEST_FILENAME then it looks like a not optimal fix as it breaks existing installations
without warning.


Kind regards
Christian
Comment 10 Eric Covener 2020-03-24 22:54:10 UTC
(In reply to chris@familie-hilgers.com from comment #9)
> Here is a reproducer to show the impact of the fix on a forward Proxy using
> ProxyRequests On

(and using mod_rewrite)

It seems like we could easily skip over strlen(r->parsed_uri.scheme) when the scheme was present.  Thoughts Joe/Ruediger?
Comment 11 Eric Covener 2020-03-24 23:25:21 UTC
(In reply to Eric Covener from comment #10)
> (In reply to chris@familie-hilgers.com from comment #9)
> > Here is a reproducer to show the impact of the fix on a forward Proxy using
> > ProxyRequests On
> 
> (and using mod_rewrite)
> 
> It seems like we could easily skip over strlen(r->parsed_uri.scheme) when
> the scheme was present.  Thoughts Joe/Ruediger?

Not so easy (even adding the  bytes for ://).  For a normal request, by the time we are here r->uri is already just the path anyway.
Comment 12 Eric Covener 2020-03-25 02:12:05 UTC
> For a normal request, by the
> time we are here r->uri is already just the path anyway.

My confusion was due to the behavior w/ and w/o proxy_detect() which runs in post_read_request:        

    r->uri = r->unparsed_uri;
Comment 13 Joe Orton 2020-03-25 11:38:07 UTC
We've had a small number of reports that the CVE-2019-0220 fix breaks previously working configurations, usually quite surprising or unusual configs which have simple workarounds.  Most often these are cases which do clearly need "MergeSlashes Off" configured, e.g. where an application embeds full URLs within the path of an http[s]:// URL.

The one given here is a surprising combination of forward and reverse proxy (RewriteRule .. [P]), and it is not obvious to me how we can address it w/o reverting the fix.

The string matched by mod_rewrite for a forward proxy configuration with the proxy: prefix is kind of an internal implementation detail - pseudo-URI - with the proxy: prefix - and behaves oddly.

For this configuration something like:

<ProxyMatch http://httpd.apache.org/>
  RewriteEngine On
  RewriteRule httpd.apache.org/(.*)  https://httpd.apache.org/$1  [L]
</ProxyMatch>

would seem more natural.