Bug 60594 - RFC 7230/3986 url requirement that prevents unencoded curly braces should be optional, since it breaks existing sites
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 7
Classification: Unclassified
Component: Connectors
Version: 7.0.73
Hardware: All All
Importance: P2 enhancement
Target Milestone: ---
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Duplicates: 60616
Depends on:
Blocks:
 
Reported: 2017-01-17 02:33 UTC by Geoff Groskreutz
Modified: 2017-09-25 20:33 UTC



Attachments
patch proposal (2.18 KB, patch)
2017-01-27 18:42 UTC, Coty Sutherland
whitelist patch proposal (2.07 KB, patch)
2017-01-27 20:47 UTC, Coty Sutherland
whitelist proposal limiting characters with docs (2.87 KB, patch)
2017-01-30 14:54 UTC, Coty Sutherland
Updated patch proposal including a warning message for characters that aren't allowed (3.86 KB, patch)
2017-01-31 16:53 UTC, Coty Sutherland

Description Geoff Groskreutz 2017-01-17 02:33:11 UTC
Using the protocol="HTTP/1.1" connector (Coyote)

After upgrading a site to Tomcat 7.0.73 from 7.0.72 or earlier, a URL with an unencoded { or } (e.g. http://my.com?filter={"search":"isvalid"}) now returns a 400 error code and logs the following error message:

"INFO: Error parsing HTTP request header
 Note: further occurrences of HTTP header parsing errors will be logged at DEBUG level.
java.lang.IllegalArgumentException: Invalid character found in the request target. The valid characters are defined in RFC 7230 and RFC 3986"
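
For clients that can be changed, the 400 goes away once the query value is percent-encoded before it is sent. A minimal sketch, assuming a Java client; the class and method names are hypothetical and only the standard java.net.URLEncoder is used:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EncodeQueryValue {
    // Percent-encode a single query parameter value so that no raw
    // '{', '}' or '"' ends up in the request target.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be present on every JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String filter = "{\"search\":\"isvalid\"}";
        // prints http://my.com/?filter=%7B%22search%22%3A%22isvalid%22%7D
        System.out.println("http://my.com/?filter=" + encode(filter));
    }
}
```

Note that URLEncoder applies form encoding, so a space in the value becomes '+' rather than '%20'.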

Resolution:
Since this is a breaking change (i.e. a regression), there should be an option to turn this check off (while still reporting the first occurrence as shown above), so that any existing site affected by it can choose to ignore the failure and continue as before, and change its application at a later date if it deems the security risk acceptable.

Defaulting the option to true (to enable the check) is perfectly fine, as long as there is an option in a server and/or application config file to disable it, and proper documentation on it.

Either this, or you clearly state in the release notes of 7.0.73, exactly what will break, and recommend that users do not perform the Tomcat update, if they are not ready to change their applications to comply, but I think this would open up an even bigger can of worms.

Instead of just saying:
"Add additional checks for valid characters to the HTTP request line parsing so invalid request lines are rejected sooner. (markt)" - this tells us nothing about the impending doom we may face.

But, I would recommend just giving us the option to decide for ourselves.
Comment 1 Mark Thomas 2017-01-17 22:28:23 UTC
Given that using an unencoded '{' or '}' in a URL is contrary to the RFCs and that the fix that tightened the validation rules was in response to a security vulnerability (CVE-2016-6816) I think it is unlikely that an option will be introduced to make this validation optional.

It is quite likely that some sites could safely tolerate some characters. However, it is also likely that the 'safe' set of invalid characters will vary from site to site. That would therefore require a more complex configuration option than simply allowing or disallowing a fixed set of characters.

Those interested in proposing a patch should look at lines 74-78 of org.apache.tomcat.util.http.parser.HttpParser although I'll repeat I think it is unlikely such a patch would be accepted.

All that code is static which means configuration via system properties - something I'd prefer to see less of rather than more of in Tomcat.

For completeness, '|' seems to be another character that is fairly widely used in unencoded form when it should be encoded.

Finally, changes related to conformance to the relevant RFCs and Java EE specifications are not treated as regressions. Therefore, I have moved this to an enhancement request.
Comment 2 Remy Maucherat 2017-01-20 13:21:45 UTC
*** Bug 60616 has been marked as a duplicate of this bug. ***
Comment 3 Coty Sutherland 2017-01-27 18:42:03 UTC
Created attachment 34684 [details]
patch proposal

In response to the numerous complaints on the users list I decided to give this a shot. I added a system property which contains a blacklist that's used for validation of request targets rather than the long if statement that was there. If a user needs to allow unencoded | characters then they can just remove it from the blacklist defined in the tomcat.util.http.parser.HttpParser.blacklist property.

If this looks good to everyone I can push it to whichever versions of tomcat we want to allow an option for.
Comment 4 Mark Thomas 2017-01-27 19:46:19 UTC
Allowing some of those (e.g. space) is extremely dangerous and should not be allowed under any circumstances.

I generally dislike configuration via system property. That said, making this per Connector will be significantly more invasive.

Any proposed patch needs to include documentation. That documentation needs to include a very large, very clear warning that deviating from the default is a security risk.

If this feature is implemented, I'd prefer to see the option to allow illegal characters limited to a much smaller sub-set.
Comment 5 Coty Sutherland 2017-01-27 20:11:09 UTC
(In reply to Mark Thomas from comment #4)
> I generally dislike configuration via system property. That said, making
> this per Connector will be significantly more invasive.

I agree on both points. The system property seemed to be the least invasive way to achieve the desired result.
 
> Any proposed patch needs to include documentation. That documentation needs
> to include a very large, very clear warning the deviating from the default
> is a security risk.

Also agreed. Where would that documentation go?
 
> If this feature is implemented, I'd prefer to see the option to allow
> illegal characters limited to a much smaller sub-set.

Other than space, which characters should absolutely be excluded in all cases? I can create a secondary list containing those and programmatically add them if a user tries to remove them from the blacklist.

Also, my initial patch used a whitelist instead of a blacklist so that the system property was either commented out by default, or contained a few characters that were the exception to the rule. I inverted it to a blacklist to remove some logic and make it perform better; do you think that a whitelist would work better here? I can provide that patch also.
Comment 6 eolivelli 2017-01-27 20:36:17 UTC
Hi, for my use cases I would like to have just a whitelist and let Tomcat handle all the RFC blacklisted chars automatically. In my case I had to whitelist curly braces and pipe.
Comment 7 Coty Sutherland 2017-01-27 20:47:04 UTC
Created attachment 34687 [details]
whitelist patch proposal

For reference, and so I don't accidentally delete it :)
Comment 8 Mark Thomas 2017-01-27 20:55:49 UTC
I think I prefer the whitelist option but I'd like to see it limited to - at this point - '{', '}' and '|'. Other characters can be considered on a case by case basis.

Documentation should go in the system properties section of the config docs although I'm still mulling over what a Connector config option might look like.
Comment 9 Coty Sutherland 2017-01-30 14:54:33 UTC
Created attachment 34694 [details]
whitelist proposal limiting characters with docs

OK, here's an updated whitelist patch restricting the characters that are accepted to '{', '}', and '|'. I also included documentation for the property.

Let me know if that works better for you :)
Comment 10 Mark Thomas 2017-01-31 10:11:54 UTC
Thanks for the updated patch. I like the overall design. Some detail comments:
- I think a different name is required. We might want to override other restrictions in the future. Maybe requestTargetAllow 
- The docs need to state which characters are valid in the allowed list
- What to do if some other invalid character is placed on the allowed list. Log a warning?
- I'm still undecided on whether this should be per connector configuration

We also need to decide which versions to add this to. I'm currently thinking:
- 7.0.x - yes
- 8.0.x - yes
- 8.5.x - maybe
- 9.0.x - no
Comment 11 eolivelli 2017-01-31 10:17:27 UTC
Please fix it in Tomcat 8.5.X too
Comment 12 Coty Sutherland 2017-01-31 16:53:23 UTC
Created attachment 34698 [details]
Updated patch proposal including a warning message for characters that aren't allowed

(In reply to Mark Thomas from comment #10)
> Thanks for the updated patch. I like the overall design. Some detail
> comments:

No problem.

> - I think a different name is required. We might want to override other
> restrictions in the future. Maybe requestTargetAllow

That makes sense.

> - The docs need to state which characters are valid in the allowed list

Agreed.

> - What to do if some other invalid character is placed on the allowed list.
> Log a warning?

I thought about that but since there isn't any logging there at the moment I let it go. I think it's a good idea to log a warning though, so I'll add that.

> - I'm still undecided on whether this should be per connector configuration

That would be nice, but I haven't dug into the code enough to be able to quickly provide a patch for it.
 
> We also need to decide which versions to add this to. I currently thinking:
> - 7.0.x - yes
> - 8.0.x - yes

+1

> - 8.5.x - maybe

I'd vote yes on adding the option to 8.5.x because the stable version is already out and the behavior has changed. We obviously don't want to continue allowing broken clients to work, but I don't think we can change this behavior in a stable version, as evidenced by the users list complaints :)

> - 9.0.x - no

+1

I also noticed that the property being parsed was including the quotes, so I changed the commented out example accordingly.
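
The design agreed in the comments above (a whitelist limited to '{', '}' and '|', with a warning for anything else placed on it) can be sketched roughly as follows. This is an illustration only, not the attached patch: the class and method names are hypothetical, and isRejectedByDefault is a simplified stand-in that assumes the rejected set is control characters, non-ASCII, space and the characters named in this thread.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.logging.Logger;

public class RequestTargetAllowSketch {
    private static final Logger LOG =
            Logger.getLogger(RequestTargetAllowSketch.class.getName());

    // The only characters this thread agrees may be whitelisted.
    private static final String PERMITTED = "{}|";

    // Assumed default rejection set, taken from the characters named in this thread.
    private static final String REJECTED = " \"#<>\\^`{|}";

    // Parse the property value; ignore (and warn about) anything outside PERMITTED.
    static Set<Character> parseAllowed(String propertyValue) {
        Set<Character> allowed = new HashSet<>();
        if (propertyValue == null) {
            return allowed; // default: nothing is whitelisted
        }
        for (char c : propertyValue.toCharArray()) {
            if (PERMITTED.indexOf(c) >= 0) {
                allowed.add(c);
            } else {
                LOG.warning("Character '" + c + "' cannot be allowed and is ignored");
            }
        }
        return allowed;
    }

    // Simplified stand-in for the default request target check.
    static boolean isRejectedByDefault(char c) {
        return c < 0x20 || c >= 0x7f || REJECTED.indexOf(c) >= 0;
    }

    // A target is valid if every rejected character is explicitly whitelisted.
    static boolean isValidTarget(String target, Set<Character> allowed) {
        for (char c : target.toCharArray()) {
            if (isRejectedByDefault(c) && !allowed.contains(c)) {
                return false;
            }
        }
        return true;
    }
}
```

With this shape, putting a space or '<' in the property value has no effect beyond the logged warning, which matches the requirement that dangerous characters can never be enabled.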
Comment 13 eolivelli 2017-02-06 15:02:06 UTC
Coty, the patch looks good to me, can you please add the following chars to the list of allowed characters ?

'\"' (double quote)
'#' (sharp)
'<' (left angle bracket)
'>' (right angle bracket)
'\\' (backslash)
'^' (caret)
'`' (grave accent)

I think in some cases I would need the "space" too, but Mark remarked that it would be very dangerous
Comment 14 Mark Thomas 2017-02-06 15:09:06 UTC
You need to make a case for each of those to be added to the potentially allowed list. Without any such justification, I am -1 on expanding it beyond the current three allowed characters.
Comment 15 eolivelli 2017-02-06 15:16:37 UTC
OK Mark at this moment I'm running a patch in production to make all the characters allowed.

I only have evidence of trouble with curly braces and pipe characters, so the patch looks good to me.

I will wait for the release of this patch in an official 8.5.x Tomcat version and deploy it to production.

In case I need further characters I will create a new issue


Thank you
Comment 16 Remy Maucherat 2017-02-06 15:36:15 UTC
-1 as well for any additional characters. People who are that desperate to run into trouble can patch Tomcat easily.
Comment 17 Coty Sutherland 2017-02-06 16:23:52 UTC
OK, cool. So unless someone else objects to the patch as-is, I'll commit it to 7.0.x - 8.5.x shortly.
Comment 18 Coty Sutherland 2017-02-07 18:33:01 UTC
Fixed in:

- 8.5.x for 8.5.12 onwards
- 8.0.x for 8.0.42 onwards
- 7.0.x for 7.0.76 onwards
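
On those versions, enabling the exception might look like the fragment below. This is a sketch under assumptions: it uses the property name requestTargetAllow agreed in comments 10 and 12, assumes the value is written without quotes (per comment 12), and assumes conf/catalina.properties is where the system property is set; the documentation shipped with the release is authoritative.

```properties
# conf/catalina.properties (assumed location; any way of setting a JVM
# system property before Tomcat starts should be equivalent)
# WARNING: deviating from the default is a security risk (see CVE-2016-6816).
# Only '{', '}' and '|' are accepted here; other characters are not allowed.
tomcat.util.http.parser.HttpParser.requestTargetAllow=|{}
```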
Comment 19 Lulseged Zerfu 2017-05-18 07:42:52 UTC
Hi

 We have found that we have problems with some characters that are not allowed in the request URI, and would like to know if any filter or valve can be applied to encode them until clients get updated, instead of responding with 400 Bad Request.

 We have millions of clients (both Android and iOS) that need to follow the RFC, but that will take time, and until then there must be some workaround that can be used.

BR
Lulseged
Comment 20 Mark Thomas 2017-05-18 08:40:54 UTC
(In reply to Lulseged Zerfu from comment #19)
> Hi
> 
>  We have found that we have problems with some characters that are not
> allowed in request URI and would like to know if any filter or valve can be
> applied to encode until clients get updated instead of responding with 400
> Bad Request.

No. The invalid request is rejected long before the execution reaches a Valve or Filter.

As described above, Tomcat 8.5.x and earlier have a configuration option to allow '{', '}' and '|'. If you want to add other characters to the possible whitelist values, you need to make a case for them.

>  We have millions of clients (both android and ios) that needs to follow the
> RFC but it will take time but until then there must be some work around that
> can be used.

Running Tomcat behind a more lenient reverse proxy that encodes the invalid characters before the request is passed to Tomcat is another solution. You should be aware that generally, and for the same reasons Tomcat tightened request target parsing, other web servers will head in the same direction as Tomcat over time and start rejecting these requests.
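
The encoding step such a proxy would perform can be sketched as below. This is a hypothetical illustration in Java, not the configuration of any particular proxy product: the class name is invented, and the set of characters treated as invalid is assumed from the characters named in this thread. Existing '%' escapes pass through untouched, so already-encoded input is not double-encoded.

```java
import java.nio.charset.StandardCharsets;

public class LenientTargetEncoder {
    // Assumed set of characters that are rejected when unencoded.
    private static final String REJECTED = " \"#<>\\^`{|}";

    // Percent-encode only the bytes that would be rejected; leave the rest as-is.
    static String encodeInvalid(String target) {
        StringBuilder out = new StringBuilder();
        for (byte b : target.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xff;
            if (v < 0x20 || v >= 0x7f || REJECTED.indexOf(v) >= 0) {
                out.append(String.format("%%%02X", v));
            } else {
                out.append((char) v);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints /?filter=%7B%22search%22:%22isvalid%22%7D
        System.out.println(encodeInvalid("/?filter={\"search\":\"isvalid\"}"));
    }
}
```

One caveat with this approach: the application behind Tomcat then sees the decoded value as usual, but anything that inspects the raw request line will see the encoded form.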
Comment 21 Coty Sutherland 2017-05-25 14:27:33 UTC
Can anyone see any adverse effects to adding angle brackets to the whitelist? I have a customer that is using unencoded angle brackets around their session IDs in the URL which they can't change at this point and the CVE fix broke their application. If there aren't any adverse effects I'll add them to the list for my distribution, and to tomcat if anyone else needs them.
Comment 22 Mark Thomas 2017-05-25 18:35:43 UTC
You mean '<' and '>' ?

There is always the risk that unexpected reverse proxy behaviour will trigger a CVE-2016-6816 like issue but that risk exists for any white-listed character that should really be encoded.

I don't see it affecting the URL parsing in Tomcat.

If the undecoded URL is used in any XML like output it is likely to break it. But any user that is using '<' and '>' will be facing that problem already.

They look to be higher risk in terms of breaking stuff, but not in a security sense.

+1 to your approach.
Comment 23 Coty Sutherland 2017-05-25 19:17:30 UTC
(In reply to Mark Thomas from comment #22)
> You mean '<' and '>' ?

Yes.
 
> There is always the risk that unexpected reverse proxy behaviour will
> trigger a CVE-2016-6816 like issue but that risks exists for any
> white-listed character that should really be encoded.
> 
> I don't see it affecting the URL parsing in Tomcat.
> 
> If the undecoded URL is used in any XML like output it is likely to break
> it. But any user that is using '<' and '>' will be facing that problem
> already.
> 
> They look to be higher risk in terms of breaking stuff, but not in a
> security sense.
> 
> +1 to your approach.

OK, cool. Would we want to add them to tomcat then? It's a small code change, so I have no problems with Fedora/RHEL diverging a bit here if we don't want them.
Comment 24 Lulseged Zerfu 2017-06-08 07:13:27 UTC
Hi

 A reverse proxy is not an option and I would like to make a case where we allow double quotes in request URLs as '{', '}' and '|' are allowed today by configuring:

tomcat.util.http.parser.HttpParser.requestTargetAllow="

How can I make a case for this?

BR
Lulseged Zerfu
Comment 25 Mark Thomas 2017-06-08 08:32:52 UTC
I'm neutral on adding '<' and '>' as allowed options.

I think '"' is in the same category. i.e. there is the risk that unexpected reverse proxy behaviour will trigger a CVE-2016-6816 like issue, no parsing issues and likelihood of breakage if the URL is used in HTML or similar without escaping.
Comment 26 Lulseged Zerfu 2017-06-08 09:56:33 UTC
Hi

 We don't see any way out when millions of terminals are not working because Tomcat has restricted '"' from being part of a request URL.

 Terminals will not comply overnight but are starting to comply slowly. Therefore we need to allow '"' for a transitional period before totally disallowing the '"' char in a request URL.

 Staying on Tomcat version 8.0.36 is still risky because CVE-2016-6816 can be triggered.

BR
Lulseged Zerfu
Comment 27 Lulseged Zerfu 2017-06-12 07:48:39 UTC
Hi

 Any comment on whether you will allow '"' in request URLs? At the end of the day we are the ones taking the risk.

BR
Lulseged
Comment 28 Lulseged Zerfu 2017-06-21 06:39:31 UTC
Hi

 Let us know if Tomcat will add '"', as '{', '}' and '|' were added, so that we can continue using the latest Tomcat releases.

 Please let us know what you think.

BR
Lulseged
Comment 29 Jeff 2017-09-25 20:33:43 UTC
I would like to ask for the ^ character. I'm not sure how to make a case for this. It's kind of important for us because we have been using this to denote financial indexes (similar to Yahoo Finance) and we have a large number of client installs that would all have to change to enforce URI encoding.

This is basically holding up our migration to Tomcat.

I think it would be preferable if we could select whatever characters we want to override. It's our site and we are the ones responsible for the security and functionality. Every entity that uses Tomcat might need different characters for different reasons. It would be easier to transition if they had access to an override. Clearly the default should be to override nothing but some sites are going to need this or that character to transition.

I could ask to have our clients url encode everything but realistically that could take years to complete.

I would prefer that this exemption be extended rather than having to hack the code base on our own, as security updates would then be more timely.