Browsers won't escape '{|}' in parameters area. https://bugzilla.mozilla.org/show_bug.cgi?id=1451347 They state that they don't have to comply. This link "http://localhost:8090/jira/browse/SSP-15?jql=%20text%20=%20{ksi|delta}" is completely legit for firefox and chrome. Who should adjust?
Reading RFC 3986, the query part of the URL is the same as the rest, I don't understand the answer that it is different. I guess the assertion that the query string is different makes this worth being a separate BZ. Is RFC 3986 now out of date ? URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ] hier-part = "//" authority path-abempty / path-absolute / path-rootless / path-empty path-abempty = *( "/" segment ) path-absolute = "/" [ segment-nz *( "/" segment ) ] path-noscheme = segment-nz-nc *( "/" segment ) path-rootless = segment-nz *( "/" segment ) path-empty = 0<pchar> segment = *pchar segment-nz = 1*pchar segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" ) ; non-zero-length segment without any colon ":" pchar = unreserved / pct-encoded / sub-delims / ":" / "@" query = *( pchar / "/" / "?" ) fragment = *( pchar / "/" / "?" ) pct-encoded = "%" HEXDIG HEXDIG unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" reserved = gen-delims / sub-delims gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@" sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
(In reply to Remy Maucherat from comment #1) > Reading RFC 3986, the query part of the URL is the same as the rest, I don't > understand the answer that it is different. I guess the assertion that the > query string is different makes this worth being a separate BZ. Is RFC 3986 > now out of date ? > ... No, RFC 3986 is the proper spec. The problem is that some browser vendors thought it would be ok to write their own spec.
No, RFC 3986 is not out of date. It appears that the browsers are working to a different spec: https://url.spec.whatwg.org/#query-state It is the cookie debacle all over again. Sigh. At least this time there appear to be some efforts to work to a spec - even if it isn't the RFC. The six characters where there appears to be a difference of opinion between the RFC 3986 and the spec the browsers are working to are: "\", "^", "`", "{", "|", "}" I guess we could consider (optionally) relaxing the query string parsing to allow these but I don't like it. We have RFCs for a reason.
Ok ! :( In that other spec, the query string is indeed different from the path, but the path itself may be a bit different from our validation as well. For path: The C0 control percent-encode set are the C0 controls and all code points greater than U+007E (~). The fragment percent-encode set is the C0 control percent-encode set and U+0020 SPACE, U+0022 ("), U+003C (<), U+003E (>), and U+0060 (`). The path percent-encode set is the fragment percent-encode set and U+0023 (#), U+003F (?), U+007B ({), and U+007D (}). For the query: "If byte is less than 0x21 (!), greater than 0x7E (~), or is 0x22 ("), 0x23 (#), 0x3C (<), or 0x3E (>), append byte, percent encoded, to url’s query" +1 to allow this validation optionally.
Reopening as a high priority enhancement.
There is tomcat.util.http.parser.HttpParser.requestTargetAllow={|} for 8.5.x but it was dropped for 9.0.x. However it ignores these characters possibly everywhere. There should be flag that just allows for these characters to be allowed in params query.
Created attachment 35856 [details] HttpParser patch According to my understanding on that URL spec wording, here's what the HttpParser could look like. Oddities include '\\'. A system property would likely be used to configure.
From memory, the previous parser was lenient to the extent that it only checked for characters that were not permitted anywhere in the request target rather than checking each part individually. My initial thoughts were something pluggable on the connector that delegated most of the work to the HttpParser which would almost certainly need some new methods. My thinking hasn't got much beyond that though. I'm planning on putting together something to map request target part, characters allowed and specification as I want to get a clearer understanding of what is and isn't allowed where.
Created attachment 35860 [details] URL patch No problem if you have bigger plans, here's the last version of my patch.
I've been spending some time looking at whatwg.org URL spec. It appears, from https://bugzilla.mozilla.org/show_bug.cgi?id=1451347 that the browsers consider any string that passes the URL parser as valid and that the result of that parsing is passed to the server. The rules for writing URLs are different but they don;t appear to apply in this case since if they did, everything we expect to be encoded in a query string would be encoded. I've been through the path parsing defined by the whatwg.org URL spec and I believe it is the same as RFC7230 / RFC3986. I've also been over the query part of the whatwg.org URL spec and I now believe there are 8 characters we expect to be encoded that the whatwg.org URL spec does not. They are: "[", "]", "\", "^", "`", "{", "|", "}" I still need to check the authority part.
Given the spec test, I thought the request target was a bit different, allowing unencoded '\\', '|' and '^'. Same conclusion for the query part.
That wasn't my reading but it isn't the easiest spec to read. It is much closer to an implementation description than a spec. I decided to do some testing to see what the current behaviour is. Results here: https://cwiki.apache.org/confluence/display/TOMCAT/Encoding+and+URIs It looks like tightening up the path parsing to limit '[' and ']' would cause problems. I don't see where the whatwg.org spec allows this though. I didn't see the automatic conversion of '\' to '/' when I read the spec either. The query string behaviour is close to what I expected although IE is worse. The authority part of the whatwg.org spec is very relaxed for the userinfo compared to RFC 7230 / RFC 3986 but since Tomcat just ignores the userinfo I'm not too concerned about that. Sigh. I'm still thinking about the best way to approach this. If we have configurable parser instances they are going to have to be per connector as the URI is parsed before it can be mapped. I'm wondering if it is worth it. Different settings per connector is likely to be unusual but on the other hand it isn't that hard to implement. One advantage of validating each part separately is that the query string may have to allow some characters we really don't want in the path like '\'.
Adding extra code to Tomcat to account for specification non-compliance of other components is the wrong solution. The right solution is to open bugs against the non-compliant components. Unfortunately, in this case, those other components are all the major browser vendors and they do not accept that their behaviour is incorrect. I have yet to see a convincing argument as to why the browsers should not implement RFC 7230 and RFC 3986. Working around the specification non-compliant browser behaviour just encourages vendors to continue to ignore specifications and leads to greater interoperability issues in the long term. However, the alternative is to break lots of applications for lots of users. Therefore, it is with regret that I have implemented this enhancement for all currently supported Tomcat versions. Fixed in: - trunk for 9.0.8 onwards - 8.5.x for 8.5.31 onwards - 8.0.x for 8.0.52 onwards - 7.0.x for 7.0.87 onwards
This is not acceptable to be put into a bugfix version. We use square brackets for tag filtering on query parameters, causing a lot of our queries to be suddenly bounced back with a 400. Luckily this only occurred to us on our test environment, but if we had built our production at any point since this was released, because we only pin the major and minor versions, we would have had a crisis. At worst it should be a minor, and considering that it breaks everything (as was already commented in this thread, every major browser does not adhere to this standard. In my own testing, cURL, Lynx, Node and Perl do not either!) it should probably be a major. We should not have to carefully monitor each bugfix version of the myriad software in our stack to see if basic functionality has changed.
(In reply to Trevor Robinson from comment #14) > This is not acceptable to be put into a bugfix version. Um... this should UN-BREAK things, not break things. I don't think you want to wait until Tomcat 10 for your brackets to start working again, do you?
*** Bug 62398 has been marked as a duplicate of this bug. ***
*** Bug 62415 has been marked as a duplicate of this bug. ***
(In reply to Christopher Schultz from comment #15) > Um... this should UN-BREAK things, not break things. I don't think you want > to wait until Tomcat 10 for your brackets to start working again, do you? Hmm, I think I misinterpreted your comment #13. So if I have this right, this patch implements an option called 'USE_URL_LIVING_STANDARD' that allows the previous parsing behavior to continue, right? Does it default to true? From my perspective, we didn't change anything about our Tomcat setup, it just suddenly started rejecting our URLs that included square brackets when we upgraded from 8.5.29 to 8.5.31. If it wasn't your intention to change the behavior, then I'm sorry for my initial comment's strong reaction.
what is now the conclusion of this topic? since the topic has been resolved(?). The change is still breaking and not aligned with what the major browser vendors do as far as I understand?
The status is: - Browser behaviour is inconsistent between browsers https://cwiki.apache.org/confluence/display/TOMCAT/Encoding+and+URIs - Browser behaviour appears to be inconsistent with the spec the browsers are working to although it isn't the easiest spec to read - Tomcat implements RFC7230 / RFC3986 by default - Users can configure relaxedPathChars and/or relaxedQueryChars on the Connector as required
are there any security implications when relaxing a char like [ and ]? Because there are other specifications like JSON API making heavy use these two characters and if so, Tomcat just becomes no option anymore for such use cases.
Still unusable, tried a number of changes but still fails to work due to the RFC change. Makes Tomcat worthless to our current and future usage, surely this change should be turned off until browsers make the change.
(In reply to Matthew Davie from comment #22) > Still unusable, tried a number of changes but still fails to work due to the > RFC change. Makes Tomcat worthless to our current and future usage, surely > this change should be turned off until browsers make the change. We can examine your urls vs the new configuration options on the user list.
(In reply to remmeier from comment #21) > are there any security implications when relaxing a char like [ and ]? > Because there are other specifications like JSON API making heavy use these > two characters and if so, Tomcat just becomes no option anymore for such use > cases. Surely, you aren't using JSON in your URLs and query strings, are you? Le's move this discussion to the users' mailing list.
no, just good to have that statement in terms of security for other people because this thing here causes a gigantic amount of issues since it affects in our case two dozen application with some beeing a bit larger and quite security sensitive. All that for something nobody else seem to care much, browser vendors included.
The Patch provided in the attachments has a USE_URL_LIVING_STANDARD switch implemented. However, when I look at the source code of my downloaded Tomcat 8.5.32, I can not see this change (and my webapp using [ ] in the path does not work anymore...). Am I missing something ? How should I be able to get the old behavior, where [ and ] are working as part of the URL ?
(In reply to Markus Schlegel from comment #26) > The Patch provided in the attachments has a USE_URL_LIVING_STANDARD switch > implemented. > However, when I look at the source code of my downloaded Tomcat 8.5.32, I > can not see this change (and my webapp using [ ] in the path does not work > anymore...). > Am I missing something ? How should I be able to get the old behavior, where > [ and ] are working as part of the URL ? Comment #20. Final bullet point. For full details see http://tomcat.apache.org/tomcat-8.5-doc/config/http.html#Standard_Implementation
Have a 3 hours remote debugging session behind me, where we had to check why our app is not working anymore on one specific server. It turned out, that the server had a reverse-proxy (NGINX) in front of our app server. NGINX decoded the URL and sent the [] in decoded form to the Tomcat instance. Thanks for that headache. That wouldn't be a problem for me if that change would have be a minor release change, e.g. 8.5.30 to 8.6.x. But honestly I do not expect such incompatibility changes in a patch release.
how to fix error rfc 7230 and rfc 3986 on tomcat 9 I want to Allow special character "é" and "è" in to the URL I have already tested this but it does not work: relaxedQueryChars="[]|{}^\`èé+"<>" relaxedPathChars="[]|{}^\`èé+"<>"
Please do not reopen bugs to ask a support question. The Tomcat user mailing list is the right place for that.
(In reply to Christopher Schultz from comment #15) > (In reply to Trevor Robinson from comment #14) > > This is not acceptable to be put into a bugfix version. > > Um... this should UN-BREAK things, not break things. I don't think you want > to wait until Tomcat 10 for your brackets to start working again, do you? Any progress going on to support square brackets in tomcat?
(In reply to Mark Thomas from comment #27) > (In reply to Markus Schlegel from comment #26) > > The Patch provided in the attachments has a USE_URL_LIVING_STANDARD switch > > implemented. > > However, when I look at the source code of my downloaded Tomcat 8.5.32, I > > can not see this change (and my webapp using [ ] in the path does not work > > anymore...). > > Am I missing something ? How should I be able to get the old behavior, where > > [ and ] are working as part of the URL ? > > Comment #20. Final bullet point. For full details see > http://tomcat.apache.org/tomcat-8.5-doc/config/http. > html#Standard_Implementation But this is hackable.
Any further questions should be addressed to the users mailing list. http://tomcat.apache.org/lists.html