Summary: | Non-US-ASCII letters in url-mapping | ||
---|---|---|---|
Product: | Tomcat 8 | Reporter: | Martin Nybo Andersen <tweek> |
Component: | Util | Assignee: | Tomcat Developers Mailing List <dev> |
Status: | RESOLVED FIXED | ||
Severity: | normal | ||
Priority: | P2 | ||
Version: | 8.5.15 | ||
Target Milestone: | ---- | ||
Hardware: | PC | ||
OS: | Linux | ||
Attachments: | Servlet that logs url-mappings (maven project) |
Description
Martin Nybo Andersen
2017-07-27 08:06:45 UTC
The requirement that the URL patterns in web.xml must be decoded dates back to Servlet 2.3 (see r285186). In more recent times this has been tweaked so that the charset used to do the decoding is consistent with the charset used for the web.xml file (see r1758423).

However, the expectation from the Java EE XSD is that:
<quote>
This pattern is assumed to be in URL-decoded form and must not contain CR(#xD) or LF(#xA)
</quote>

The Servlet specification also references RFC 3986, although it doesn't offer a view on where that RFC applies and where it does not. Those do not appear to be entirely consistent.

Given the above, it is also worth noting the rare edge cases where a literal '*' or '%' needs to be used in the url-pattern.

So, where to go from here? My current thinking is that Tomcat needs to assume the url-patterns may be partially decoded, i.e. they may contain characters not permitted by RFC 3986 and they may also contain %nn sequences that need to be decoded. Therefore, r1793440 needs to be reverted / rewritten on that basis.

I'm going to start work in this direction but if folks disagree with my analysis or think I have missed one or more important points, please do speak up.

Interesting analysis.

A servlet-mapping can be created by a tool. E.g. JspC:
https://svn.apache.org/viewvc/tomcat/trunk/java/org/apache/jasper/JspC.java?revision=1800816&view=markup#l1092
o.a.j.JspC.generateWebMapping()

The encoding of the generated web.xml file is configurable ("-webxmlencoding" switch), but the pattern itself is simply written as
> mappingout.write(file.replace('\\', '/'));

If we are to require that the url-mapping pattern is URL-encoded, JspC should be adjusted for that.

Hi Mark,

If the 'pattern is assumed to be in URL-decoded form', why decode it again?

Kind regards,
Martin

The requirement that the URL patterns in web.xml must be decoded dates back to Servlet 2.3 (see r285186).

Thanks for the report. This has been fixed in trunk (for 9.0.0.M26) and 8.5.x (for 8.5.20 onwards).
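
To illustrate the "partially decoded" assumption discussed above, here is a minimal sketch in Java. It is not the code that went into Tomcat. The class name `UrlPatternDecodeSketch` and its `decode` method are hypothetical, and the strict rejection of malformed %nn sequences is an assumption of this sketch. The idea is that any %nn sequences are decoded with the charset of the web.xml file, while characters that are already decoded (including non-US-ASCII letters) pass through unchanged.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

/** Hypothetical helper for illustration only; not Tomcat's implementation. */
final class UrlPatternDecodeSketch {

    private UrlPatternDecodeSketch() {
    }

    /**
     * Decodes %nn sequences in a url-pattern using the given charset and
     * copies every other character, including non-ASCII letters, unchanged.
     * A literal '%' therefore has to be written as %25 in the pattern.
     */
    static String decode(String pattern, Charset charset) {
        StringBuilder out = new StringBuilder(pattern.length());
        // Consecutive %nn bytes are buffered so that multi-byte characters
        // (e.g. UTF-8) are decoded as a unit.
        ByteArrayOutputStream pending = new ByteArrayOutputStream();
        int i = 0;
        while (i < pattern.length()) {
            char c = pattern.charAt(i);
            if (c == '%') {
                if (i + 2 >= pattern.length()) {
                    throw new IllegalArgumentException("Truncated %nn sequence in: " + pattern);
                }
                int hi = hexValue(pattern.charAt(i + 1));
                int lo = hexValue(pattern.charAt(i + 2));
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Invalid %nn sequence in: " + pattern);
                }
                pending.write((hi << 4) | lo);
                i += 3;
            } else {
                flush(pending, out, charset);
                out.append(c);
                i++;
            }
        }
        flush(pending, out, charset);
        return out.toString();
    }

    /** Appends any buffered bytes to the output as text in the given charset. */
    private static void flush(ByteArrayOutputStream pending, StringBuilder out, Charset charset) {
        if (pending.size() > 0) {
            out.append(new String(pending.toByteArray(), charset));
            pending.reset();
        }
    }

    /** Returns the value of an ASCII hex digit, or -1 if c is not one. */
    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') {
            return c - '0';
        }
        if (c >= 'a' && c <= 'f') {
            return c - 'a' + 10;
        }
        if (c >= 'A' && c <= 'F') {
            return c - 'A' + 10;
        }
        return -1;
    }
}
```

With this sketch, the patterns `/caf%C3%A9/*` and `/café/*` both yield `/café/*` when the charset is UTF-8. `java.net.URLDecoder` is deliberately not used here because it also converts '+' to a space, which is form-encoding behaviour rather than path decoding.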