|Summary:||CoyoteAdapter is breaking path info|
|Product:||Tomcat 5||Reporter:||John Trollinger <jakarta>|
|Component:||Connector:Coyote||Assignee:||Tomcat Developers Mailing List <dev>|
Description John Trollinger 2003-11-26 13:57:58 UTC
The CoyoteAdapter removes valid path info in the postParseRequest() method. take the following url http://localhost/appname/servlet-name/extra;path/info;here/hi.jsp when you call getPathInfo() on the request it returns /extra this is because the CoyoteAdapters postParseRequest() method finds the first ';' and truncates the path. According to RFC1738 this is a valid path. The class org.apache.tomcat.util.http.mapper.Mapper handles the path correctly but is never given the right path from the CoyoteAdapter. thanks, John
Comment 1 Remy Maucherat 2003-11-26 19:42:34 UTC
I do not agree with your analysis. ';' is a reserved character, and used to define parameters. Please do not reopen this report.
Comment 2 John Trollinger 2003-11-26 20:22:26 UTC
Remy, Yes the ; is for parameters, but parameters are allowed in the path and are valid. Why would it be correct behavior for tomcat to not send all the path info because the path had parameters in it? John
Comment 3 Remy Maucherat 2003-11-27 00:18:04 UTC
The actually relevant RFC is 2396. I think the parameters do not belong to any of the processed paths returned by Tomcat. However, I've read in the RFC that a parameter applies to a path segment (!) as separated by '/', so the unparameterized URL in your example should likely be: http://localhost/appname/servlet-name/extra/info/hi.jsp. I'm not to sure about my understanding of the RFC, and I don't plan to fix this issue (clean patch welcome, I suppose, though). If you want to use the parameters, you can get the original request URI, as usual.
Comment 4 John Trollinger 2003-12-03 17:43:24 UTC
Remy, Is the way tomcat is now spec complient? According to the spec PathInfo: The part of the request path that is not part of the Context Path or the Servlet Path. It is either null if there is no extra path, or is a string with a leading ‘/’. so clearly if my url was http://localhost/appname/servletname/extra;a/path;b/info;c then if /appname is my context and /servletname is my servlet path then /extra;a/path;b/info;c should be my path info according to the spec (see SRV.4.4 and table 2) so if I submit a patch for this would you except this behavior? John
Comment 5 Remy Maucherat 2003-12-03 17:56:26 UTC
I say no. What if there are path parameters elsewhere ? You remove them ? It doesn't make any sense.
Comment 6 John Trollinger 2003-12-03 18:04:58 UTC
I dont know how path parms are used normaly (personaly I have never seen them used except in our app) so I am not sure how they should be treated, but following the spec it seems the should be included in path info. I would like to get some feedback from others (hopefuly someone else out there uses them)
Comment 7 Remy Maucherat 2003-12-03 18:23:10 UTC
getRequestURI will return the full URI, right (including all the parameters) ? I think if you need them, it is more consistent to work from that. Since parameters will need to be stripped out for mapping (you find the path info only after the mapping is done, so you "can't" (= hard) add back the parameters reliably), I don't see how it can be made to work automagically.
Comment 8 John Trollinger 2003-12-03 18:26:56 UTC
Well the biggest problem right now is http://localhost/appname/servletname/extra;a/path;b/info;c then if /appname is my context and /servletname is my servlet path right now tomcat returns only /extra which is not right no matter how you look at it
Comment 9 John Trollinger 2003-12-04 16:05:52 UTC
Remy, I am trying to make a patch for this and have a question I hope you can answer. in the CoyoteAdapter the MessageBytes class is used. This class holds 2 arrays, one of char and one of byte. The CoyoteAdapter class modifies the byte array and not the char array so the data in the MessageBytes class is out of sync. Is this desired effect or should these arrays hold the same data. Looking at the comment on line 297 of CoyoteAdapter (rev. 1.14) it thinks that the jsessionid has been removed but it has only been removed from the byte not the char (from the methods MessageBytes.getByteChunk() and MessageBytes.getByteChunk()) I think the char and byte inside MessageBytes should be the same. any help would be great. also according the the spec requestURI = contextPath + servletPath + pathInfo so if the requestURI has the path parms the pathInfo HAS to john
Comment 10 Remy Maucherat 2003-12-04 17:28:26 UTC
I don't agree, they simply didn't think about path parameters (I didn't know you could put them in the middle of the path). Since path parameters won't be mapped, what if there are parameters in the servlet path ? I'd like to point out also that the session id parameter, if present, is always removed from the request URI. That could possibly imply that other parameters should also be removed everywhere. So I don't really agree with handling parameters at this time, because it's a mess. As for your question, at this time of the processing, the URI has been converted to chars, so use the CharChunk, as the current algorithm is doing.
Comment 11 John Trollinger 2003-12-04 17:45:21 UTC
Do you not want a patch for this at all, or do you want one that just rips out the parms and returns only the path?
Comment 12 John Trollinger 2003-12-04 17:49:37 UTC
> As for your question, at this time of the processing, the URI has been > converted to chars, so use the CharChunk, as the current algorithm is doing. Well as it is now the jsessionid is only being remove from the byte array (ByteChunk) the only reason you do not see the jsessionid in the path is the code finds the first ';' and drops it and everything after it.
Comment 13 Remy Maucherat 2003-12-04 17:54:25 UTC
At this point, removing path parameters seems to me the way to go, but I don't know ... I'll mark this as remind.
Comment 14 John Trollinger 2003-12-04 17:55:55 UTC
One last thing, as for you question about context and servlet path, having parms in them does not work at all. (it either does not find the context, or it does not find the servlet). How do you get clarification on the spec for these type of things (trust me when I say I dont know why anyone would use these either, but I inherited this code that does) John
Comment 15 Amy Roh 2003-12-10 22:40:24 UTC
Servlet spec folks talked about this (parameters in path and whether getpathinfo should return them or not), and they couldn't get the consensus. Most people seem to like that getPathInfo should NOT include the parameters, but we haven't had a thorough discussion and that's listed as an item for the next version of the spec. So, for now, it's container-specific but the servlet spec lead recommends to remove them.
Comment 16 Remy Maucherat 2003-12-11 18:39:18 UTC
Ok, thanks, but should this URI: /appname/servlet-name/extra;path/info;here/hi.jsp Become: /appname/servlet-name/extra/info/hi.jsp Or: /appname/servlet-name/extra I think the URI RFC (that nobody cares about in the HTTP world) specifies the first one :) Tomcat currently does the second one. I'm reopening the bug, as we'll get a resolution, apparently.
Comment 17 Amy Roh 2003-12-11 19:44:43 UTC
I don't think the spec has clearly resolved this issue. Right now, it somewhat implies that the return value of getPathInfo() include the things after ";". SRV.4.1 "Path parameters that are part of a GET request are not exposed by these APIs.(getParameter, etc.) They must be parsed from the String values returned by the getRequestURI method or the getPathInfo method." However, practically, the path parameter can be placed ANYWHERE in the URL and it is confusing that getPathInfo returns them. I think returning "/appnama/servlet-name/extra" is fine. As long as getRequestURI contains the path params, I don't think it violates the spec.
Comment 18 John Trollinger 2003-12-15 12:34:00 UTC
Should this work ? <servlet-mapping> <servlet-name>MyServlet</servlet-name> <url-pattern>/test/path/here</url-pattern> </servlet-mapping> invoke servlet -> http://localhost/context/test;a/path;b/here or is the url-pattern required to have the path parm in it (same with a context having a parm in it) because neither of these work now.
Comment 19 Remy Maucherat 2003-12-15 18:22:59 UTC
What Amy said implies: no, this doesn't work.
Comment 20 John Trollinger 2003-12-15 19:08:55 UTC
Amy, I am confused.. the spec says that I should be able to get the parm from the getRequestURI or getPathInfo. are you reading this as the contain must provide it in just one of these methods? It seems to me the spec is saying I should be able to retieve it from either method, not just one (which ever your vendor supports) SRV.4.1 "Path parameters that are part of a GET request are not exposed by these APIs.(getParameter, etc.) They must be parsed from the String values returned by the getRequestURI method or the getPathInfo method."
Comment 21 Amy Roh 2003-12-15 23:08:00 UTC
Two things: 1. getPathInfo should/shouldn't return ";blah" I think it's ok to leave it as it is for now since the spec is most likely going to be changed in the next version. I asked 154EG spec lead to consider removing the wording of "getPathInfo() method" in SRV.4.1 and he said they address this issue again and probably do remove as some members already mentioned that. 2. matching Should this work ? <servlet-mapping> <servlet-name>MyServlet</servlet-name> <url-pattern>/test/path/here</url-pattern> </servlet-mapping> invoke servlet -> http://localhost/context/test;a/path;b/here or is the url-pattern required to have the path parm in it (same with a context having a parm in it) because neither of these work now. ------------------------ For the above comment, SRV.11.1 says: "The path used for mapping to a servlet is the request URL from the request object minus the context path and the path parameters." That implies the deployment descriptor above should work and the <url-pattern> should not contain the path parameters. However, the spec doesn't talk about the case when the path params are within the context path. So, it's up to the container, and I would think that we can apply the same rule for the context path too.