Our high-volume, latency-sensitive application spends approximately 0.16% of cpu executing org.apache.tomcat.util.buf.ByteChunk.toString. We see only two direct callers: org.apache.tomcat.util.buf.MessageBytes.toString() and org.apache.tomcat.util.http.Parameters.processParameters(). Aggregate cost is split evenly between the two. Call data shows many different routes trigger both these paths. Examination of the specific strings being looked up shows: 1. 40% Redundant parsing of key fields on org.apache.catalina.connector.Request: method, queryString, requestURI, headerNames, content type. 2. 10% Redundant lookup/iteration of header names 3. 50% Redundant parsing of request parameters, triggered from multiple paths within the JSP (see stack traces below). Key portions of the stack traces are included below. This code can be substantially optimized by locally caching the translated fields. Local tests of modifications to the simplest case (Request.getMethod()) confirm improvement. org.apache.tomcat.util.buf.StringCache.toString org.apache.tomcat.util.buf.ByteChunk.toString org.apache.tomcat.util.buf.MessageBytes.toString org.apache.catalina.connector.Request.getMethod org.apache.catalina.connector.RequestFacade.getMethod javax.servlet.http.HttpServletRequestWrapper.getMethod javax.servlet.http.HttpServletRequestWrapper.getMethod org.apache.jsp.WEB_002dINF.<proprietary>.common_jsp._jspService org.apache.tomcat.util.buf.StringCache.toString org.apache.tomcat.util.buf.ByteChunk.toString org.apache.tomcat.util.http.Parameters.processParameters org.apache.tomcat.util.http.Parameters.processParameters org.apache.tomcat.util.http.Parameters.handleQueryParameters org.apache.catalina.core.ApplicationHttpRequest.mergeParameters org.apache.catalina.core.ApplicationHttpRequest.parseParameters org.apache.catalina.core.ApplicationHttpRequest.getParameter javax.servlet.ServletRequestWrapper.getParameter javax.servlet.ServletRequestWrapper.getParameter javax.servlet.jsp.el.ImplicitObjectELResolver$ScopeManager$7.getAttribute javax.servlet.jsp.el.ImplicitObjectELResolver$ScopeManager$7.getAttribute javax.servlet.jsp.el.ImplicitObjectELResolver$ScopeMap.get javax.el.MapELResolver.getValue org.apache.jasper.el.JasperELResolver.getValue org.apache.el.parser.AstValue.getValue org.apache.el.ValueExpressionImpl.getValue org.apache.jasper.el.JspValueExpression.getValue org.apache.jsp.WEB_002dINF.<proprietary>_jsp._jspx_meth_c_005fset_005f1 Ref: <c:set var = "featureName" value= "${ param.featureName }"/> org.apache.tomcat.util.buf.StringCache.toString org.apache.tomcat.util.buf.ByteChunk.toString org.apache.tomcat.util.http.Parameters.processParameters org.apache.tomcat.util.http.Parameters.processParameters org.apache.tomcat.util.http.Parameters.handleQueryParameters org.apache.catalina.connector.Request.parseParameters org.apache.catalina.connector.Request.getParameterNames org.apache.catalina.connector.Request.getParameterMap org.apache.catalina.connector.RequestFacade.getParameterMap <proprietary>Filter.doFilter()
Are you sure about the request parameters being parsed multiple times? They should only be parsed once per request. Is org.apache.tomcat.util.buf.StringCache not caching the Strings or is just that local caching has much lower latency (which wouldn't surprise me)? Caching the String in ByteChunk.toString() is an option. I did look at that when I last worked on this code but it raises all sorts of complications around cache invalidation as the byte array is directly accessible.
> Are you sure about the request parameters being parsed multiple times? They should only be parsed once per request. Yes - this appears to be triggered when an extra param is added by a JSP include, for example: <jsp:include page="/myExample.jsp" > <jsp:param name="type" value="image" /> <jsp:param name="index" value="${status.index}" /> </jsp:include> ApplicationDispatcher.wrapRequest is invoked and creates a new ApplicationHttpRequest with un-parsed parameters. If the included JSP ever checks the parameters, the full set of parameters is parsed: the original plus any new ones added by the <jsp:include>. Nested JSP includes can trigger reparsing once per nest. Relevant stack trace: org.apache.catalina.core.ApplicationDispatcher.org.apache.catalina.core.ApplicationDispatcher.doInclude org.apache.catalina.core.ApplicationDispatcher.doInclude org.apache.catalina.core.ApplicationDispatcher.include org.apache.jasper.runtime.JspRuntimeLibrary.include > Is org.apache.tomcat.util.buf.StringCache not caching the Strings or is just that local caching has much lower latency (which wouldn't surprise me)? Lower latency. 70% of StringCache.toString() is spent in org.apache.tomcat.util.buf.StringCache.findClosest() which does appear to function as intended. > Caching the String in ByteChunk.toString() is an option. I did look at that when I last worked on this code but it raises all sorts of complications around cache invalidation as the byte array is directly accessible. Makes sense. :( This may be easier for org.apache.coyote.Request, where a reference to MessageBytes could be replaced by a reference to String (possibly lazy-loaded).
Thanks for the additional info. I'll try and address each part in turn. 1. Caching the key fields is low cost as it is just a reference to an existing String. I think that is a better solution that trying to cache in MessageBytes/ByteChunk/CharChunk due to the cache invalidation complexities. I think we can do this quickly. 2. Could you provide examples. If there is any low-hanging fruit similar to1 that should be easy to address. 3. Ah. Got it. That is going to be a little trickier to address. I suspect some refactoring will be required as we have to take account of the parameter merging rules. Not sure how easy that will be. My plan at this stage is to address the low hanging fruit and then switch this to an enhancement (strictly this is all an enhancement) for the parameter refactoring.
A quick update: 1. getMethod() is already effectively cached due to use of MessageBytes.toStringType() getQueryString() could be cached but there is a potential performance issue. If applications parse the query string manually or use getParameter() and friends all is well. If an application uses both, using MessageBytes.toStringType() could trigger bytes -> String -> bytes if getQueryString() is called before getParameter() and friends. I'm minded to leave this one alone for now. requestURI(), contentType() and headerNames can all switch to MessageBytes.toStringType(). I'll do that shortly. 2. Waiting for examples. 3. Cannot reproduce. I've reviewed the code and manually debugged an included JSP. The original query string is parsed once and the additional query string is parsed once.
Simple things fixed in: - 11.0.x for 11.0.0-M17 onwards - 10.1.x for 10.1.19 onwards - 9.0.x for 9.0.86 onwards - 8.5.x for 8.5.99 onwards Switching to NEEDINFO for the remaining issues.
Thank you for the quick turnaround! Additional data: Referring to #1: > getMethod() is already effectively cached due to use of > MessageBytes.toStringType() Our data is clear that this is re-parsing each time. The call occurs at the beginning of each JSP at the very beginning of the generated _jspService() method: final java.lang.String _jspx_method = request.getMethod(); Referring to #2: the code triggering this is below. I can't clearly determine whether the slow enum comes from request.getHeaders() or request.getHeaderNames(). HttpServletRequest request = context.getRequest(); Enumeration<String> nameEnum = request.getHeaderNames(); while (nameEnum.hasMoreElements()) { String name = nameEnum.nextElement(); List<String> headerValues = new LinkedList<String>(); Enumeration<String> valueEnum = request.getHeaders(name); while (valueEnum.hasMoreElements()) { headerValues.add(valueEnum.nextElement()); } headerMap.put(name, Collections.unmodifiableList(headerValues)); } Referring to #3: > Cannot reproduce. I don't immediately see another trigger for this case, but will keep looking. The behavior occurs throughout our app but I'll also compare with some others.
Updating status
The caching for getMethod() isn't present in 9.0.81. It is present from 9.0.83 onwards. I've added caching of the header values as well as the names.
Regarding #1: apologies, I was able to track down an invokeinterface that confused our primary tool. Specifically, a class calling HttpServletRequest.getMethod() on an instance of HttpServletRequestWrapper that could wrap: a) an instance of org.apache.catalina.connector.Request, b) another HttpServletRequestWrapper, or c) one of multiple custom subclasses. I agree that the value of Request.getMethod() is cached on MessageBytes and I'll fix the issue for our application by implementing local caching on a custom subclass. Regarding #3: I suspect the same problem as #1, and am certain that the same custom class can be modified to solve this problem. No Tomcat issues beyond what you've already fixed. Sorry for the noise!
Thanks for the update. Marking this as fixed based on that update.
These changes reached prod and eliminated the respective profiles.