Bug 68089 - ApplicationHttpRequest.getSpecial() and removeSpecial() use linear scans
Summary: ApplicationHttpRequest.getSpecial() and removeSpecial() use linear scans
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 9
Classification: Unclassified
Component: Catalina (show other bugs)
Version: 9.0.x
Hardware: All All
: P2 normal (vote)
Target Milestone: -----
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-11-06 20:02 UTC by John Engebretson
Modified: 2024-03-06 21:56 UTC (History)
1 user (show)



Attachments
Speed test (9.62 KB, text/plain)
2024-02-07 19:15 UTC, John Engebretson
Details
Support class for the speed test (1.48 KB, text/plain)
2024-02-07 19:15 UTC, John Engebretson
Details

Note You need to log in before you can comment on or make changes to this bug.
Description John Engebretson 2023-11-06 20:02:33 UTC
A high-volume, latency-sensitive application shows a large amount of String.equals() calls buried inside tomcat processing.  Detailed examination shows that much of it comes from ApplicationHttpRequest.getSpecial() and removeSpecial(), which are called once per call to request.setAttribute() or request.removeAttribute().  In our application these calls are from certain Filters or, more commonly, AstExpressions, especially those which are not otherwise resolved.

The relevant code is:
    protected static final String specials[] =
            { ... };
    protected int getSpecial(String name) {
        for (int i = 0; i < specials.length; i++) {
            if (specials[i].equals(name)) {
                return i;
            }
        }
        return -1;
    }

The array contains 12 specials, so each call to request.getAttribute(), etc. performs 12 String comparisons.  The occasional search of an actual special will reduce that, but very few of our calls match.

Suggestion: use a HashMap<String, Integer> or enums to achieve a similar purpose.  Comparable performance data may be found at https://bz.apache.org/bugzilla/show_bug.cgi?id=67080.

These scans are largely inlined into calling methods so I cannot reliably estimate the cpu or clock impact, however the sheer number of calls is impressive.
Comment 1 Remy Maucherat 2023-11-07 09:04:03 UTC
> These scans are largely inlined into calling methods so I cannot reliably
> estimate the cpu or clock impact, however the sheer number of calls is
> impressive.

I understand the concern, but it sounds doubtful this is actually a problem given the nature of the checks. Please provide actual data, also demonstrating the proposed technique would provide any performance benefits.
Comment 2 Mark Thomas 2023-11-07 12:55:12 UTC
A crude performance test shows that using a map makes the request attribute lookup about 2.5x faster.

I should have a patch for this once I have got my head around how AttributeNamesEnumerator works.
Comment 3 John Engebretson 2023-11-07 15:51:17 UTC
Good idea to measure ApplicationHttpRequest.getAttribute()!  Our production environment shows 0.27% cpu spent in that method, of which 0.1% cpu is spent in method-local time (including the inlined array scan).

ApplicationHttpRequest.setAttribute and removeAttribute() are also present but 10x smaller.
Comment 4 Mark Thomas 2023-11-07 16:43:43 UTC
Fixed in:
- 11.0.x for 11.0.0-M14 onwards
- 10.1.x for 10.1.16 onwards
-  9.0.x for  9.0.83 onwards
-  8.5.x for  8.5.96 onwards
Comment 5 John Engebretson 2023-11-13 20:25:52 UTC
For posterity: this issue also dominates the runtime of org.apache.catalina.core.ApplicationHttpRequest$AttributeNamesEnumerator.findNext().
Comment 6 John Engebretson 2024-02-07 19:14:55 UTC
Production data confirms the behavior change but shows minimal actual performance improvement - it appears the extra work of hashing the string and doing the HashMap lookup is comparable to the wasted effort of 12 string comparisons.

I refined the approach by checking whether the attribute name was shorter than the shortest special (length 30) and that eliminates the search cost for the vast majority of calls.  The attached speed test compares the cost of searching for attributes with long vs. short names, and the current implementation of getAttribute() is indeed slightly faster for short names (520ms vs. 540ms on my PC).

With the length check, the short names are orders of magnitudes faster.  Presumably the JIT optimized it close to a no-op.

Checking for length 30 is an easy modification, but not the most maintainable.  Perhaps modify the static initializer to calculate the length of the shortest special?
Comment 7 John Engebretson 2024-02-07 19:15:20 UTC
Created attachment 39574 [details]
Speed test
Comment 8 John Engebretson 2024-02-07 19:15:39 UTC
Created attachment 39575 [details]
Support class for the speed test
Comment 9 Mark Thomas 2024-02-08 14:37:26 UTC
I did look at using !startsWith("jakarta") but that is ~2x slower than a length check.

I've coded up the length check and am just running some tests locally.
Comment 10 Mark Thomas 2024-02-12 14:11:58 UTC
Fixed in:
- 11.0.x for 11.0.0-M17 onwards
- 10.1.x for 10.1.19 onwards
-  9.0.x for  9.0.86 onwards
-  8.5.x for  8.5.99 onwards
Comment 11 Christopher Schultz 2024-02-12 14:57:30 UTC
(In reply to Mark Thomas from comment #9)
> I did look at using !startsWith("jakarta") but that is ~2x slower than a
> length check.
> 
> I've coded up the length check and am just running some tests locally.

What about charAt(0) == 'j'?

I suppose that's the second thing String.equals() will do after checking the length, so maybe it won't help at all.
Comment 12 John Engebretson 2024-02-12 15:00:56 UTC
String compaction made charAt() more expensive, thanks to checks on format and range.  String.length() remains an integer check.

That said, charAt() isn't that bad, so it's an option if functionally required.
Comment 13 John Engebretson 2024-03-06 21:56:24 UTC
This changed reached prod and improved performance.  Low impact.