Bug 66627 - Regression due to MessageBytes refactoring
Summary: Regression due to MessageBytes refactoring
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 9
Classification: Unclassified
Component: Util (show other bugs)
Version: 9.0.71
Hardware: All All
: P2 regression (vote)
Target Milestone: -----
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-06-07 04:06 UTC by WJCarpenter
Modified: 2023-06-08 15:37 UTC (History)
0 users



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description WJCarpenter 2023-06-07 04:06:55 UTC
This commit, https://github.com/apache/tomcat/commit/10a1a6d46d952bab4dfde44c3c0de12b0330da79, that appeared in 9.0.71 made changes to MessageBytes.toString() that cause a serious regression under some circumstances. This is preventing us from upgrading to later Tomcat releases to pick up security fixes.

If the MessageByte object is of type T_BYTES or T_CHARS, it gets converted to type T_STR. Although there is nothing in the general contract of toString() that prohibits modifying the object, it's an unexpected side-effect. The JavaDoc for MessageBytes.getType() explicitly says it returns "the type of the original content", so the change breaks that contract.

But it's more serious than a mere disagreement with documentation. Some colleagues of mine were a few days ahead of me in investigating this problem (they were working with tomcat-embed-core and I was working woth Tomcat standalone). Most of this explanation is due to their research.

If something calls MessageBytes.toString(), fragile assumptions elsewhere can fall apart. For example, if you use a Java agent like OpenTelemetry, it intercepts every request in order to obtain the URL path for logging. CoyoteAdapter.postParseRequest() then makes a mistake because the MessageBytes object changed from type T_BYTES to type T_STR, and this assumption is no longer true:

/*
 * The URI is chars or String, and has been sent using an in-memory protocol handler. The following
 * assumptions are made: - req.requestURI() has been set to the 'original' non-decoded, non-normalized
 * URI - req.decodedURI() has been set to the decoded, normalized form of req.requestURI()
 */

The downstream result of that is a 404 response for every URL path.

Experimentally, I found that simply removing the type reassignment, as seen here, was enough to resolve the immediate issue. State tracking in the MessageBytes class is a bit complicated, and I have not looked carefully to see if this proposed fix has any undesired consequences.

    /**
     * Compute the string value.
     * @return the string
     */
    @Override
    public String toString() {
        switch (type) {
            case T_NULL:
            case T_STR:
                // No conversion required
                break;
            case T_BYTES:
                //type = T_STR;
                strValue = byteC.toString();
                break;
            case T_CHARS:
                //type = T_STR;
                strValue = charC.toString();
                break;
        }

        return strValue;
    }
Comment 1 Mark Thomas 2023-06-07 16:29:23 UTC
The assumption in CoyoteAdater is still valid - assuming no-one is calling Tomcat internals. As soon as components start accessing Tomcat internals then all sorts of things can go wrong unless those components do so very carefully.

Open Telemetry fixed this 9 months ago:
https://github.com/open-telemetry/opentelemetry-java-instrumentation/commit/e526338fc3e0a172b74f0ced5b76cc47947bea88

It is a fair point about the change in behaviour of getType().

Not changing the type on a conversion shouldn't have any functional impact but that needs checking and it might have performance impacts.
Comment 2 WJCarpenter 2023-06-07 16:31:20 UTC
I agree in general with the point about calling internals, but it is, after all, a documented API.

Thanks for the pointer about OpenTelemetry's fix. We'll pursue that.
Comment 3 Mark Thomas 2023-06-07 16:38:46 UTC
The API is also documented not to be stable. See the section on API stability in the RELEASE-NOTES.txt file that should be at the root of every Tomcat 9 distribution.

That said, I do plan to look at the feasibility of restoring the behaviour described in the getType() Javadoc.
Comment 4 Mark Thomas 2023-06-08 12:59:07 UTC
Fixed in:
- 11.0.x for 11.0.0-M8 onwards
- 10.1.x for 10.1.11 onwards
-  9.0.x for  9.0.77 onwards
-  8.5.x for  8.5.91 onwards
Comment 5 WJCarpenter 2023-06-08 15:37:15 UTC
Thanks for the quick turnaround, Mark!