Bug 65785 - HTTP/2.0 non US-ASCII header names should be rejected
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 9
Classification: Unclassified
Component: Connectors
Version: 9.0.x
Hardware: Macintosh Mac OS X 10.1
Importance: P2 normal
Target Milestone: -----
Assignee: Tomcat Developers Mailing List
 
Reported: 2022-01-06 11:35 UTC by Nils R
Modified: 2022-01-08 11:02 UTC



Description Nils R 2022-01-06 11:35:45 UTC
Issue summary
=============

Tomcat does not follow the HTTP/2.0 header name specification. Header names should be US-ASCII encoded, but Tomcat:
- does not check their encoding,
- accepts header names that are not US-ASCII encoded,
- corrupts non US-ASCII octets by suffixing them with "0xff", for example: "0xf0" -> "0xf0 0xff".

Expected behaviour: the HTTP/2.0 request is rejected as malformed with an HTTP 400 error code.

The specifications
==================

HTTP/2.0
--------

The HTTP/2.0 specification (https://datatracker.ietf.org/doc/html/rfc7540#section-8.1.2) says: 
> Just as in HTTP/1.x, header field names are strings of ASCII
> characters that are compared in a case-insensitive fashion.  However,
> header field names MUST be converted to lowercase prior to their
> encoding in HTTP/2.  A request or response containing uppercase
> header field names MUST be treated as malformed (Section 8.1.2.6).
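The uppercase rule quoted above can be sketched as a simple scan over the raw octets of a name. This is only an illustration: the class `Http2NameCase` and the method `hasUppercaseAscii` are hypothetical names, not part of Tomcat.

```java
import java.nio.charset.StandardCharsets;

final class Http2NameCase {

    // Hypothetical check: true if any octet is an uppercase ASCII letter (A-Z).
    static boolean hasUppercaseAscii(byte[] name) {
        for (byte b : name) {
            if (b >= 'A' && b <= 'Z') {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A name that was lowercased before encoding passes; a mixed-case one does not.
        System.out.println(hasUppercaseAscii("content-type".getBytes(StandardCharsets.US_ASCII)));
        System.out.println(hasUppercaseAscii("Content-Type".getBytes(StandardCharsets.US_ASCII)));
    }
}
```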

HTTP/1.1
--------

The HTTP/1.1 specification says:
> A recipient MUST parse an HTTP message as a sequence of octets in an
> encoding that is a superset of US-ASCII [USASCII].  Parsing an HTTP
> message as a stream of Unicode characters, without regard for the
> specific encoding, creates security vulnerabilities due to the
> varying ways that string processing libraries handle invalid
> multibyte character sequences that contain the octet LF (%x0A).

HPACK
-----

The HPACK specification (https://www.rfc-editor.org/rfc/rfc7541.html#section-1.1) says:
> The format defined in this specification treats a list of header
> fields as an ordered collection of name-value pairs that can include
> duplicate pairs.  Names and values are considered to be opaque
> sequences of octets, and the order of header fields is preserved
> after being compressed and decompressed.


Problem description
===================

Tomcat does not reject non-ASCII HTTP/2.0 header names, and its HPACK implementation casts the received bytes into chars, so that "0xf0" becomes "0xf0 0xff".
It looks like the HPACK decoding corrupts the header name, and the HTTP/2.0 implementation is then unable to reject this invalid header name (a US-ASCII character is coded on 7 bits, so the most significant bit of each octet MUST always be 0, and "0xf0" obviously has its most significant bit set to 1).

As the specification excerpts above show, the HPACK layer should treat its input as "opaque sequences of octets" and thus should not convert it directly to a String without knowing its encoding.
The HTTP/2.0 implementation should then verify that the header name bytes use only 7 bits (and thus can be safely decoded as ASCII characters).
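A minimal sketch of the two points above, not Tomcat's actual code: the class `HeaderNameBytes` and its methods are hypothetical. It shows that a direct byte-to-char cast in Java sign-extends an octet with its high bit set, which would be consistent with the "0xf0 0xff" and "0x00 0xf0" byte pairs reported here (that link is an assumption), and how a 7-bit check on the raw octets could look.

```java
import java.nio.charset.StandardCharsets;

final class HeaderNameBytes {

    // Hypothetical helper: true only if every octet fits in 7 bits (0x00-0x7F).
    static boolean isSevenBit(byte[] name) {
        for (byte b : name) {
            if (b < 0) { // Java bytes are signed, so a set high bit means a negative value
                return false;
            }
        }
        return true;
    }

    // Naive conversion: a direct byte-to-char cast sign-extends the octet.
    static char naiveCast(byte b) {
        return (char) b;
    }

    // Masked conversion: keeps the octet value in the range 0x00-0xFF.
    static char maskedCast(byte b) {
        return (char) (b & 0xFF);
    }

    public static void main(String[] args) {
        byte b = (byte) 0xF0;
        System.out.printf("naive cast:  0x%04X%n", (int) naiveCast(b));  // 0xFFF0
        System.out.printf("masked cast: 0x%04X%n", (int) maskedCast(b)); // 0x00F0
        System.out.println(isSevenBit("accept".getBytes(StandardCharsets.US_ASCII))); // true
        System.out.println(isSevenBit(new byte[] { b }));                             // false
    }
}
```

Running the 7-bit check on the octets before any conversion to chars avoids depending on how the conversion treats high-bit octets.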


HTTP/1.1 comparison
===================

Tomcat handles an invalid HTTP/1.1 header correctly, returning an HTTP 400 with this message: "The HTTP header line [0xf0: aa] does not conform to RFC 7230 and has been ignored."


Comparison with other products
==============================

- Netty (tested with 4.1.72) also handles it badly, but corrupts the header name "0xf0" into "0x00 0xf0" (which differs from what Tomcat does: "0xf0 0xff").

Reproducer
==========
- A fresh install of Tomcat (tested with 9, but I guess it will work with any version of Tomcat that handles HTTP/2.0)
- The HTTP/2.0 connector configured (`<UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol" />`)
- A simple servlet
- Run this command: `$ curl -v http://localhost:8080/static --http2-prior-knowledge -H "😱: aa"`

The request should be rejected with an HTTP 400 error because the header name is not US-ASCII encoded.
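To see where the "0xf0" octet in this report comes from, the sketch below prints the UTF-8 octets of the emoji used as the header name. It assumes curl sends the name as UTF-8 (which depends on the terminal encoding); the class `EmojiHeaderBytes` and the helper `toHex` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

final class EmojiHeaderBytes {

    // Renders octets as space-separated lowercase hex, e.g. "0xf0 0x9f".
    static String toHex(byte[] octets) {
        StringBuilder sb = new StringBuilder();
        for (byte b : octets) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = "\uD83D\uDE31"; // U+1F631, the emoji used as the header name
        // Prints: 0xf0 0x9f 0x98 0xb1
        System.out.println(toHex(name.getBytes(StandardCharsets.UTF_8)));
    }
}
```

All four octets have their most significant bit set, so none of them is a valid US-ASCII character.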
Comment 1 Mark Thomas 2022-01-07 23:08:43 UTC
Fixed in:
- 10.1.x for 10.1.0-M9 onwards
- 10.0.x for 10.0.15 onwards
- 9.0.x for 9.0.57 onwards
- 8.5.x for 8.5.74 onwards
Comment 2 Nils R 2022-01-08 11:02:29 UTC
Thanks a lot for this quick answer (and fix!)

Since I had a hard time finding the changes on GitHub, and in case someone is interested in reading them, here is the associated commit: https://github.com/apache/tomcat/commit/d909c709b639e9670edce2581293afb9626d7b5e