Bug 64974

Summary: Tomcat losing HTTP pipeline requests if asking for available bytes
Product: Tomcat 9
Reporter: Sergey Mashkov <cy6erGn0m>
Component: Catalina
Assignee: Tomcat Developers Mailing List <dev>
Status: RESOLVED FIXED    
Severity: normal    
Priority: P2    
Version: 9.0.38   
Target Milestone: -----   
Hardware: PC   
OS: Mac OS X 10.1   
Attachments: Just before the fill will discard requests (in debugger)

Description Sergey Mashkov 2020-12-10 15:41:40 UTC
Created attachment 37599 [details]
Just before the fill will discard requests (in debugger)

Starting from 9.0.39, the Tomcat server loses pipelined HTTP requests.

The scenario is the following:

1. A client connects and sends a single packet containing two HTTP requests.
2. The server accepts the connection, reads the bytes, parses the first request, does auxiliary tasks, and notifies a servlet by calling the service method. At this point, Http11InputBuffer contains both requests, but only the first has been processed, so the position points to the beginning of the second request (which has not yet been processed/parsed).
3. The servlet starts async processing, starts a new thread, and then sets up a read listener (see the sketch after this list).
4. The setReadListener invocation leads to the following call chain: setReadListener -> CoyoteInputStream.setReadListener -> InputBuffer.setReadListener -> InputBuffer.isReady -> Request.action (NB_READ_INTEREST) -> AbstractProcessor.action -> AbstractProcessor.isReadyForRead -> Http11Processor.available -> Http11InputBuffer.available -> Http11InputBuffer.fill
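For reference, a minimal servlet following steps 3 and 4 could look like the sketch below. This is only an illustration under assumed names (PipelineServlet, the /pipeline mapping); it is not the actual reproducer:

import java.io.IOException;
import javax.servlet.AsyncContext;
import javax.servlet.ReadListener;
import javax.servlet.ServletInputStream;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical servlet reproducing the scenario above; names are illustrative.
@WebServlet(urlPatterns = "/pipeline", asyncSupported = true)
public class PipelineServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        // Step 3: start async processing ...
        AsyncContext ctx = req.startAsync();
        ServletInputStream in = req.getInputStream();

        // ... then, from a new thread, register a read listener.
        // setReadListener() triggers the isReady()/available()/fill() chain
        // described in step 4.
        new Thread(() -> {
            try {
                in.setReadListener(new ReadListener() {
                    @Override
                    public void onDataAvailable() throws IOException {
                        byte[] buf = new byte[1024];
                        while (in.isReady() && in.read(buf) != -1) {
                            // consume request body, if any
                        }
                    }

                    @Override
                    public void onAllDataRead() throws IOException {
                        ctx.getResponse().getWriter().print("OK");
                        ctx.complete();
                    }

                    @Override
                    public void onError(Throwable t) {
                        ctx.complete();
                    }
                });
            } catch (Exception e) {
                ctx.complete();
            }
        }).start();
    }
}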

The fill function checks the "parsingHeader" flag and takes the alternative branch. The byte buffer still contains both requests, and the position points to the beginning of the second request. However, the alternative branch sets both position and limit to the same value, losing the correct end position. After that, the fill function does some preparation and invokes socket.read, which either overwrites the second request or reads nothing; either way the second request is lost forever because its end position was discarded.

See the attached screenshot of this moment. 
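To make the position/limit problem concrete, here is a standalone java.nio.ByteBuffer illustration; it is deliberately simplified and is not the actual Http11InputBuffer code:

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Standalone illustration (not Tomcat code) of how resetting limit to the
// current position throws away the still-unread pipelined request.
public class LostLimitDemo {
    public static void main(String[] args) {
        byte[] twoRequests =
                ("GET /a HTTP/1.1\r\nHost: h\r\n\r\n" +
                 "GET /b HTTP/1.1\r\nHost: h\r\n\r\n")
                        .getBytes(StandardCharsets.ISO_8859_1);

        ByteBuffer bb = ByteBuffer.wrap(twoRequests);
        int endOfFirst = new String(twoRequests, StandardCharsets.ISO_8859_1)
                .indexOf("\r\n\r\n") + 4;

        // After parsing the first request: position points at the second
        // request, limit still marks the end of the received bytes.
        bb.position(endOfFirst);
        System.out.println("remaining before reset: " + bb.remaining()); // bytes of request 2

        // What the problematic branch effectively does: position and limit end
        // up at the same value, so the end of the second request is lost and a
        // subsequent socket read can overwrite it.
        bb.limit(bb.position());
        System.out.println("remaining after reset:  " + bb.remaining()); // 0
    }
}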

It looks like the code leading to the issue has been there for a long time, but due to some changes in 9.0.39 it started to manifest. In any case, the fill function should not simply discard the limit position; it should append bytes after the limit, regardless of the "end" value, which points to the last parsed request and is not correct here.
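For illustration only, a fill that preserves pipelined data could append newly read bytes after the existing limit and then restore the parse position, along the lines of the sketch below (this assumes a plain NIO channel and is not the patch that was eventually committed):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

// Sketch only: append newly read bytes after the current limit instead of
// discarding it, then restore the parse position.
final class AppendingFill {
    static int fillPreservingPipelinedData(ReadableByteChannel ch, ByteBuffer bb)
            throws IOException {
        int parsePos = bb.position();   // start of the not-yet-parsed request
        int oldLimit = bb.limit();      // end of bytes already received

        bb.position(oldLimit);          // write after the existing data
        bb.limit(bb.capacity());
        int n = ch.read(bb);            // may be 0 or -1; existing bytes survive

        bb.limit(bb.position());        // new end of valid data
        bb.position(parsePos);          // resume parsing where we left off
        return n;
    }
}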
Comment 1 Remy Maucherat 2020-12-10 16:02:16 UTC
What is the point of trying to find edge cases? You have no content for these kind of requests, so it is never going to do anything useful.
Comment 2 Sergey Mashkov 2020-12-10 16:45:42 UTC
I don't see your point. "You have no content for these kind of requests" - there are two valid requests. Two valid requests should produce two responses, or we should drop the connection after processing the first of them. Silently discarding either of them breaks the HTTP pipeline, and all subsequent requests will get the wrong responses.
Comment 3 Sergey Mashkov 2020-12-10 16:54:36 UTC
There is no restriction requiring requests to be sent separately. A client may send requests at any time and should not have to wait for responses before sending the next one: that is the whole idea of HTTP pipelining. Also, TCP fragmentation/defragmentation is out of our control, so two requests sent in two network packets could be merged into a single one at some point on the way to the server.
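For illustration, a raw-socket client such as the hypothetical sketch below (host, port and path are placeholders) sends both requests in a single write, so they may well arrive at the server in one TCP segment:

import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Illustrative pipelining client: both requests go out in one write(), so the
// server may receive them together, as in step 1 of the scenario above.
public class PipeliningClient {
    public static void main(String[] args) throws Exception {
        String twoRequests =
                "GET /pipeline HTTP/1.1\r\nHost: localhost\r\n\r\n" +
                "GET /pipeline HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n";

        try (Socket s = new Socket("localhost", 8080)) {
            OutputStream out = s.getOutputStream();
            out.write(twoRequests.getBytes(StandardCharsets.ISO_8859_1));
            out.flush();

            // Read until the server closes the connection; a correct server
            // returns two complete responses.
            InputStream in = s.getInputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                System.out.print(new String(buf, 0, n, StandardCharsets.ISO_8859_1));
            }
        }
    }
}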
Comment 4 Mark Thomas 2020-12-11 13:14:06 UTC
I have a test case that replicates this now and some ideas for a potential fix.

Combining async and pipelining does seem an odd thing to do but I don't see any spec language that would prevent it.
Comment 5 Remy Maucherat 2020-12-11 14:23:29 UTC
(In reply to Mark Thomas from comment #4)
> I have a test case that replicates this now and some ideas for a potential
> fix.
> 
> Combining async and pipelining does seem an odd thing to do but I don't see
> any spec language that would prevent it.

It still doesn't make sense to me, and I still don't see how it really happens with input that means something and an application that doesn't actively try to mess up its input.

I think the code in Http11InputBuffer.available(boolean read) could likely use InputFilter.isFinished to avoid the edge case.
Comment 6 Mark Thomas 2020-12-11 14:39:04 UTC
Both isFinished() and available() appear to have problems differentiating between current request data and pipelined request data. I thought I had a trivial fix but it is breaking other tests. Still working on it.
Comment 7 Mark Thomas 2020-12-11 16:13:32 UTC
Fixed in:
- 10.0.x for 10.0.1 onwards
- 9.0.x for 9.0.42 onwards
- 8.5.x for 8.5.62 onwards