Bug 61751 - NIO2 connector cuts incoming request
Summary: NIO2 connector cuts incoming request
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 8
Classification: Unclassified
Component: Connectors (show other bugs)
Version: 8.5.24
Hardware: PC Linux
: P2 normal (vote)
Target Milestone: ----
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-11-13 11:55 UTC by Markus Dörschmidt
Modified: 2018-02-15 07:42 UTC (History)
0 users



Attachments
Minimal code to reproduce bug (1.48 KB, application/zip)
2017-11-20 12:12 UTC, Markus Dörschmidt
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Markus Dörschmidt 2017-11-13 11:55:47 UTC
Using the NIO2 connector "org.apache.coyote.http11.Http11Nio2Protocol" incoming requests are cut at some point.

The bug was first discovered in a web application handling SOAP requests. The SOAP implementation complains about malformed XML. When tracking down the error, I could see, that the XML content in the message body ends at some random point.

Tests show, that requests are not cut at the same point. Sometimes, request bodies are cut somewhere beyond 16kByte, sometime they are cut somwhere near 2kByte.

I could not determine some pattern here.

The connector is configured to use the OpennSSL implementation "org.apache.tomcat.util.net.openssl.OpenSSLImplementation".


Versions:
Tomcat: 8.5.16
Java: 1.8.0_144-b01
OpenSSL. 1.0.2h
Comment 1 Remy Maucherat 2017-11-16 12:56:27 UTC
Do you think you could give some pointers on reproduction ?
In the past there was BZ57799 which was caused by an unexpected interaction with use of available() by the framework.
Comment 2 Mark Thomas 2017-11-16 13:38:24 UTC
Yes, we are going to need some information on how to reproduce this.
Comment 3 Markus Dörschmidt 2017-11-20 12:12:52 UTC
Created attachment 35539 [details]
Minimal code to reproduce bug
Comment 4 Markus Dörschmidt 2017-11-20 12:17:51 UTC
I attached a simple web application to reproduce the code.

Try sending XML data to application using curl:
curl -X POST -k --header "Content-Type: text/xml;charset=UTF-8" --data @"test.xml"  <your-url-here>

After calling curl for a 5MB XML file, I get this response:
<!doctype html><html lang="en"><head><title>HTTP Status 500 ? Internal Server Error</title><style type="text/css">h1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} h2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} h3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} body {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} b {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} p {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;} a {color:black;} a.name {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP Status 500 ? Internal Server Error</h1><hr class="line" /><p><b>Type</b> Status Report</p><p><b>Message</b> Premature end of file.</p><p><b>Description</b> The server encountered an unexpected condition that prevented it from fulfilling the request.</p><hr class="line" /><h3>Apache Tomcat</h3></body></html>

Tomcat is configured to allow uploads up to 20MB.

Current versions are: Tomcat 8.5.23, OpenSSL 1.0.2-k, Tomcat Native 1.2.14
Comment 5 Remy Maucherat 2017-11-20 13:54:16 UTC
This works for me.
Comment 6 Markus Dörschmidt 2017-11-20 13:56:39 UTC
The real web application uses Spring Webservices 2.4.0. 
The SOAP request contains a base64-encoded binary element.

The SOAP request never reaches the Webservice framework, because parsing the XML documents fails due to incomplete data.
Comment 7 Remy Maucherat 2017-11-21 07:03:52 UTC
As I said, I tried the same connector with your test upload, it worked for me.

This issue is not very well presented IMO, for instance it seems to imply it is a NIO2 + OpenSSL issue only, but in the end that's not very clear.
Does it also cause issues for you with:
- NIO ?
- No SSL ?
- JSSE rather than OpenSSL ?
- Any XML file ? The document builder input stream read behavior would cause this (I'm pretty sure using a buffered IS would be a workaround), so it might not break with any file.
- Why do I need to use a multi MB XML when it breaks after a few KB ?
If you get a stack trace, please post it. In your test case, instead of the exception reporting used, you should probably use e.printStackTrace(),
Comment 8 Mark Thomas 2017-11-21 21:09:37 UTC
I can't reproduce this problem with the given test case either. I've tested:
- NIO2
- 8.5.x trunk
- http and https
- https with OpenSSL (Tomcat native 1.2.16, OpenSSL ubuntu latest)
- https with JSSE
- Java 1.8.0u144

There are no obvious changes in the versions I am using compared to the versions you tested that might have fixed this issue.
Comment 9 Remy Maucherat 2018-01-17 08:07:30 UTC
No further activity, and failed to reproduce.
Comment 10 Markus Dörschmidt 2018-01-26 12:46:48 UTC
I reopened, because I have new information about how to reproduce the bug:

The bug is reproducable with the litte web application I previously attached to this bug report.

I got the bug in an environment with these conditions:
1. Server and client have to be in different networks with a gateway between
2. Tomcat needs NIO2 connector handling HTTPs

When I send an XML file to "/xml" of my example application using CURL, I get an server-side error about a malformed XML document, when the XML exceeds some random size limit.

When I send the same XML file to "/test", which simply reads and count read bytes, I get a client-side error:

curl: (55) SSL_write() returned SYSCALL, errno = 104


The bug does not occur if:
- protocol handled by NIO2 connector is HTTP
- or client and server are in the same network
Comment 11 Mark Thomas 2018-02-05 14:03:11 UTC
Thanks for the additional information. I can now reproduce this (thanks to Microsoft for the free azure credits).

I have two clean Tomcat 9.0.x builds. One locally, one on Azure. Key config is HTTPS / OpenSSL / NIO2. Locally the upload works. Uploading to Azure fails. Switching the Azure instance to NIO makes the problem go away.

I'm starting to investigate now.
Comment 12 Mark Thomas 2018-02-05 23:01:40 UTC
I've found the root cause. A blocking read obtained bytes from the network but after unwrapping there were zero application bytes. Since this is a blocking read, more network bytes should be read and unwrapped until there are some application bytes. This wasn't happening. I have a fix that I'll apply shortly. I just need to check if it needs to be applied anywhere else as well.
Comment 13 Mark Thomas 2018-02-05 23:35:21 UTC
Fixed in:
- trunk for 9.0.5 onwards
- 8.5.x for 8.5.28 onwards
Comment 14 Remy Maucherat 2018-02-06 05:37:00 UTC
Great job on that fix ! :)
Comment 15 Remy Maucherat 2018-02-06 08:45:58 UTC
On second thought, this will need to be revisited as the blocking read of SecureNio2Channel is supposed to block until it returns non 0. Of course, the algorithm is complex already ...
I will add a TODO comment in trunk about that.
Comment 16 Remy Maucherat 2018-02-14 08:48:59 UTC
I added a follow up: http://svn.apache.org/viewvc?rev=1824201&view=rev

As this didn't happen with the regular JSSE engine, I think the missing underflow status is the root cause (the unwrap loops in SecureNio(2)Channel will clearly misbehave if all input has just been consumed and no output has been produced while the returned status remains OK; they will then return 0 rather that do a network read). I have no idea why it seemingly didn't happen with NIO. It should also affect the non blocking/async read calls but obviously the blocking read issue is more common and noticeable.

If the fix is correct then the NIO2 blocking read loop added in http://svn.apache.org/viewvc?rev=1823262&view=rev should not be needed.
Comment 17 Mark Thomas 2018-02-14 09:11:29 UTC
Thanks for the follow-up. I should have dug deeper for the root cause myself. I'll re-run the tests with the Azure instance and revert r1823262 if it is no longer necessary.
Comment 18 Mark Thomas 2018-02-14 12:00:38 UTC
Testing demonstrated that the work-around was no longer required. I have removed it.
Comment 19 Markus Dörschmidt 2018-02-15 07:42:35 UTC
Thank you very much. The bug is fixed!