Summary: | NIO2 connector cuts incoming request | ||
---|---|---|---|
Product: | Tomcat 8 | Reporter: | Markus Dörschmidt <markus.doerschmidt> |
Component: | Connectors | Assignee: | Tomcat Developers Mailing List <dev> |
Status: | RESOLVED FIXED | ||
Severity: | normal | ||
Priority: | P2 | ||
Version: | 8.5.24 | ||
Target Milestone: | ---- | ||
Hardware: | PC | ||
OS: | Linux | ||
Attachments: | Minimal code to reproduce bug |
Description
Markus Dörschmidt
2017-11-13 11:55:47 UTC
Do you think you could give some pointers on reproduction? In the past there was BZ57799, which was caused by an unexpected interaction with use of available() by the framework.

Yes, we are going to need some information on how to reproduce this.

Created attachment 35539
Minimal code to reproduce bug
I attached a simple web application to reproduce the bug. Try sending XML data to the application using curl:

curl -X POST -k --header "Content-Type: text/xml;charset=UTF-8" --data @"test.xml" <your-url-here>

After calling curl with a 5MB XML file, I get this response:

<!doctype html><html lang="en"><head><title>HTTP Status 500 ? Internal Server Error</title><style type="text/css">h1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} h2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} h3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} body {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} b {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} p {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;} a {color:black;} a.name {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP Status 500 ? Internal Server Error</h1><hr class="line" /><p><b>Type</b> Status Report</p><p><b>Message</b> Premature end of file.</p><p><b>Description</b> The server encountered an unexpected condition that prevented it from fulfilling the request.</p><hr class="line" /><h3>Apache Tomcat</h3></body></html>

Tomcat is configured to allow uploads up to 20MB. Current versions are: Tomcat 8.5.23, OpenSSL 1.0.2-k, Tomcat Native 1.2.14

This works for me.

The real web application uses Spring Webservices 2.4.0. The SOAP request contains a base64-encoded binary element. The SOAP request never reaches the Webservice framework, because parsing the XML document fails due to incomplete data.

As I said, I tried the same connector with your test upload, and it worked for me. This issue is not very well presented IMO; for instance, it seems to imply it is a NIO2 + OpenSSL issue only, but in the end that's not very clear. Does it also cause issues for you with:
- NIO?
- No SSL?
- JSSE rather than OpenSSL?
- Any XML file? The document builder input stream read behavior would cause this (I'm pretty sure using a buffered IS would be a workaround), so it might not break with any file.
- Why do I need to use a multi MB XML when it breaks after a few KB?
If you get a stack trace, please post it. In your test case, instead of the exception reporting used, you should probably use e.printStackTrace().

I can't reproduce this problem with the given test case either. I've tested:
- NIO2
- 8.5.x trunk
- http and https
- https with OpenSSL (Tomcat native 1.2.16, OpenSSL ubuntu latest)
- https with JSSE
- Java 1.8.0u144
There are no obvious changes in the versions I am using compared to the versions you tested that might have fixed this issue.

No further activity, and failed to reproduce.

I reopened, because I have new information about how to reproduce the bug. The bug is reproducible with the little web application I previously attached to this bug report. I got the bug in an environment with these conditions:
1. Server and client have to be in different networks with a gateway between them
2. Tomcat needs the NIO2 connector handling HTTPS
When I send an XML file to "/xml" of my example application using curl, I get a server-side error about a malformed XML document when the XML exceeds some random size limit. When I send the same XML file to "/test", which simply reads and counts the bytes read, I get a client-side error:

curl: (55) SSL_write() returned SYSCALL, errno = 104

The bug does not occur if:
- the protocol handled by the NIO2 connector is HTTP
- or client and server are in the same network

Thanks for the additional information. I can now reproduce this (thanks to Microsoft for the free Azure credits). I have two clean Tomcat 9.0.x builds. One locally, one on Azure. Key config is HTTPS / OpenSSL / NIO2. Locally the upload works. Uploading to Azure fails. Switching the Azure instance to NIO makes the problem go away. I'm starting to investigate now.
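For readers without the attachment: the "/test" check described above (read the whole request body and count the bytes) can be approximated outside a servlet container. The following is a minimal sketch, not the attached code; the class name `CountBytesSketch` and the `shortReads` helper are hypothetical, and the in-memory stream only stands in for a request body that arrives in small pieces.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class CountBytesSketch {

    /** Reads the stream to the end and returns how many bytes were delivered. */
    public static long countBytes(InputStream in) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        // read() may return fewer bytes than requested; only -1 means end of stream.
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    /** Hypothetical helper: caps every read at maxChunk bytes to imitate short reads. */
    public static InputStream shortReads(InputStream in, int maxChunk) {
        return new InputStream() {
            @Override
            public int read() throws IOException {
                return in.read();
            }

            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return in.read(b, off, Math.min(len, maxChunk));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[5 * 1024 * 1024]; // same order of magnitude as the 5MB upload
        long counted = countBytes(shortReads(new ByteArrayInputStream(payload), 1000));
        System.out.println(counted + " of " + payload.length + " bytes read");
    }
}
```

Comparing the counted total with the size of the file sent shows whether the body was truncated before any XML parsing is involved.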
I've found the root cause. A blocking read obtained bytes from the network but after unwrapping there were zero application bytes. Since this is a blocking read, more network bytes should be read and unwrapped until there are some application bytes. This wasn't happening. I have a fix that I'll apply shortly. I just need to check if it needs to be applied anywhere else as well.

Fixed in:
- trunk for 9.0.5 onwards
- 8.5.x for 8.5.28 onwards

Great job on that fix! :)

On second thought, this will need to be revisited as the blocking read of SecureNio2Channel is supposed to block until it returns non 0. Of course, the algorithm is complex already ... I will add a TODO comment in trunk about that.

I added a follow up: http://svn.apache.org/viewvc?rev=1824201&view=rev
As this didn't happen with the regular JSSE engine, I think the missing underflow status is the root cause (the unwrap loops in SecureNio(2)Channel will clearly misbehave if all input has just been consumed and no output has been produced while the returned status remains OK; they will then return 0 rather than do a network read). I have no idea why it seemingly didn't happen with NIO. It should also affect the non-blocking/async read calls, but obviously the blocking read issue is more common and noticeable. If the fix is correct then the NIO2 blocking read loop added in http://svn.apache.org/viewvc?rev=1823262&view=rev should not be needed.

Thanks for the follow-up. I should have dug deeper for the root cause myself. I'll re-run the tests with the Azure instance and revert r1823262 if it is no longer necessary.

Testing demonstrated that the work-around was no longer required. I have removed it.

Thank you very much. The bug is fixed!
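The blocking-read behaviour discussed in the root-cause comment can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Tomcat's code: `RecordSource`, `readOnce`, `readBlocking` and `scripted` are hypothetical names, and the scripted source merely imitates a TLS channel where one network read may unwrap to zero application bytes.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class BlockingUnwrapSketch {

    /** Hypothetical stand-in for "read from the network, then unwrap". */
    public interface RecordSource {
        /** Returns the application bytes of one network read (possibly empty), or null at end of stream. */
        byte[] readAndUnwrap();
    }

    /** Single pass: reports 0 when the unwrap produced no application bytes. */
    public static int readOnce(RecordSource src, ByteBuffer dst) {
        byte[] app = src.readAndUnwrap();
        if (app == null) {
            return -1;
        }
        dst.put(app); // the sketch assumes dst has enough room
        return app.length;
    }

    /** Blocking semantics: keep reading and unwrapping until there are application bytes or the stream ends. */
    public static int readBlocking(RecordSource src, ByteBuffer dst) {
        while (true) {
            byte[] app = src.readAndUnwrap();
            if (app == null) {
                return -1;
            }
            if (app.length > 0) {
                dst.put(app); // the sketch assumes dst has enough room
                return app.length;
            }
            // Zero application bytes: loop and perform another network read.
        }
    }

    /** Builds a source that replays the given unwrap results in order, then signals end of stream. */
    public static RecordSource scripted(byte[]... chunks) {
        Deque<byte[]> queue = new ArrayDeque<>(Arrays.asList(chunks));
        return queue::poll; // poll() returns null once the queue is empty
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3};
        ByteBuffer dst = ByteBuffer.allocate(16);
        System.out.println("single pass: " + readOnce(scripted(new byte[0], data), dst));
        System.out.println("blocking:    " + readBlocking(scripted(new byte[0], data), dst));
    }
}
```

A caller of the single-pass variant sees 0 from what is supposed to be a blocking read, which matches the symptom described in the thread; the looping variant only returns once application data is available or the stream has ended.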