Summary: | org.apache.tomcat.util.net.TestSsl testPost() test failing on Windows | ||
---|---|---|---|
Product: | Tomcat 9 | Reporter: | Konstantin Kolinko <knst.kolinko> |
Component: | Catalina | Assignee: | Tomcat Developers Mailing List <dev> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | michaelo |
Priority: | P2 | ||
Version: | 9.0.x | ||
Target Milestone: | ----- | ||
Hardware: | PC | ||
OS: | All | ||
Attachments: |
01_TEST-org.apache.tomcat.util.net.TestSsl.NIO2.txt
02_TEST-org.apache.tomcat.util.net.TestSsl.NIO2.txt 03_TEST-org.apache.tomcat.util.net.TestSsl.APR.txt 04_TEST-org.apache.tomcat.util.net.TestSsl.NIO.txt |
Description
Konstantin Kolinko
2020-04-06 13:31:53 UTC
Created attachment 37147 [details]
01_TEST-org.apache.tomcat.util.net.TestSsl.NIO2.txt
Out of 4 NIO2 failures, this kind of failure occurred only once.
The error here:
Byte in position [10217184] had value [-1] rather than 1
Notes:
1) Message formatting may be improved.
- There is no LF after the message (it uses print() instead of println()(.
- There are no '[' ']' around '1'.
2) This error means that the servlet echoed less data than expected (10 Mb instead of 16 Mb), and completed it as a normal response, without any sign of an error.
Why?
Created attachment 37148 [details]
02_TEST-org.apache.tomcat.util.net.TestSsl.NIO2.txt
Out of 4 NIO2 failures, this kind of failure occurred in 3 runs.
The error here is:
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:210)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.read(InputRecord.java:503)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:71)
at org.apache.tomcat.util.net.TestSsl$1.run(TestSsl.java:132)
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:210)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.read(InputRecord.java:503)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:71)
at org.apache.tomcat.util.net.TestSsl$1.run(TestSsl.java:132)
The same read error occurs for several threads at once. The attached log file has 5 of them.
Do you imply what the testPost test does is incorrect ? The amount of allocated memory is as intended, and it's not too extreme (IMO). Maybe the memory could be preallocated using ByteArrayOutputStream(int size) ? I don't think it would make a big difference, this is only a test, not "best practices". The test does work for me [so far]. As with the rather equivalent HTTP/2 TLS test, it is meant to test large POSTs with TLS since that's what people often complain about. The test was included in 9.0.33 already (since there was a regression) and there are no relevant changes in 9.0.34 (only NIO got a minor change, but you are saying it works). Why did it work then ? Created attachment 37149 [details]
03_TEST-org.apache.tomcat.util.net.TestSsl.APR.txt
An APR failure. I observed it once.
javax.net.ssl.SSLException: bad record MAC
at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1946)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1022)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:71)
at org.apache.tomcat.util.net.TestSsl$1.run(TestSsl.java:132)
Caused by: javax.crypto.BadPaddingException: bad record MAC
at sun.security.ssl.InputRecord.decrypt(InputRecord.java:219)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1017)
... 4 more
It is a read error as well, but the error message is an odd one.
I see similar errors in other tests once in a while. Doesn't this simply mean tests are flaky since sockets are not 100% reliable? (In reply to Remy Maucherat from comment #3) > Do you imply what the testPost test does is incorrect ? I do not see any obvious flaw in the design. I think that many clients behave like that. The servlet design is a bit unusual for me, as Content-Length of the request is known in advance. I wonder whether "Connection reset" is caused by some timeout. The test runs for 9-10 seconds for me, and other tests in the testsuite complete in about 1 second. > The test does work for me [so far]. As with the rather equivalent HTTP/2 TLS > test, it is meant to test large POSTs with TLS since that's what people > often complain about. > > The test was included in 9.0.33 already (since there was a regression) and > there are no relevant changes in 9.0.34 (only NIO got a minor change, but > you are saying it works). Why did it work then ? The detailed logging of errors was added just recently, by Mark on April 2nd: SHA-1: f8da18d362b8a773afd557fdda0fe50a3748c90c * Failed test on Windows. Expand logging to aid debugging of failures. Rerunning the test: 1. Additional 10 runs with the same Java 8 (AdoptOpenJDK 8.0.242.08-hotspot), - APR failures: 2 - The same "bad record MAC" as earlier (Comment 4) - 03_*txt - NIO2 failures: 4 - The same "Connection reset" as earlier (Comment 2) - 02_*txt - NIO failures: 2 - "Byte in position...", similar to Comment 1 - 01_*txt 1. 20 runs with Java 11 (AdoptOpenJDK 11.0.6.10-hotspot): All runs completed successfully. No issues noted. Created attachment 37150 [details] 04_TEST-org.apache.tomcat.util.net.TestSsl.NIO.txt Java 8 (AdoptOpenJDK 8.0.242.08-hotspot), NIO The same "Byte in position" failure as in Comment 1 - 01_*.txt In this test run this error occurred several times: Byte in position [15393480] had value [-1] rather than 1Byte in position [14987496] had value [-1] rather than 1Byte in position [15173572] had value [-1] rather than 1Byte in position [14886000] had value [-1] rather than 1 (In reply to Konstantin Kolinko from comment #7) > Rerunning the test: 3. 20 runs with Java 14 (OpenJDK 14 GA (14.0.0)): All runs completed successfully. No issues noted. A defect from the testsuite is that it tests either JSSE or OpenSSL, not both [it would need two extra full runs to test it]. Personally, I always use OpenSSL rather than JSSE. It is a bit unexpected that things would work better with Java 11, so I wonder if you could also try OpenSSL. Adding this property to build.properties switches to it: test.sslImplementation=org.apache.tomcat.util.net.openssl.OpenSSLImplementation I'm wondering if this is a load issue. The test had a fairly low timeout on the server (3s) and I can imagine a situation where that timeout might get triggered if the host machine was under load. I've increased the timeout and I'll run some tests locally. I ran the test 20 times on Windows with Java 8 (Adopt OpenJDK 242 64-bit) and didn't see a single error. Once I dropped the connection timeout to 1500ms I started to see errors similar to those reported here. I've increased the default timeout on this test to 20s so this should be fixed. |