Bug 62626 - Tomcat 9.0.10 APR/Native crashes
Summary: Tomcat 9.0.10 APR/Native crashes
Status: RESOLVED FIXED
Alias: None
Product: Tomcat Native
Classification: Unclassified
Component: Library (show other bugs)
Version: 1.2.17
Hardware: PC All
: P2 normal (vote)
Target Milestone: ---
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-08-15 07:44 UTC by jan.pfeifer
Modified: 2020-08-20 15:48 UTC (History)
2 users (show)



Attachments
ZIP file - contents 8 plain text log files (66.73 KB, application/octet-stream)
2019-08-22 17:28 UTC, Conrad T. Pino
Details

Note You need to log in before you can comment on or make changes to this bug.
Description jan.pfeifer 2018-08-15 07:44:30 UTC
Hello,

since upgrade to Tomcat 9 and switching to APR Tomcat regularly crashes.

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00000001800e6fc9, pid=19764, tid=51640
#
# JRE version: Java(TM) SE Runtime Environment (10.0.2+13) (build 10.0.2+13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (10.0.2+13, mixed mode, tiered, compressed oops, g1 gc, windows-amd64)
# Problematic frame:
# C  [tcnative-1.dll+0xe6fc9]
#
# Core dump will be written. Default location: C:\Program Files\Apache Tomcat 9.0 Shop 4\hs_err_pid19764.mdmp
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

---------------  S U M M A R Y ------------

Command Line: -Dcatalina.home=C:\Program Files\Apache Tomcat 9.0 Shop 4 -Dcatalina.base=C:\Program Files\Apache Tomcat 9.0 Shop 4 -Djava.io.tmpdir=C:\Program Files\Apache Tomcat 9.0 Shop 4\temp -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Xms512M -Xmx4500M -Duser.language=cs -Duser.region=CZ -Djava.util.logging.config.file=C:\Program Files\Apache Tomcat 9.0 Shop 4\conf\logging.properties --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.rmi/sun.rmi.transport=ALL-UNNAMED --add-modules=java.se.ee exit abort -Xms515m -Xmx4000m 

Host: Intel(R) Xeon(R) CPU E5-4610 v2 @ 2.30GHz, 8 cores, 15G,  Windows Server 2012 , 64 bit Build 9200 (6.2.9200.16384)
Time: Wed Aug 15 08:56:53 2018 Central Europe Daylight Time elapsed time: 37881 seconds (0d 10h 31m 21s)

---------------  T H R E A D  ---------------

Current thread (0x0000000033149800):  JavaThread "https-openssl-apr-443-exec-132" daemon [_thread_in_native, id=51640, stack(0x00000000396e0000,0x00000000397e0000)]

Stack: [0x00000000396e0000,0x00000000397e0000],  sp=0x00000000397dea50,  free space=1018k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [tcnative-1.dll+0xe6fc9]

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J 15519  org.apache.tomcat.jni.SSLSocket.handshake(J)I (0 bytes) @ 0x0000000015a94401 [0x0000000015a943c0+0x0000000000000041]
J 20374 c2 org.apache.tomcat.util.net.AprEndpoint.setSocketOptions(Lorg/apache/tomcat/util/net/SocketWrapperBase;)Z (347 bytes) @ 0x00000000165e540c [0x00000000165e5320+0x00000000000000ec]
J 20186 c2 org.apache.tomcat.util.net.AprEndpoint$SocketWithOptionsProcessor.run()V (199 bytes) @ 0x000000001657f10c [0x000000001657f040+0x00000000000000cc]
J 20687 c2 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V java.base@10.0.2 (187 bytes) @ 0x00000000166e3980 [0x00000000166e37a0+0x00000000000001e0]
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 java.base@10.0.2
j  org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run()V+4
J 20564 c2 java.lang.Thread.run()V java.base@10.0.2 (17 bytes) @ 0x0000000016507024 [0x0000000016506fc0+0x0000000000000064]
v  ~StubRoutines::call_stub

siginfo: EXCEPTION_ACCESS_VIOLATION (0xc0000005), reading address 0x00000000000000f0 
...
---------------  S Y S T E M  ---------------

OS: Windows Server 2012 , 64 bit Build 9200 (6.2.9200.16384)

CPU:total 8 (initial active 8) (2 cores per cpu, 1 threads per core) family 6 model 62 stepping 4, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, avx, aes, clmul, erms, tsc

Memory: 4k page, physical 16776756k(8895804k free), swap 31456820k(18806572k free)

vm_info: Java HotSpot(TM) 64-Bit Server VM (10.0.2+13) for windows-amd64 JRE (10.0.2+13), built on Jun 28 2018 01:57:56 by "mach5one" with MS VC++ 12.0 (VS2013)

END.
Comment 1 Mark Thomas 2018-08-15 09:41:05 UTC
I haven't seen any APR crashes for a while now on any platform.

Some indication of how to reproduce this crash is going to be needed to investigate this further.
Comment 2 jan.pfeifer 2018-08-15 11:29:08 UTC
It happens several times a day or even several times a hour. But it happens in random manner. No single particular request can do it. 

There are no clues in logs. At least for me. I have one qutie busy image serving servlet which produces tons of different client abort exceptions, closeNowException, Stream is not writable etc.

I can only guess it is about concurent access to it. I have other logs available, if you want to look.

After latest crash i have found this (only relevant log output before crash):

Caused by: java.io.IOException: Unexpected error [120,001] writing data to the APR/native socket [975,220,944] with wrapper [org.apache.tomcat.util.net.AprEndpoint$AprSocketWrapper@1ce3e909:975220944].
	at org.apache.tomcat.util.net.AprEndpoint$AprSocketWrapper.doWriteInternal(AprEndpoint.java:2712)
	at org.apache.tomcat.util.net.AprEndpoint$AprSocketWrapper.doWrite(AprEndpoint.java:2640)
	at org.apache.tomcat.util.net.SocketWrapperBase.doWrite(SocketWrapperBase.java:691)
	at org.apache.tomcat.util.net.SocketWrapperBase.flushBlocking(SocketWrapperBase.java:628)
	at org.apache.tomcat.util.net.SocketWrapperBase.flush(SocketWrapperBase.java:618)
	at org.apache.coyote.http2.Http2UpgradeHandler.writeBody(Http2UpgradeHandler.java:653)
	at org.apache.coyote.http2.Stream$StreamOutputBuffer.flush(Stream.java:777)
	at org.apache.coyote.http2.Stream$StreamOutputBuffer.doWrite(Stream.java:734)
	at org.apache.coyote.http2.Http2OutputBuffer.doWrite(Http2OutputBuffer.java:59)
	at org.apache.coyote.Response.doWrite(Response.java:599)
	at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:328)

Do you want me to run some specific test or post server config or other logs?
Comment 3 Christopher Schultz 2018-08-15 13:45:06 UTC
(In reply to jan.pfeifer from comment #2)
> It happens several times a day or even several times a hour.

At least it's reproducible, even if you can't guarantee a trigger.

> After latest crash i have found this (only relevant log output before crash):
> 
> Caused by: java.io.IOException: Unexpected error [120,001] writing data to
> the APR/native socket [975,220,944] with wrapper
> [org.apache.tomcat.util.net.AprEndpoint$AprSocketWrapper@1ce3e909:975220944].

That error code [120001] is:

   APR_UTIL_START_STATUS  (100000)
+     APR_OS_START_ERROR  ( 20000)
+        (os error code)  (     1)

The OS error code is, unhelpfully, "Operation not permitted", at least on *NIX. I'm not sure how that maps to Windows, which usually has error codes with negative values.

Are you maybe trying to write-back to a socket which should be closed?

Is this NIO+OpenSSL or APR+OpenSSL? I'm guessing APR+OpenSSL. Are you able to try using NIO+JSSE or NIO+OpenSSL?
Comment 4 jan.pfeifer 2018-08-15 14:51:08 UTC
(In reply to Christopher Schultz from comment #3)

Current configuration is for HTTP/2 so APR+openSSL

 <Connector port="443" protocol="org.apache.coyote.http11.Http11AprProtocol"
               maxThreads="150" SSLEnabled="true">
               <UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol" compression="on" />

	<SSLHostConfig>
            <Certificate certificateKeyFile="*.pem"
                     certificateFile="*.pem"
                     certificateChainFile="*.pem"
                         type="RSA" />
        </SSLHostConfig>
    </Connector>

I am using Spring framework, so only "direct write" is when i manualy read images and write them to the response output stream. There is also synchronized block where current thread checks if requested image is being resized. if so then it waits on task to finish. Actual code:

//tasks is ConcurentHashMap
try {
                String syncKey = target.getAbsolutePath();
                Thread task;

                synchronized (this) {
                    task = tasks.get(syncKey);                    
                    if (task == null && !target.exists()) {
                        //debug("START", request);
                        ResizeTask resizeTask = new ResizeTask(source, target, size);
                        resizeTask.setDebug(isDebug);
                        resizeTask.setLink(getFullURL(request));
                        task = new Thread(resizeTask);
                        tasks.put(syncKey, task);
                        task.start();
                    }
                }

                if(task != null) task.join();
                tasks.remove(syncKey);
                //debug("FINISH "+(System.currentTimeMillis() - time), request);

            } catch (InterruptedException e) {
                e.printStackTrace();
            }


But it seems lower level problem. To answer your question, yes i can try NIO(2) jsse/openssl.
Comment 5 jan.pfeifer 2018-08-16 10:55:54 UTC
With NIO2 + OpenSSL no crash so far. No problems detected. Except it cames with new set of client abort IO exceptions:

"The specified network name is no longer available" and "An existing connection was forcibly closed by the remote host".

Are there any generic way to detect it? I was forced to enclose stream.write() to its own try/catch and swallow any IOException it produces.

To our problem: I cant see any other way how to find culprit on my side. Some kind of stress test would propably trigger it, but with the same result we have now. I guess it is still related to image serving part as it is only "complex" part of webapp. Crawlers and "image thieves" spams it a lot. There can be let say 50 request per second, sometimes for same image, sometimes all ends with some kind of "client abort exception".
Comment 6 Christopher Schultz 2018-08-16 15:06:31 UTC
(In reply to jan.pfeifer from comment #5)
> With NIO2 + OpenSSL no crash so far. No problems detected. Except it comes
> with new set of client abort IO exceptions:
> 
> "The specified network name is no longer available" and "An existing
> connection was forcibly closed by the remote host".

This is actually what I was hoping to see happen.

tcnative is ... the bare minimum code to get things working; it doesn't have a huge amount of robustness especially when it comes to all the weird possibilities when a network connection is involved.

The fact that NIO is telling you that things are in a bad state means that it's actually true: the bug in tcnative that causes this crash is merely bad error-handling and not just tcnative not being able to push bytes around. That bad error-handling can probably be fixed with enough analysis and maybe some trial-and-error testing on your end (if you are willing to be a guinea pig).

But the fact remains: something "bad" is really happening with your clients and/or your network and that is the true source of the problem.

> Are there any generic way to detect it? I was forced to enclose
> stream.write() to its own try/catch and swallow any IOException it produces.

That's kind of par for the course, isn't it? Any IO operation can fail for any reason. If you don't catch it and handle it, Tomcat will (eventually) log it. What else did you expect to happen?

> To our problem: I cant see any other way how to find culprit on my side.
> Some kind of stress test would probably trigger it, but with the same result
> we have now. I guess it is still related to image serving part as it is only
> "complex" part of webapp. Crawlers and "image thieves" spams it a lot. There
> can be let say 50 request per second, sometimes for same image, sometimes
> all ends with some kind of "client abort exception".

A "client abort" exception happens when the client makes a request and then hangs-up the phone before you complete the response. It's fairly common and I wouldn't really expect tcnative to crash under that circumstance, but it's certainly possible.

In general, ClientAbortExceptions can safely be completely ignored, so don't worry too much about the fact that they are happening. On a busy site with large responses (e.g. images) I'd expect lots of them.

Let's keep this BZ issue open and continue to talk about the Java + native stack traces you have and try to get this resolved. ERROR_ACCESS_VIOLATION is the same as a segfault (basically null-pointer exception and/or use-after-free). In this case, it looks like dereferencing a pointer which probably isn't a pointer (its value is way too low to be valid).

Can you try one more thing for me? Downgrade to Java 1.8.0_whatever and revert your configuration to use APR+tcnative again and let us know if the crashes (actual segfaults that kill the JVM) continue.

But let's take your ClientAbortException stuff onto the user list to see if we can't find maybe some more efficient ways to get your images generated. That synchronized block looks suspicious to me, and creating new threads all the time is likely to lead to instability.
Comment 7 jan.pfeifer 2018-08-17 08:49:11 UTC
(In reply to Christopher Schultz from comment #6)

> That's kind of par for the course, isn't it? Any IO operation can fail for
> any reason. If you don't catch it and handle it, Tomcat will (eventually)
> log it. What else did you expect to happen?

I was only surprised by number of different IO exceptions. For some (Stream is not writable) it is not very clear what do they mean.

> A "client abort" exception happens when the client makes a request and then
> hangs-up the phone before you complete the response. It's fairly common and
> I wouldn't really expect tcnative to crash under that circumstance, but it's
> certainly possible.
> 
> In general, ClientAbortExceptions can safely be completely ignored, so don't
> worry too much about the fact that they are happening. On a busy site with
> large responses (e.g. images) I'd expect lots of them.

I undestand that. As I stated above it was only mix of different IOException types 
and quantity what confused me.
 
> Can you try one more thing for me? Downgrade to Java 1.8.0_whatever and
> revert your configuration to use APR+tcnative again and let us know if the
> crashes (actual segfaults that kill the JVM) continue.

I managed the downgrade. Alas lots of traffic ceased at the moment and server seems stable. I removed HTTP/2 upgrade as I understand it needs Java 9. Or am I wrong?


> But let's take your ClientAbortException stuff onto the user list to see if
> we can't find maybe some more efficient ways to get your images generated.
> That synchronized block looks suspicious to me, and creating new threads all
> the time is likely to lead to instability.

In first version i wasnt using threads at all. It seemed unlikely to get several concurrent request for one image. If it would happened the later one could simply overwrite image or respond with error. But since the crashes started i suspected thread concurency to be a problem. 

New threads are only spawned when images is requested for the first time. Images is downloaded from different machine, resized and webp version is created. WebP is now created with an external program. I was using native lib and imageio before.

I use threads and ConcurentHashMap<String, Thread> only to ensure that incoming requests will wait until image processing is finished. This only applies for requests for the same image. Others are free to get cached image or to start new resizing thread. Other option would be to allow only one image process at time, but it would have negative performance impact i guess. 

Another strange thing i got into was out of memory error (4GB) yesterday evening. No obvious reason of the problem. I have deleted image cache, so they are recreated now. Everything runs well.
Comment 8 Mark Thomas 2018-08-17 08:56:36 UTC
You mentioned imageio. I'm not sure at what point you stooped using it but there have been consistent reports over the years of imageio holding on to references to I/O streams and using them long after they should have been finished with. This has been the cause of APR crashes in the past. e.g.: https://markmail.org/message/xr3c5dixxzetpbez
Comment 9 jan.pfeifer 2018-08-17 09:28:32 UTC
(In reply to Mark Thomas from comment #8)
> You mentioned imageio. I'm not sure at what point you stooped using it but
> there have been consistent reports over the years of imageio holding on to
> references to I/O streams and using them long after they should have been
> finished with. This has been the cause of APR crashes in the past. e.g.:
> https://markmail.org/message/xr3c5dixxzetpbez

Yes, you are right, that was the reason. But there were several chrashes after the change. First was from https://bitbucket.org/luciad/webp-imageio, the second is from https://developers.google.com/speed/webp/download (img2webp.exe).
Comment 10 Christopher Schultz 2018-08-17 14:15:42 UTC
Have a look at this page:
https://wiki.apache.org/tomcat/FAQ/KnownIssues#ImageIOIssues

Try going back up to Java 10 and wrapping your streaming-ImageIO uses in the class shown in that wiki page and re-running. If the crashes stop, we can go ahead and blame them on ImageIO, but really tcnative should not crash in this scenario... I'd expect that you'd just get lots of IOExceptions instead. So tcnative isn't doing its job under this circumstance.

(n.b.: http/2 does not require Java 9.)

Unfortunately, I don't have  win32 debugging environment available to me, so I'm not able to turn the 0xe6fc9 offset into a meaningful line of code to investigate. It's obviously somewhere in the tcnative implementation of SSLSocket.handshake (or something it calls). That function has ~100 lines of code in it, but I'm guessing it's crashing towards the beginning, probably on a read.

Can you use a DEBUG build of tcnative to get a better crash report? I'm used to seeing more of a stack trace in native crash dumps.
Comment 11 jan.pfeifer 2018-08-17 15:38:19 UTC
(In reply to Christopher Schultz from comment #10)
> Have a look at this page:
> https://wiki.apache.org/tomcat/FAQ/KnownIssues#ImageIOIssues

If I undestand correctly there is only problem when using ImageIO.write with response output stream. I do not use it that way. I write with ImageIO to filesystem. Then, when image is ready, I read it again and write it to response stream via Buffered IN/Out stream.

> Try going back up to Java 10 and wrapping your streaming-ImageIO uses in the
> class shown in that wiki page and re-running. If the crashes stop, we can go
> ahead and blame them on ImageIO, but really tcnative should not crash in
> this scenario... I'd expect that you'd just get lots of IOExceptions
> instead. So tcnative isn't doing its job under this circumstance.

I left if for weekend with Java 8 to see what happens. Yes there are many IOExceptions already. No crash so far.

> Unfortunately, I don't have  win32 debugging environment available to me, so
> I'm not able to turn the 0xe6fc9 offset into a meaningful line of code to
> investigate. It's obviously somewhere in the tcnative implementation of
> SSLSocket.handshake (or something it calls). That function has ~100 lines of
> code in it, but I'm guessing it's crashing towards the beginning, probably
> on a read.
> 
> Can you use a DEBUG build of tcnative to get a better crash report? I'm used
> to seeing more of a stack trace in native crash dumps.

I will try to get DEBUG build on monday. But there is a lot of other debug info already. As I posted earlier i can attach it. Tomcat generated very long bug report from which i posted the top part and sysinfo part, and several gigs big mdmp file. Do you want to see them or we stick to DEBUG build of tcnative plan?
Comment 12 Christopher Schultz 2018-08-17 16:42:11 UTC
(In reply to jan.pfeifer from comment #11)
> (In reply to Christopher Schultz from comment #10)
> > Have a look at this page:
> > https://wiki.apache.org/tomcat/FAQ/KnownIssues#ImageIOIssues
> 
> If I undestand correctly there is only problem when using ImageIO.write with
> response output stream. I do not use it that way. I write with ImageIO to
> filesystem. Then, when image is ready, I read it again and write it to
> response stream via Buffered IN/Out stream.

Hmm, yes, this was in cases where ImageIO was being used to write to the servlet output stream. So it looks like that isn't the problem.

> > Try going back up to Java 10 and wrapping your streaming-ImageIO uses in the
> > class shown in that wiki page and re-running. If the crashes stop, we can go
> > ahead and blame them on ImageIO, but really tcnative should not crash in
> > this scenario... I'd expect that you'd just get lots of IOExceptions
> > instead. So tcnative isn't doing its job under this circumstance.
> 
> I left if for weekend with Java 8 to see what happens. Yes there are many
> IOExceptions already. No crash so far.

Just to confirm: this is Java 8 with APR+OpenSSL, not NIO+OpenSSL, correct?

If that's the case, there might be some kind of incompatibility between tcnative and Java 9. That would be very surprising to me, but that seems to be where the data are leading us.

> > Unfortunately, I don't have  win32 debugging environment available to me, so
> > I'm not able to turn the 0xe6fc9 offset into a meaningful line of code to
> > investigate. It's obviously somewhere in the tcnative implementation of
> > SSLSocket.handshake (or something it calls). That function has ~100 lines of
> > code in it, but I'm guessing it's crashing towards the beginning, probably
> > on a read.
> > 
> > Can you use a DEBUG build of tcnative to get a better crash report? I'm used
> > to seeing more of a stack trace in native crash dumps.
> 
> I will try to get DEBUG build on monday. But there is a lot of other debug
> info already. As I posted earlier i can attach it. Tomcat generated very
> long bug report from which i posted the top part and sysinfo part, and
> several gigs big mdmp file. Do you want to see them or we stick to DEBUG
> build of tcnative plan?

The Java stack traces are less important than the native stack trace. There was only a single item in the native stack trace you already posted. Can you post the part of the native report that is labelled "------ T H R E A D -------"?

That should include a lot more relevant information.
Comment 13 jan.pfeifer 2018-08-20 06:47:09 UTC
(In reply to Christopher Schultz from comment #12)

> Just to confirm: this is Java 8 with APR+OpenSSL, not NIO+OpenSSL, correct?

Yes, actual configuration APR+OpenSSL Java 8. Three days without crash.

> The Java stack traces are less important than the native stack trace. There
> was only a single item in the native stack trace you already posted. Can you
> post the part of the native report that is labelled "------ T H R E A D
> -------"?
> 
> That should include a lot more relevant information.


Two last Thread reports:

---------------  T H R E A D  ---------------

Current thread (0x000000002f30d800):  JavaThread "https-openssl-apr-443-exec-69" daemon [_thread_in_native, id=36360, stack(0x000000003f5e0000,0x000000003f6e0000)]

Stack: [0x000000003f5e0000,0x000000003f6e0000],  sp=0x000000003f6ddd80,  free space=1015k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [tcnative-1.dll+0xe0a8f]

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J 10218  org.apache.tomcat.jni.Socket.sendb(JLjava/nio/ByteBuffer;II)I (0 bytes) @ 0x0000000014d3c59f [0x0000000014d3c540+0x000000000000005f]
J 15904 c2 org.apache.tomcat.util.net.SocketWrapperBase.flushBlocking()V (90 bytes) @ 0x000000001545cbcc [0x000000001545c880+0x000000000000034c]
J 16170 c2 org.apache.tomcat.util.net.SocketWrapperBase.flush(Z)Z (20 bytes) @ 0x0000000014860a78 [0x0000000014860a40+0x0000000000000038]
J 15983 c2 org.apache.catalina.connector.CoyoteOutputStream.write([BII)V (26 bytes) @ 0x00000000154a8f2c [0x00000000154a51c0+0x0000000000003d6c]
J 16090 c2 java.io.BufferedOutputStream.write([BII)V java.base@10.0.2 (67 bytes) @ 0x00000000154e5660 [0x00000000154e5560+0x0000000000000100]
J 18370 c2 com.m2000.shop.controllers.DefaultController.image(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljavax/servlet/http/HttpServletResponse;Ljavax/servlet/http/HttpServletRequest;)V (1129 bytes) @ 0x00000000159f12f4 [0x00000000159ec920+0x00000000000049d4]
J 19092 c2 jdk.internal.reflect.GeneratedMethodAccessor159.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object; (98 bytes) @ 0x0000000015ba9be8 [0x0000000015ba9b00+0x00000000000000e8]
J 19001 c2 org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(Lorg/springframework/web/context/request/NativeWebRequest;Lorg/springframework/web/method/support/ModelAndViewContainer;[Ljava/lang/Object;)Ljava/lang/Object; (148 bytes) @ 0x0000000015b26018 [0x0000000015b25ca0+0x0000000000000378]
J 18530 c2 org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(Lorg/springframework/web/context/request/ServletWebRequest;Lorg/springframework/web/method/support/ModelAndViewContainer;[Ljava/lang/Object;)V (142 bytes) @ 0x0000000015a6e884 [0x0000000015a6e840+0x0000000000000044]
J 18568 c2 org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;Lorg/springframework/web/method/HandlerMethod;)Lorg/springframework/web/servlet/ModelAndView; (328 bytes) @ 0x0000000015a92a48 [0x0000000015a90d60+0x0000000000001ce8]
J 19592 c2 org.springframework.web.servlet.DispatcherServlet.doDispatch(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V (594 bytes) @ 0x0000000015ca8c20 [0x0000000015ca8140+0x0000000000000ae0]
J 18099 c2 org.springframework.web.servlet.DispatcherServlet.doService(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V (360 bytes) @ 0x0000000015948490 [0x0000000015947ee0+0x00000000000005b0]
J 18074 c2 org.springframework.web.servlet.FrameworkServlet.processRequest(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V (298 bytes) @ 0x0000000015919044 [0x00000000159171c0+0x0000000000001e84]
J 19591 c2 javax.servlet.http.HttpServlet.service(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V (269 bytes) @ 0x0000000015ca6d70 [0x0000000015ca6bc0+0x00000000000001b0]
J 14832 c2 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;)V (388 bytes) @ 0x00000000152cec10 [0x00000000152ce2e0+0x0000000000000930]
J 17775 c2 org.apache.tomcat.websocket.server.WsFilter.doFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;Ljavax/servlet/FilterChain;)V (139 bytes) @ 0x000000001580d614 [0x000000001580d5c0+0x0000000000000054]
J 14832 c2 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;)V (388 bytes) @ 0x00000000152ce6d8 [0x00000000152ce2e0+0x00000000000003f8]
J 18045 c2 com.m2000.shop.filters.ThreadLocalFilter.doFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;Ljavax/servlet/FilterChain;)V (26 bytes) @ 0x00000000158ebecc [0x00000000158ebd40+0x000000000000018c]
J 14832 c2 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;)V (388 bytes) @ 0x00000000152ce6d8 [0x00000000152ce2e0+0x00000000000003f8]
J 19329 c2 com.m2000.shop.filters.RedirectFilter.doFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;Ljavax/servlet/FilterChain;)V (290 bytes) @ 0x0000000015c2a564 [0x0000000015c29fc0+0x00000000000005a4]
J 14832 c2 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;)V (388 bytes) @ 0x00000000152ce6d8 [0x00000000152ce2e0+0x00000000000003f8]
J 17964 c2 com.m2000.shop.filters.CacheFilter.doFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;Ljavax/servlet/FilterChain;)V (106 bytes) @ 0x00000000158a44e8 [0x00000000158a4000+0x00000000000004e8]
J 14832 c2 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;)V (388 bytes) @ 0x00000000152ce6d8 [0x00000000152ce2e0+0x00000000000003f8]
J 17372 c2 org.apache.logging.log4j.web.Log4jServletFilter.doFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;Ljavax/servlet/FilterChain;)V (79 bytes) @ 0x0000000015736264 [0x00000000157360e0+0x0000000000000184]
J 14832 c2 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;)V (388 bytes) @ 0x00000000152ce6d8 [0x00000000152ce2e0+0x00000000000003f8]
J 19210 c2 org.apache.catalina.core.StandardWrapperValve.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (1306 bytes) @ 0x0000000015a5f36c [0x0000000015a5ef20+0x000000000000044c]
J 18084 c2 org.apache.catalina.authenticator.AuthenticatorBase.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (821 bytes) @ 0x0000000015927a3c [0x0000000015926f80+0x0000000000000abc]
J 18347 c2 org.apache.catalina.core.StandardHostValve.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (382 bytes) @ 0x00000000159ceb98 [0x00000000159ce780+0x0000000000000418]
J 19342 c2 org.apache.catalina.core.StandardEngineValve.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (77 bytes) @ 0x0000000015bc36b4 [0x0000000015bc3480+0x0000000000000234]
J 18087 c2 org.apache.catalina.valves.CrawlerSessionManagerValve.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (594 bytes) @ 0x0000000015939694 [0x0000000015938540+0x0000000000001154]
J 17959 c2 org.apache.catalina.connector.CoyoteAdapter.service(Lorg/apache/coyote/Request;Lorg/apache/coyote/Response;)V (892 bytes) @ 0x000000001588fd3c [0x000000001588e6c0+0x000000000000167c]
J 17126 c2 org.apache.coyote.AbstractProcessorLight.process(Lorg/apache/tomcat/util/net/SocketWrapperBase;Lorg/apache/tomcat/util/net/SocketEvent;)Lorg/apache/tomcat/util/net/AbstractEndpoint$Handler$SocketState; (248 bytes) @ 0x00000000156d3b14 [0x00000000156d3980+0x0000000000000194]
J 18826 c2 org.apache.coyote.http2.StreamRunnable.run()V (12 bytes) @ 0x0000000015b0a6f0 [0x0000000015b0a5c0+0x0000000000000130]
j  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+92 java.base@10.0.2
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 java.base@10.0.2
j  org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run()V+4
J 14367 c1 java.lang.Thread.run()V java.base@10.0.2 (17 bytes) @ 0x000000000df9169c [0x000000000df91560+0x000000000000013c]
v  ~StubRoutines::call_stub

siginfo: EXCEPTION_ACCESS_VIOLATION (0xc0000005), writing address 0x00000000000001a0


Register to memory mapping:

RIP=0x00000001800e0a8f tcnative-1.dll
RAX=0x0000000000000000 is an unknown value
RBX=0x000000003c892b10 is an unknown value
RCX=0x00000001801ccca0 tcnative-1.dll
RDX=0x00000000fffffff0 is an unknown value
RSP=0x000000003f6ddd80 is pointing into the stack for thread: 0x000000002f30d800
RBP=0x0000000000000009 is an unknown value
RSI=0x0000000000000017 is an unknown value
RDI=0x00000000ffffffff is an unknown value
R8 =0x0000000000000040 is an unknown value
R9 =0x0000000000000051 is an unknown value
R10=0x00000000000000a4 is an unknown value
R11=0x00000000ffffffff is an unknown value
R12=0x000000003c893a36 is an unknown value
R13=0x000000003c870680 is an unknown value
R14=0x0000000000000000 is an unknown value
R15=0x0000000000000000 is an unknown value


Registers:
RAX=0x0000000000000000, RBX=0x000000003c892b10, RCX=0x00000001801ccca0, RDX=0x00000000fffffff0
RSP=0x000000003f6ddd80, RBP=0x0000000000000009, RSI=0x0000000000000017, RDI=0x00000000ffffffff
R8 =0x0000000000000040, R9 =0x0000000000000051, R10=0x00000000000000a4, R11=0x00000000ffffffff
R12=0x000000003c893a36, R13=0x000000003c870680, R14=0x0000000000000000, R15=0x0000000000000000
RIP=0x00000001800e0a8f, EFLAGS=0x0000000000010286

Top of Stack: (sp=0x000000003f6ddd80)
0x000000003f6ddd80:   000000003c892b10 0000000000000008
0x000000003f6ddd90:   000000003c892b10 000000003c870680
0x000000003f6ddda0:   00000000000003e8 0000000070675957
0x000000003f6dddb0:   0000000000000000 0000000070687c26
0x000000003f6dddc0:   0000001700000009 0000000000000009
0x000000003f6dddd0:   000000002d170e00 0000000000000000
0x000000003f6ddde0:   0000000000000000 0000000000000000
0x000000003f6dddf0:   0000000000000000 000000002d170e00
0x000000003f6dde00:   0000000000000000 0000000000000000
0x000000003f6dde10:   000021823ec6d43c 00000001800e0075
0x000000003f6dde20:   000000003f6ddf60 000000002d170e00
0x000000003f6dde30:   000000003f6ddf08 0000000000000009
0x000000003f6dde40:   000000002d170e00 0000000000000009
0x000000003f6dde50:   000000003c892b10 00000001800e5c2e
0x000000003f6dde60:   0000000000000000 0000000000000000
0x000000003f6dde70:   000000002a992ba0 000000002f30d800 

Instructions: (pc=0x00000001800e0a8f)
0x00000001800e0a6f:   f6 41 70 08 75 08 48 8b cb e8 d3 64 00 00 42 8d
0x00000001800e0a7f:   04 37 e9 de f7 ff ff 33 ff 48 8b 83 80 00 00 00
0x00000001800e0a8f:   44 89 b0 a0 01 00 00 8b c7 e9 c7 f7 ff ff ba 9e
0x00000001800e0a9f:   00 00 00 4c 8d 0d a7 c7 0e 00 b9 14 00 00 00 44

---------------  T H R E A D  ---------------

Current thread (0x000000002ce93000):  JavaThread "https-openssl-apr-443-exec-30" daemon [_thread_in_native, id=22504, stack(0x000000003ad30000,0x000000003ae30000)]

Stack: [0x000000003ad30000,0x000000003ae30000],  sp=0x000000003ae2e8f0,  free space=1018k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [tcnative-1.dll+0xe0a8f]

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J 8894  org.apache.tomcat.jni.Socket.sendb(JLjava/nio/ByteBuffer;II)I (0 bytes) @ 0x000000001540cf1f [0x000000001540cec0+0x000000000000005f]
J 14207 c2 org.apache.tomcat.util.net.AprEndpoint$AprSocketWrapper.doWrite(ZLjava/nio/ByteBuffer;)V (242 bytes) @ 0x000000001591c658 [0x000000001591c480+0x00000000000001d8]
J 16174 c2 org.apache.tomcat.util.net.SocketWrapperBase.flush(Z)Z (20 bytes) @ 0x0000000015c49db0 [0x0000000015c49d00+0x00000000000000b0]
j  org.apache.coyote.http2.Http2UpgradeHandler.writeGoAwayFrame(IJ[B)V+125
j  org.apache.coyote.http2.Http2UpgradeHandler.closeConnection(Lorg/apache/coyote/http2/Http2Exception;)V+22
j  org.apache.coyote.http2.Stream.close(Lorg/apache/coyote/http2/Http2Exception;)V+124
J 18503 c2 org.apache.coyote.http2.StreamRunnable.run()V (12 bytes) @ 0x00000000156be630 [0x00000000156bdfa0+0x0000000000000690]
j  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+92 java.base@10.0.2
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 java.base@10.0.2
j  org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run()V+4
j  java.lang.Thread.run()V+11 java.base@10.0.2
v  ~StubRoutines::call_stub

siginfo: EXCEPTION_ACCESS_VIOLATION (0xc0000005), writing address 0x00000000000001a0


Register to memory mapping:

RIP=0x00000001800e0a8f tcnative-1.dll
RAX=0x0000000000000000 is an unknown value
RBX=0x000000003a5bccf0 is an unknown value
RCX=0x00000001801ccca0 tcnative-1.dll
RDX=0x00000000fffffff0 is an unknown value
RSP=0x000000003ae2e8f0 is pointing into the stack for thread: 0x000000002ce93000
RBP=0x0000000000000074 is an unknown value
RSI=0x0000000000000017 is an unknown value
RDI=0x00000000ffffffff is an unknown value
R8 =0x0000000000000040 is an unknown value
R9 =0x0000000000000051 is an unknown value
R10=0x00000000000000a4 is an unknown value
R11=0x00000000ffffffff is an unknown value
R12=0x0000000039098416 is an unknown value
R13=0x00000000390f6fa0 is an unknown value
R14=0x0000000000000000 is an unknown value
R15=0x0000000000000000 is an unknown value


Registers:
RAX=0x0000000000000000, RBX=0x000000003a5bccf0, RCX=0x00000001801ccca0, RDX=0x00000000fffffff0
RSP=0x000000003ae2e8f0, RBP=0x0000000000000074, RSI=0x0000000000000017, RDI=0x00000000ffffffff
R8 =0x0000000000000040, R9 =0x0000000000000051, R10=0x00000000000000a4, R11=0x00000000ffffffff
R12=0x0000000039098416, R13=0x00000000390f6fa0, R14=0x0000000000000000, R15=0x0000000000000000
RIP=0x00000001800e0a8f, EFLAGS=0x0000000000010286

Top of Stack: (sp=0x000000003ae2e8f0)
0x000000003ae2e8f0:   000000003a5bccf0 0000000000000008
0x000000003ae2e900:   000000003a5bccf0 00000000390f6fa0
0x000000003ae2e910:   00000001e23215a0 000000003ae2e9b0
0x000000003ae2e920:   00000054000001f4 0000000014f67f5c
0x000000003ae2e930:   0000001700000074 0000000000000074
0x000000003ae2e940:   00000000306a29f0 0000000000010000
0x000000003ae2e950:   00000006c65bb840 00000006e23215a0
0x000000003ae2e960:   0000000600000000 00000000306a29f0
0x000000003ae2e970:   0000000000000048 0000000000000000
0x000000003ae2e980:   00002283b2b09f1d 00000001800e0075
0x000000003ae2e990:   000000003ae2ead0 00000000306a29f0
0x000000003ae2e9a0:   000000003ae2ea78 0000000000000074
0x000000003ae2e9b0:   00000000306a29f0 0000000000000074
0x000000003ae2e9c0:   000000003a5bccf0 00000001800e5c2e
0x000000003ae2e9d0:   0000000000000000 0000000000000000
0x000000003ae2e9e0:   000000002afd2ba0 000000002ce93000 

Instructions: (pc=0x00000001800e0a8f)
0x00000001800e0a6f:   f6 41 70 08 75 08 48 8b cb e8 d3 64 00 00 42 8d
0x00000001800e0a7f:   04 37 e9 de f7 ff ff 33 ff 48 8b 83 80 00 00 00
0x00000001800e0a8f:   44 89 b0 a0 01 00 00 8b c7 e9 c7 f7 ff ff ba 9e
0x00000001800e0a9f:   00 00 00 4c 8d 0d a7 c7 0e 00 b9 14 00 00 00 44
Comment 14 jan.pfeifer 2018-08-20 07:03:17 UTC
If logs did not help, can you point me where to get that DEBUG build of tcnative?
Comment 15 Christopher Schultz 2018-08-21 14:22:04 UTC
(In reply to jan.pfeifer from comment #13)
> (In reply to Christopher Schultz from comment #12)
> 
> > Just to confirm: this is Java 8 with APR+OpenSSL, not NIO+OpenSSL, correct?
> 
> Yes, actual configuration APR+OpenSSL Java 8. Three days without crash.

That's good to hear. It's surprising, but at least you have a configuration you can fall-back to if you don't want crashes.
Comment 16 Christopher Schultz 2018-08-21 16:04:36 UTC
(In reply to jan.pfeifer from comment #13)
> Two last Thread reports:
> 
> ---------------  T H R E A D  ---------------
> 
> Current thread (0x000000002f30d800):  JavaThread
> "https-openssl-apr-443-exec-69" daemon [_thread_in_native, id=36360,
> stack(0x000000003f5e0000,0x000000003f6e0000)]
> 
> Stack: [0x000000003f5e0000,0x000000003f6e0000],  sp=0x000000003f6ddd80, 
> free space=1015k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native
> code)
> C  [tcnative-1.dll+0xe0a8f]
> 

No more information about the native frame. That must be a direct-native call...

> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> J 10218  org.apache.tomcat.jni.Socket.sendb(JLjava/nio/ByteBuffer;II)I (0
> bytes) @ 0x0000000014d3c59f [0x0000000014d3c540+0x000000000000005f]
> J 15904 c2 org.apache.tomcat.util.net.SocketWrapperBase.flushBlocking()V (90
> bytes) @ 0x000000001545cbcc [0x000000001545c880+0x000000000000034c]
> J 16170 c2 org.apache.tomcat.util.net.SocketWrapperBase.flush(Z)Z (20 bytes)
> @ 0x0000000014860a78 [0x0000000014860a40+0x0000000000000038]
> J 15983 c2 org.apache.catalina.connector.CoyoteOutputStream.write([BII)V (26
> bytes) @ 0x00000000154a8f2c [0x00000000154a51c0+0x0000000000003d6c]
> J 16090 c2 java.io.BufferedOutputStream.write([BII)V java.base@10.0.2 (67
> bytes) @ 0x00000000154e5660 [0x00000000154e5560+0x0000000000000100]
> J 18370 c2
> com.m2000.shop.controllers.DefaultController.image(Ljava/lang/String;Ljava/
> lang/String;Ljava/lang/String;Ljavax/servlet/http/HttpServletResponse;Ljavax/
> servlet/http/HttpServletRequest;)V (1129 bytes) @ 0x00000000159f12f4
> [0x00000000159ec920+0x00000000000049d4]

Okay, this is good information:

1. The crash is occurring in jni.Socket.sendbb (where many crashes have historically been found)

2. Your code is intentionally invoking a write() on the socket/stream

That means that we can take a look at your code to see if there is any misuse of the streams. That is the #1 reason for crashes like these: the application misuses a stream (e.g. after it should have been recycled) and it's in an intermediate state.

tcnative should not crash, but we may be able to mitigate it by fixing your code.

Can you post as much of your code as possible? Feel free to do so privately if you don't want your code published in BZ.


> J 19092 c2
> jdk.internal.reflect.GeneratedMethodAccessor159.invoke(Ljava/lang/Object;
> [Ljava/lang/Object;)Ljava/lang/Object; (98 bytes) @ 0x0000000015ba9be8
> [0x0000000015ba9b00+0x00000000000000e8]
> J 19001 c2
> org.springframework.web.method.support.InvocableHandlerMethod.
> invokeForRequest(Lorg/springframework/web/context/request/NativeWebRequest;
> Lorg/springframework/web/method/support/ModelAndViewContainer;[Ljava/lang/
> Object;)Ljava/lang/Object; (148 bytes) @ 0x0000000015b26018
> [0x0000000015b25ca0+0x0000000000000378]

Coming through Spring.

> J 18530 c2
> org.springframework.web.servlet.mvc.method.annotation.
> ServletInvocableHandlerMethod.invokeAndHandle(Lorg/springframework/web/
> context/request/ServletWebRequest;Lorg/springframework/web/method/support/
> ModelAndViewContainer;[Ljava/lang/Object;)V (142 bytes) @ 0x0000000015a6e884
> [0x0000000015a6e840+0x0000000000000044]
> J 18568 c2
> org.springframework.web.servlet.mvc.method.annotation.
> RequestMappingHandlerAdapter.invokeHandlerMethod(Ljavax/servlet/http/
> HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;Lorg/
> springframework/web/method/HandlerMethod;)Lorg/springframework/web/servlet/
> ModelAndView; (328 bytes) @ 0x0000000015a92a48
> [0x0000000015a90d60+0x0000000000001ce8]
> J 19592 c2
> org.springframework.web.servlet.DispatcherServlet.doDispatch(Ljavax/servlet/
> http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V (594
> bytes) @ 0x0000000015ca8c20 [0x0000000015ca8140+0x0000000000000ae0]
> J 18099 c2
> org.springframework.web.servlet.DispatcherServlet.doService(Ljavax/servlet/
> http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V (360
> bytes) @ 0x0000000015948490 [0x0000000015947ee0+0x00000000000005b0]
> J 18074 c2
> org.springframework.web.servlet.FrameworkServlet.processRequest(Ljavax/
> servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V
> (298 bytes) @ 0x0000000015919044 [0x00000000159171c0+0x0000000000001e84]
> J 19591 c2
> javax.servlet.http.HttpServlet.service(Ljavax/servlet/http/
> HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V (269 bytes) @
> 0x0000000015ca6d70 [0x0000000015ca6bc0+0x00000000000001b0]

From normal Tomcat dispatch. Great. At least there is no odd threading or anything going on, here. That makes it less likely to be a problem in your code.

> siginfo: EXCEPTION_ACCESS_VIOLATION (0xc0000005), writing address
> 0x00000000000001a0

Interesting. Most of the crashes in the past have been when reading data, not writing it. Therefore, most of the protections we have added have been e.g. null-checking before dereference, etc.

0x01a0 (416 decimal) is likely the byte-offset into a structure that tcnative is trying to write. That might help narrow-down what structure is being written (416 bytes is somewhat large), and even which field.

> Register to memory mapping:
> 
> RIP=0x00000001800e0a8f tcnative-1.dll
> RAX=0x0000000000000000 is an unknown value
> RBX=0x000000003c892b10 is an unknown value
> RCX=0x00000001801ccca0 tcnative-1.dll
> RDX=0x00000000fffffff0 is an unknown value
> RSP=0x000000003f6ddd80 is pointing into the stack for thread:
> 0x000000002f30d800
> RBP=0x0000000000000009 is an unknown value
> RSI=0x0000000000000017 is an unknown value
> RDI=0x00000000ffffffff is an unknown value
> R8 =0x0000000000000040 is an unknown value
> R9 =0x0000000000000051 is an unknown value
> R10=0x00000000000000a4 is an unknown value
> R11=0x00000000ffffffff is an unknown value
> R12=0x000000003c893a36 is an unknown value
> R13=0x000000003c870680 is an unknown value
> R14=0x0000000000000000 is an unknown value
> R15=0x0000000000000000 is an unknown value
> 
> 
> Registers:
> RAX=0x0000000000000000, RBX=0x000000003c892b10, RCX=0x00000001801ccca0,
> RDX=0x00000000fffffff0
> RSP=0x000000003f6ddd80, RBP=0x0000000000000009, RSI=0x0000000000000017,
> RDI=0x00000000ffffffff
> R8 =0x0000000000000040, R9 =0x0000000000000051, R10=0x00000000000000a4,
> R11=0x00000000ffffffff
> R12=0x000000003c893a36, R13=0x000000003c870680, R14=0x0000000000000000,
> R15=0x0000000000000000
> RIP=0x00000001800e0a8f, EFLAGS=0x0000000000010286
> 
> Top of Stack: (sp=0x000000003f6ddd80)
> 0x000000003f6ddd80:   000000003c892b10 0000000000000008
> 0x000000003f6ddd90:   000000003c892b10 000000003c870680
> 0x000000003f6ddda0:   00000000000003e8 0000000070675957
> 0x000000003f6dddb0:   0000000000000000 0000000070687c26
> 0x000000003f6dddc0:   0000001700000009 0000000000000009
> 0x000000003f6dddd0:   000000002d170e00 0000000000000000
> 0x000000003f6ddde0:   0000000000000000 0000000000000000
> 0x000000003f6dddf0:   0000000000000000 000000002d170e00
> 0x000000003f6dde00:   0000000000000000 0000000000000000
> 0x000000003f6dde10:   000021823ec6d43c 00000001800e0075
> 0x000000003f6dde20:   000000003f6ddf60 000000002d170e00
> 0x000000003f6dde30:   000000003f6ddf08 0000000000000009
> 0x000000003f6dde40:   000000002d170e00 0000000000000009
> 0x000000003f6dde50:   000000003c892b10 00000001800e5c2e
> 0x000000003f6dde60:   0000000000000000 0000000000000000
> 0x000000003f6dde70:   000000002a992ba0 000000002f30d800 
> 
> Instructions: (pc=0x00000001800e0a8f)
> 0x00000001800e0a6f:   f6 41 70 08 75 08 48 8b cb e8 d3 64 00 00 42 8d
> 0x00000001800e0a7f:   04 37 e9 de f7 ff ff 33 ff 48 8b 83 80 00 00 00
> 0x00000001800e0a8f:   44 89 b0 a0 01 00 00 8b c7 e9 c7 f7 ff ff ba 9e
> 0x00000001800e0a9f:   00 00 00 4c 8d 0d a7 c7 0e 00 b9 14 00 00 00 44

Disassembly (x86-64):

0:  f6 41 70 08             test   BYTE PTR [rcx+0x70],0x8
4:  75 08                   jne    0xe
6:  48 8b cb                mov    rcx,rbx
9:  e8 d3 64 00 00          call   0x64e1
e:  42 8d 04 37             lea    eax,[rdi+r14*1]
12: e9 de f7 ff ff          jmp    0xfffffffffffff7f5
17: 33 ff                   xor    edi,edi
19: 48 8b 83 80 00 00 00    mov    rax,QWORD PTR [rbx+0x80]
20: 44 89 b0 a0 01 00 00    mov    DWORD PTR [rax+0x1a0],r14d
27: 8b c7                   mov    eax,edi
29: e9 c7 f7 ff ff          jmp    0xfffffffffffff7f5
2e: ba 9e 00 00 00          mov    edx,0x9e
33: 4c 8d 0d a7 c7 0e 00    lea    r9,[rip+0xec7a7]        # 0xec7e1
3a: b9 14 00 00 00          mov    ecx,0x14
3f: 44                      rex.R 

Byte offset 20 has an instruction that references RAX + 0x01a0, and the value of RAX is 0x0000, so we try to copy the contents of R14 into an illegal address.

Simple, right?

> C  [tcnative-1.dll+0xe0a8f]

Same.

> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> J 8894  org.apache.tomcat.jni.Socket.sendb(JLjava/nio/ByteBuffer;II)I (0
> bytes) @ 0x000000001540cf1f [0x000000001540cec0+0x000000000000005f]

Same.

> J 14207 c2
> org.apache.tomcat.util.net.AprEndpoint$AprSocketWrapper.doWrite(ZLjava/nio/
> ByteBuffer;)V (242 bytes) @ 0x000000001591c658
> [0x000000001591c480+0x00000000000001d8]
> J 16174 c2 org.apache.tomcat.util.net.SocketWrapperBase.flush(Z)Z (20 bytes)
> @ 0x0000000015c49db0 [0x0000000015c49d00+0x00000000000000b0]
> j  org.apache.coyote.http2.Http2UpgradeHandler.writeGoAwayFrame(IJ[B)V+125
> j 
> org.apache.coyote.http2.Http2UpgradeHandler.closeConnection(Lorg/apache/
> coyote/http2/Http2Exception;)V+22
> j 
> org.apache.coyote.http2.Stream.close(Lorg/apache/coyote/http2/Http2Exception;
> )V+124
> J 18503 c2 org.apache.coyote.http2.StreamRunnable.run()V (12 bytes) @
> 0x00000000156be630 [0x00000000156bdfa0+0x0000000000000690]
> j 
> java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/
> ThreadPoolExecutor$Worker;)V+92 java.base@10.0.2
> j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 java.base@10.0.2
> j  org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run()V+4
> j  java.lang.Thread.run()V+11 java.base@10.0.2
> v  ~StubRoutines::call_stub

This one is different: Tomcat is trying to close the stream after your code is done and is failing.

> siginfo: EXCEPTION_ACCESS_VIOLATION (0xc0000005), writing address
> 0x00000000000001a0

But this looks familiar.

> Instructions: (pc=0x00000001800e0a8f)
> 0x00000001800e0a6f:   f6 41 70 08 75 08 48 8b cb e8 d3 64 00 00 42 8d
> 0x00000001800e0a7f:   04 37 e9 de f7 ff ff 33 ff 48 8b 83 80 00 00 00
> 0x00000001800e0a8f:   44 89 b0 a0 01 00 00 8b c7 e9 c7 f7 ff ff ba 9e
> 0x00000001800e0a9f:   00 00 00 4c 8d 0d a7 c7 0e 00 b9 14 00 00 00 44

Yep, same native code.

Obviously, something is wrong with the socket-state management. My guess is that this is really "client has disconnected" and the state isn't being correctly detected by tcnative.

But why is it not crashing on Java 8?
Comment 17 Konstantin Kolinko 2018-08-22 17:28:47 UTC
Please add the following line to your catalina.properties file:

org.apache.catalina.connector.RECYCLE_FACADES=true

as documented at
https://wiki.apache.org/tomcat/FAQ/Troubleshooting_and_Diagnostics#Troubleshooting_unexpected_Response_state_problems
Comment 18 jan.pfeifer 2018-08-23 07:26:35 UTC
(In reply to Christopher Schultz from comment #16)

Its like observing black hole for me but i've came to some conslusions

> 
> But why is it not crashing on Java 8?

1. I have tried org.apache.catalina.connector.RECYCLE_FACADES=true parameter before and left it enabled. But I am quite sure it crashed at least once with parameter enabled. I've disabled it again.

2. HTTP/2. As I mentioned before I didnt know it is possible to have it working with Java 8. So I run it only with HTTP11. I have enabled HTTP/2 on Java 8 yesterday and got three crashes till now. 

3. AND/OR "malevolent" traffic. There was some strange access logs for the images. Now its is clear it is related to HTTP/2. As it ceased with HTTP1 and reappeared again now with HTTP/2. It seems some bots have problems with HTTP/2, requesting the same images over and over again and terminating connection in some "bad" way.

Last crash log (Java 8 + HTTP/2 RECYCLE_FACADES=false)

---------------  T H R E A D  ---------------

Current thread (0x0000000024278800):  JavaThread "https-openssl-apr-443-exec-39" daemon [_thread_in_native, id=52448, stack(0x000000002d850000,0x000000002d950000)]

siginfo: ExceptionCode=0xc0000005, writing address 0x00000000000001a0

Registers:
RAX=0x0000000000000000, RBX=0x000000002c5a69a0, RCX=0x00000001801ccca0, RDX=0x00000000fffffff0
RSP=0x000000002d94e190, RBP=0x0000000000000009, RSI=0x0000000000000017, RDI=0x00000000ffffffff
R8 =0x0000000000000006, R9 =0x000000000000004b, R10=0x0000000000000007, R11=0x00000000ffffffff
R12=0x000000002c4971c6, R13=0x00000000299b21c0, R14=0x0000000000000000, R15=0x0000000000000000
RIP=0x00000001800e0a8f, EFLAGS=0x0000000000010286

Top of Stack: (sp=0x000000002d94e190)
0x000000002d94e190:   000000002c5a69a0 0000000000000008
0x000000002d94e1a0:   000000002c5a69a0 00000000299b21c0
0x000000002d94e1b0:   0000000000000000 0000000000000000
0x000000002d94e1c0:   0000000000000000 00000000713333f2
0x000000002d94e1d0:   0000001700000009 0000000000000009
0x000000002d94e1e0:   0000000020c47220 000000007132cb5d
0x000000002d94e1f0:   000000002240cd00 00000000713b0b22
0x000000002d94e200:   0000000024278800 0000000020c47220
0x000000002d94e210:   0000000022df4640 0000000000000000
0x000000002d94e220:   00002181135f529d 00000001800e0075
0x000000002d94e230:   000000002d94e370 0000000020c47220
0x000000002d94e240:   000000002d94e318 0000000000000009
0x000000002d94e250:   0000000020c47220 0000000000000009
0x000000002d94e260:   000000002c5a69a0 00000001800e5c2e
0x000000002d94e270:   0000000000000000 0000000000000000
0x000000002d94e280:   000000001fa3da90 0000000024278800 

Instructions: (pc=0x00000001800e0a8f)
0x00000001800e0a6f:   f6 41 70 08 75 08 48 8b cb e8 d3 64 00 00 42 8d
0x00000001800e0a7f:   04 37 e9 de f7 ff ff 33 ff 48 8b 83 80 00 00 00
0x00000001800e0a8f:   44 89 b0 a0 01 00 00 8b c7 e9 c7 f7 ff ff ba 9e
0x00000001800e0a9f:   00 00 00 4c 8d 0d a7 c7 0e 00 b9 14 00 00 00 44 


Register to memory mapping:

RAX=0x0000000000000000 is an unknown value
RBX=0x000000002c5a69a0 is an unknown value
RCX=0x00000001801ccca0 is an unknown value
RDX=0x00000000fffffff0 is an unknown value
RSP=0x000000002d94e190 is pointing into the stack for thread: 0x0000000024278800
RBP=0x0000000000000009 is an unknown value
RSI=0x0000000000000017 is an unknown value
RDI=0x00000000ffffffff is an unknown value
R8 =0x0000000000000006 is an unknown value
R9 =0x000000000000004b is an unknown value
R10=0x0000000000000007 is an unknown value
R11=0x00000000ffffffff is an unknown value
R12=0x000000002c4971c6 is an unknown value
R13=0x00000000299b21c0 is an unknown value
R14=0x0000000000000000 is an unknown value
R15=0x0000000000000000 is an unknown value


Stack: [0x000000002d850000,0x000000002d950000],  sp=0x000000002d94e190,  free space=1016k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [tcnative-1.dll+0xe0a8f]
C  [tcnative-1.dll+0xe5c2e]
C  [tcnative-1.dll+0x104bc]
C  [tcnative-1.dll+0x56ff]
C  0x0000000002cbc5a5

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J 7660  org.apache.tomcat.jni.Socket.sendb(JLjava/nio/ByteBuffer;II)I (0 bytes) @ 0x0000000002cbc51f [0x0000000002cbc4c0+0x5f]
J 12795 C2 org.apache.tomcat.util.net.AprEndpoint$AprSocketWrapper.doWrite(ZLjava/nio/ByteBuffer;)V (242 bytes) @ 0x00000000043caef0 [0x00000000043cac80+0x270]
J 15975 C2 org.apache.coyote.http2.Http2UpgradeHandler.writeBody(Lorg/apache/coyote/http2/Stream;Ljava/nio/ByteBuffer;IZ)V (218 bytes) @ 0x0000000003c10ba0 [0x0000000003c10680+0x520]
J 12532 C2 java.io.BufferedOutputStream.write([BII)V (67 bytes) @ 0x0000000004301c3c [0x0000000004300940+0x12fc]
J 18229 C2 com.m2000.shop.controllers.DefaultController.image(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljavax/servlet/http/HttpServletResponse;Ljavax/servlet/http/HttpServletRequest;)V (1129 bytes) @ 0x000000000535b604 [0x0000000005358ac0+0x2b44]
J 18326 C2 sun.reflect.GeneratedMethodAccessor117.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object; (98 bytes) @ 0x000000000444a64c [0x000000000444a560+0xec]
J 17923 C2 org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(Lorg/springframework/web/context/request/NativeWebRequest;Lorg/springframework/web/method/support/ModelAndViewContainer;[Ljava/lang/Object;)Ljava/lang/Object; (148 bytes) @ 0x0000000005241df8 [0x00000000052419c0+0x438]
J 16267 C2 org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;Lorg/springframework/web/method/HandlerMethod;)Lorg/springframework/web/servlet/ModelAndView; (328 bytes) @ 0x0000000003bc3cc0 [0x0000000003bc2b00+0x11c0]
J 15842 C2 org.springframework.web.servlet.DispatcherServlet.doDispatch(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V (594 bytes) @ 0x0000000004c4a6d4 [0x0000000004c49a00+0xcd4]
J 15824 C2 org.springframework.web.servlet.DispatcherServlet.doService(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V (360 bytes) @ 0x00000000042e6e78 [0x00000000042e65c0+0x8b8]
J 15862 C2 javax.servlet.http.HttpServlet.service(Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V (269 bytes) @ 0x0000000004c82c0c [0x0000000004c80e00+0x1e0c]
J 12161 C2 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;)V (388 bytes) @ 0x000000000427e388 [0x000000000427dba0+0x7e8]
J 15472 C2 org.apache.tomcat.websocket.server.WsFilter.doFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;Ljavax/servlet/FilterChain;)V (139 bytes) @ 0x0000000001f5b5f4 [0x0000000001f5b5a0+0x54]
J 12161 C2 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;)V (388 bytes) @ 0x000000000427dd34 [0x000000000427dba0+0x194]
J 15782 C2 com.m2000.shop.filters.ThreadLocalFilter.doFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;Ljavax/servlet/FilterChain;)V (26 bytes) @ 0x00000000021ee660 [0x00000000021ee480+0x1e0]
J 12161 C2 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;)V (388 bytes) @ 0x000000000427dd34 [0x000000000427dba0+0x194]
J 16058 C2 com.m2000.shop.filters.RedirectFilter.doFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;Ljavax/servlet/FilterChain;)V (290 bytes) @ 0x0000000004046ac0 [0x0000000004046480+0x640]
J 12161 C2 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;)V (388 bytes) @ 0x000000000427dd34 [0x000000000427dba0+0x194]
J 15816 C2 com.m2000.shop.filters.CacheFilter.doFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;Ljavax/servlet/FilterChain;)V (106 bytes) @ 0x0000000003ad14d4 [0x0000000003ad0f20+0x5b4]
J 12161 C2 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;)V (388 bytes) @ 0x000000000427dd34 [0x000000000427dba0+0x194]
J 18379 C2 org.apache.logging.log4j.web.Log4jServletFilter.doFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;Ljavax/servlet/FilterChain;)V (79 bytes) @ 0x00000000053330ec [0x0000000005332ea0+0x24c]
J 12161 C2 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Ljavax/servlet/ServletRequest;Ljavax/servlet/ServletResponse;)V (388 bytes) @ 0x000000000427dd34 [0x000000000427dba0+0x194]
J 15812 C2 org.apache.catalina.core.StandardWrapperValve.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (1306 bytes) @ 0x000000000441ed08 [0x000000000441ea40+0x2c8]
J 16053 C2 org.apache.catalina.authenticator.AuthenticatorBase.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (821 bytes) @ 0x0000000004076e24 [0x0000000004076320+0xb04]
J 15807 C2 org.apache.catalina.core.StandardHostValve.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (382 bytes) @ 0x0000000003dd2ba0 [0x0000000003dd2880+0x320]
J 15779 C2 org.apache.catalina.core.StandardEngineValve.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (77 bytes) @ 0x0000000003dee714 [0x0000000003dee540+0x1d4]
J 15814 C2 org.apache.catalina.valves.CrawlerSessionManagerValve.invoke(Lorg/apache/catalina/connector/Request;Lorg/apache/catalina/connector/Response;)V (594 bytes) @ 0x00000000044172d8 [0x00000000044170c0+0x218]
J 15701 C2 org.apache.catalina.connector.CoyoteAdapter.service(Lorg/apache/coyote/Request;Lorg/apache/coyote/Response;)V (892 bytes) @ 0x0000000003d179dc [0x0000000003d16f40+0xa9c]
J 14916 C2 org.apache.coyote.AbstractProcessorLight.process(Lorg/apache/tomcat/util/net/SocketWrapperBase;Lorg/apache/tomcat/util/net/SocketEvent;)Lorg/apache/tomcat/util/net/AbstractEndpoint$Handler$SocketState; (248 bytes) @ 0x0000000004141eb8 [0x0000000004141d20+0x198]
J 16392 C2 org.apache.coyote.http2.StreamRunnable.run()V (12 bytes) @ 0x0000000004d5b794 [0x0000000004d5b5e0+0x1b4]
j  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
j  org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run()V+4
J 14749 C1 java.lang.Thread.run()V (17 bytes) @ 0x00000000049d22c4 [0x00000000049d2180+0x144]
v  ~StubRoutines::call_stub
Comment 19 jan.pfeifer 2018-08-23 07:45:49 UTC
It doesn't take long to crash. I can confirm RECYCLE_FACADES did not help. Currently I am running stable config NIO2 + HTTP/2 + Java 10. Let me know if you will need any further assistance.
Comment 20 jan.pfeifer 2018-08-24 06:15:18 UTC
Calling NIO2+HTTP/2 stable was quite premature. With this config I get out of memory (5G) every few hours. Strange, peak traffic isnt causing any obvious problem.
Comment 21 Christopher Schultz 2018-08-24 19:11:57 UTC
(In reply to jan.pfeifer from comment #20)
> Calling NIO2+HTTP/2 stable was quite premature. With this config I get out
> of memory (5G) every few hours. Strange, peak traffic isnt causing any
> obvious problem.

Just to confirm:

* NIO2 + HTTP/2 + Java 10 crashes. Same setup with Java 8 crashes.
* Limiting to HTTP/1.1 restores stability

Correct?
Comment 22 jan.pfeifer 2018-08-25 08:12:20 UTC
(In reply to Christopher Schultz from comment #21)
> (In reply to jan.pfeifer from comment #20)
> > Calling NIO2+HTTP/2 stable was quite premature. With this config I get out
> > of memory (5G) every few hours. Strange, peak traffic isnt causing any
> > obvious problem.
> 
> Just to confirm:
> 
> * NIO2 + HTTP/2 + Java 10 crashes. Same setup with Java 8 crashes.

Not actual crash. But freezes with out of memory error. (5Gigs available. Under normal circumstances under 1G is enough)

> * Limiting to HTTP/1.1 restores stability

Yes.
Comment 23 jan.pfeifer 2019-01-30 13:30:50 UTC
It has been 6 months now, was there any progress with bug? Should I re-test with newer versions of Tomcat?
Comment 24 Christopher Schultz 2019-01-30 20:41:43 UTC
(In reply to jan.pfeifer from comment #23)
> It has been 6 months now, was there any progress with bug? Should I re-test
> with newer versions of Tomcat?

There is currently a vote being taken on a new version of tcnative (1.2.21) which includes a fix for *a* memory leak. I'd be interested if it is the memory leak you described several months ago.

Once the vote passes (it looks like it will pass) and the version is released, try upgrading to that version and run in your NIO2 + HTTP/2 with the updated tcnative. If things work out, we'll call it a win. If not... I guess we'll have to keep investigating.

I'm sorry, I'm out of my element when it comes to debugging windows binaries.
Comment 25 Conrad T. Pino 2019-02-15 07:50:41 UTC
Across two (2) Tomcat releases:

	apache-tomcat-8.5.27
	apache-tomcat-8.5.31

I have accumulated eight (8) tcnative-1.dll crash logs with memory dumps:

	05/18/2018  02:20 PM            38,952 hs_err_pid1964.log
	05/18/2018  02:20 PM     5,265,420,258 hs_err_pid1964.mdmp
	11/07/2018  06:01 PM            38,793 hs_err_pid2260.log
	11/07/2018  06:01 PM     4,974,077,542 hs_err_pid2260.mdmp
	12/07/2018  05:24 PM            37,242 hs_err_pid2440.log
	12/07/2018  05:24 PM     7,737,765,554 hs_err_pid2440.mdmp
	01/13/2019  06:13 PM            37,617 hs_err_pid3040.log
	01/13/2019  06:13 PM     4,264,832,306 hs_err_pid3040.mdmp
	06/29/2018  07:31 AM            38,864 hs_err_pid3676.log
	06/29/2018  07:31 AM     7,986,923,546 hs_err_pid3676.mdmp
	02/12/2019  09:56 PM            38,322 hs_err_pid4224.log
	02/12/2019  09:56 PM     4,964,538,742 hs_err_pid4224.mdmp
	02/07/2019  10:18 AM            37,193 hs_err_pid4776.log
	02/07/2019  10:18 AM     3,177,549,366 hs_err_pid4776.mdmp
	01/17/2019  07:34 PM            38,874 hs_err_pid6772.log
	01/17/2019  07:34 PM     7,579,831,326 hs_err_pid6772.mdmp
	              16 File(s) 45,951,244,497 bytes

Also available as single ZIP file:

	02/14/2019  11:14 PM     5,241,734,239 tcnative-1.zip

All are consistent: EXCEPTION_ACCESS_VIOLATION (0xc0000005)

Uploading gigabytes is pointless unless a purpose is served so I'll start by asking if uploads are wanted.
Comment 26 Mark Thomas 2019-06-19 17:31:32 UTC
There are some HTTP/2 concurrency bugs fixed in later versions that may explain the crashes you have seen.

Testing with the latest stable 9.0.x and/or 8.5.x releases would be helpful.
Comment 27 Christopher Schultz 2019-08-22 16:56:40 UTC
(In reply to Conrad T. Pino from comment #25)
> Uploading gigabytes is pointless unless a purpose is served so I'll start by
> asking if uploads are wanted.

Uploading the crash log is useful. Uploading the memory and/or code dumps is not really useful. Feel free to upload the crash logs.
Comment 28 Conrad T. Pino 2019-08-22 17:28:46 UTC
Created attachment 36730 [details]
ZIP file - contents 8 plain text log files

Directory of hs_err_logs.zip

05/18/2018  15:20            38,952 hs_err_pid1964.log
11/07/2018  19:01            38,793 hs_err_pid2260.log
12/07/2018  18:24            37,242 hs_err_pid2440.log
01/13/2019  19:13            37,617 hs_err_pid3040.log
06/29/2018  08:31            38,864 hs_err_pid3676.log
02/12/2019  22:56            38,322 hs_err_pid4224.log
02/07/2019  11:18            37,193 hs_err_pid4776.log
01/17/2019  20:34            38,874 hs_err_pid6772.log
               8 File(s)        305,857 bytes
               0 Dir(s)  34,551,578,624 bytes free
Comment 29 Conrad T. Pino 2019-08-22 17:31:14 UTC
Comment on attachment 36730 [details]
ZIP file - contents 8 plain text log files

Directory of hs_err_logs.zip

05/18/2018  15:20            38,952 hs_err_pid1964.log
11/07/2018  19:01            38,793 hs_err_pid2260.log
12/07/2018  18:24            37,242 hs_err_pid2440.log
01/13/2019  19:13            37,617 hs_err_pid3040.log
06/29/2018  08:31            38,864 hs_err_pid3676.log
02/12/2019  22:56            38,322 hs_err_pid4224.log
02/07/2019  11:18            37,193 hs_err_pid4776.log
01/17/2019  20:34            38,874 hs_err_pid6772.log
               8 File(s)        305,857 bytes
               0 Dir(s)  34,551,578,624 bytes free
Comment 30 Conrad T. Pino 2019-08-22 17:33:26 UTC
(In reply to Christopher Schultz from comment #27)
> (In reply to Conrad T. Pino from comment #25)
> > Uploading gigabytes is pointless unless a purpose is served so I'll start by
> > asking if uploads are wanted.
> 
> Uploading the crash log is useful. Uploading the memory and/or code dumps is
> not really useful. Feel free to upload the crash logs.

ZIP file with 8 plain text log files attached. Thank you.
Comment 31 Michael Osipov 2020-04-06 13:29:18 UTC
Does this still occur with the most recent versions of Tomcat, libtcnative and libapr?
Comment 32 Mark Thomas 2020-08-20 15:48:52 UTC
No response for several months so I am going to assume the issue is resolved.

Most of the errors in the attached logs relate to HTTP/2 but not all. I can't tell if what is shown is a single error that affects HTTP/2 and HTTP/1.1 are two separate issues. I did note an HTTP/2 issue was fixed in 8.5.42 related to multiple threads accessing a single stream - similar scenarios have triggered APR/native crashes in the past.

Looking at the changelog, the refactoring to use a single pollset in 8.5.50 looks like a possible change that coudl have fixed these issues.

If you still see this issue or something similar please:
- update to the latest 8.5.x or 9.0.x release
- update to the latest Tomcat Native release
- retest
- if you still see the issue, feel free to re-open this bug

What we really need are the steps to reproduce it. Anything that narrows down
the trigger is helpful but a set of steps to reproduce is ideal.