Created attachment 39799 [details]
Maven project demonstrating how to cause the problem

The Tomcat service on Windows crashes when certain Java code is executed. This may be related to Commons Daemon in conjunction with specific Java code.

The problem does not only occur when the attached code is executed, but this is the only simple way I found to reproduce it. The same code works with Tomcat 10.1.24 (which uses Commons Daemon 1.3.4, if I'm not mistaken).

Environment:
- Windows 11 (but also occurs with Windows Server 2019)
- JRE: IBM Semeru Runtime Open Edition (JDK) 17.0.7+7 (also occurs with 17.0.11+9)

Steps to reproduce:
1. Download the 10.1.25 Windows Installer from https://tomcat.apache.org/download-10.cgi and install it.
2. Uncompress and build the attached Maven project (helloworldsvg.zip).
3. Deploy the war file generated in step 2 onto the Tomcat installed in step 1.
4. Visit http://localhost:8080/helloworldsvg

No significant messages are written to the log files, and the error shown in the Event Viewer is as follows:

Faulting application name: Tomcat10125.exe, version: 1.4.0.0, time stamp: 0x664770c7
Faulting module name: ntdll.dll, version: 10.0.22621.3733, time stamp: 0x67ca8829
Exception code: 0xc0000409
Fault offset: 0x000000000006d915
Faulting process id: 0x0xCF60
Faulting application start time: 0x0x1DAD29C07116CFB
Faulting application path: C:\Program Files\Apache Software Foundation\Tomcat 10.1_Tomcat10125\bin\Tomcat10125.exe
Faulting module path: C:\WINDOWS\SYSTEM32\ntdll.dll
Report Id: 59f71851-e580-428b-84ce-d1ac220970f4
Faulting package full name:
Faulting package-relative application ID:
0xc0000409 is "stack buffer overrun" and ntdll.dll is in the Windows kernel.

So it's possible that procrun is handing bad data to the kernel, but it can't be very common that happens or we'd have reports of it all the time.

Does it fail every time? If so, we might be able to get to the bottom of it, especially if you can run a copy of procrun.exe with debug symbols and get a proper backtrace.

Did you get a full backtrace even if it's unreadable?

Are you using tcnative?

Have you run memtest86+ on your computer to make sure you don't have a hardware problem?
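For anyone reading the exception code from the event log, here is a minimal Java sketch of how an NTSTATUS value such as 0xc0000409 splits into its severity, facility, and code fields. The class name `NtStatusDecoder` and its deliberately tiny lookup table are illustrative assumptions, not part of Tomcat or Commons Daemon. Note also that, as far as I know, Windows reports the same status for its fail-fast checks, so the name alone may not prove that a stack buffer was literally overrun.

```java
import java.util.Map;

// Hypothetical helper for illustration only; not part of any Apache project.
final class NtStatusDecoder {

    // Deliberately incomplete table of a few well-known NTSTATUS values.
    private static final Map<Integer, String> KNOWN = Map.of(
            0xC0000005, "STATUS_ACCESS_VIOLATION",
            0xC00000FD, "STATUS_STACK_OVERFLOW",
            0xC0000409, "STATUS_STACK_BUFFER_OVERRUN");

    // Bits 31-30: severity (3 means error).
    static int severity(int status) {
        return status >>> 30;
    }

    // Bits 27-16: facility.
    static int facility(int status) {
        return (status >>> 16) & 0xFFF;
    }

    // Bits 15-0: status code within the facility.
    static int code(int status) {
        return status & 0xFFFF;
    }

    static String name(int status) {
        return KNOWN.getOrDefault(status, "unknown");
    }

    static String describe(int status) {
        return String.format("0x%08X severity=%d facility=0x%X code=0x%X name=%s",
                status, severity(status), facility(status), code(status), name(status));
    }

    public static void main(String[] args) {
        // Parse the value exactly as it appears in the event log entry.
        int status = (int) Long.parseLong("c0000409", 16);
        System.out.println(describe(status));
    }
}
```

Running `main` prints the decoded fields for the exception code quoted in the report; other codes can be tried by changing the parsed string.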
Does it fail every time? If so, we might be able to get to the bottom of it, especially if you can run a copy of procrun.exe with debug symbols and get a proper backtrace.

- With the webapp attached, it does reproduce every time.

- I just built procrun.exe from source with debug information, replaced Tomcat10.exe by the binary just built, and surprisingly it does not crash (both 1.4.0 tag and master), so I can't tell. I didn't find the build configuration you use, so I couldn't recreate the exact replica, though.

Did you get a full backtrace even if it's unreadable?

- How can I get it?

Are you using tcnative?

- Not that I know. This is a default installation using the installer just downloaded from Tomcat's website, only changing the name of the service.

Have you run memtest86+ on your computer to make sure you don't have a hardware problem?

- The problem is reproducible in different machines, including a VM, so I don't think it's related to a hardware malfunction.
(In reply to qooalt from comment #2)
> Does it fail every time? If so, we might be able to get to the bottom of it,
> especially if you can run a copy of procrun.exe with debug symbols and get a
> proper backtrace.
>
> - With the webapp attached, it does reproduce every time.
>
> - I just built procrun.exe from source with debug information, replaced
> Tomcat10.exe by the binary just built, and surprisingly it does not crash
> (both 1.4.0 tag and master), so I can't tell. I didn't find the build
> configuration you use, so I couldn't recreate the exact replica, though.
>
>
> Did you get a full backtrace even if it's unreadable?
>
> - How can I get it?
>
> Are you using tcnative?
>
> - Not that I know. This is a default installation using the installer just
> downloaded from Tomcat's website, only changing the name of the service.
>
> Have you run memtest86+ on your computer to make sure you don't have a
> hardware problem?
>
> - The problem is reproducible in different machines, including a VM, so I
> don't think it's related to a hardware malfunction.

Just a correction to the first point: I meant prunsrv.exe.
(In reply to Christopher Schultz from comment #1)
> 0xc0000409 is "stack buffer overrun" and ntdll.dll is in the Windows kernel.
>
> So it's possible that procrun is handing bad data to the kernel, but it
> can't be very common that happens or we'd have reports of it all the time.
>
> Does it fail every time? If so, we might be able to get to the bottom of it,
> especially if you can run a copy of procrun.exe with debug symbols and get a
> proper backtrace.
>
> Did you get a full backtrace even if it's unreadable?
>
> Are you using tcnative?
>
> Have you run memtest86+ on your computer to make sure you don't have a
> hardware problem?

I just realized I didn't reply to your comment, apologies. Please, see my next comment.
The information you provided does seem to point to a bug in commons-daemon. We only package it, we don't develop it. We may need to re-assign this bug report to commons-daemon.

In the meantime... have you tried to build 1.3.8 from source and use that?

The original crash reports the version of Tomcat10125.exe is 1.4.0.0 so I suspect it's ... not 1.3.8. Can you confirm?

If you build WITHOUT debug symbols... does it crash?
(In reply to Christopher Schultz from comment #5)
> The information you provided does seem to point to a bug in commons-daemon.
> We only package it, we don't develop it. We may need to re-assign this bug
> report to commons-daemon.

Yes, that seems clear now. When I reported the bug I hadn't yet looked at how you packaged it.

> In the meantime... have you tried to build 1.3.8 from source and use that?

No, I built 1.4.0.

> The original crash reports the version of Tomcat10125.exe is 1.4.0.0 so I
> suspect it's ... not 1.3.8. Can you confirm?

It's not 1.3.8.

> If you build WITHOUT debug symbols... does it crash?

Yes, just checked this morning.

I'd bet this issue is related to this commit:
https://github.com/apache/commons-daemon/commit/fed36896cbac1c7b524a047f4e973228d2d41ab7
BTW, I have requested access to the ASF Commons Daemon JIRA project but it hasn't been granted yet, so I still can't report issues there.
So that commit seems to create the problem, but only with non-debug builds? That's quite good information to report. Thanks for your responsiveness and especially your ability to actually build and try something on Windows. :)
(In reply to Christopher Schultz from comment #8)
> So that commit seems to create the problem, but only with non-debug builds?

Yes. If you look at that commit, they only apply the Control Flow Guard flag to release builds.

> That's quite good information to report. Thanks for your responsiveness and
> especially your ability to actually build and try something on Windows. :)

Unfortunately, I still haven't been granted access to Commons-Daemon JIRA project, so I can't report it there.
Your Jira account was approved ~3 hours after you requested it.
You are right, I missed it.
Thanks for creating the DAEMON issue and for all the debugging you have done. I'll leave this issue open to track updating Tomcat to a fixed version of DAEMON.
(In reply to Mark Thomas from comment #12)
> Thanks for creating the DAEMON issue and for all the debugging you have done.
>
> I'll leave this issue open to track updating Tomcat to a fixed version of
> DAEMON.

Thank you
Testing shows that this appears to be an issue specific to the IBM JRE.
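Since the behaviour appears to depend on the runtime, anyone trying to reproduce this may want to confirm which JVM the service actually runs on. The following is a minimal sketch using the standard `java.vm.*` and `java.runtime.version` system properties. The class `RuntimeInfo` and the `looksLikeOpenJ9` substring heuristic are assumptions for illustration; IBM Semeru is built on OpenJ9, but the exact property values should be checked on the installation in question.

```java
// Hypothetical helper for illustration only; not part of Tomcat.
final class RuntimeInfo {

    // Assumed heuristic: OpenJ9-based runtimes (such as IBM Semeru) are
    // expected to mention "OpenJ9" in the java.vm.name property.
    static boolean looksLikeOpenJ9(String vmName) {
        return vmName != null && vmName.contains("OpenJ9");
    }

    public static void main(String[] args) {
        String vmName = System.getProperty("java.vm.name");
        System.out.println("java.vm.vendor       = " + System.getProperty("java.vm.vendor"));
        System.out.println("java.vm.name         = " + vmName);
        System.out.println("java.runtime.version = " + System.getProperty("java.runtime.version"));
        System.out.println("looks like OpenJ9    = " + looksLikeOpenJ9(vmName));
    }
}
```

To check the service itself rather than a command-line JVM, the same property reads could be placed in a small JSP or servlet deployed to the affected Tomcat instance.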