Bug 69180 - Windows Tomcat service crashes in certain cases
Summary: Windows Tomcat service crashes in certain cases
Status: RESOLVED INVALID
Alias: None
Product: Tomcat 10
Classification: Unclassified
Component: Packaging
Version: 10.1.25
Hardware: PC All
Importance: P2 major
Target Milestone: ------
Assignee: Tomcat Developers Mailing List
Reported: 2024-07-10 09:51 UTC by qooalt
Modified: 2024-07-23 13:39 UTC

Attachments
Maven project demonstrating how to cause the problem (7.70 KB, application/x-zip-compressed)
2024-07-10 09:51 UTC, qooalt

Description qooalt 2024-07-10 09:51:52 UTC
Created attachment 39799
Maven project demonstrating how to cause the problem

The Tomcat service on Windows crashes when certain Java code is executed. This may be related to Commons Daemon in conjunction with specific Java code. The attached code is not the only trigger, but it is the simplest way I have found to reproduce the problem.

The same code works with Tomcat 10.1.24 (which uses Commons Daemon 1.3.4, if I'm not wrong).

Environment:

- Windows 11 (but also occurs with Windows Server 2019)
- JRE: IBM Semeru Runtime Open Edition (JDK) 17.0.7+7 (also occurs with 17.0.11+9)

Steps to reproduce:

1. Download the 10.1.25 Windows Installer from https://tomcat.apache.org/download-10.cgi and install it.
2. Uncompress and build the attached Maven project (helloworldsvg.zip).
3. Deploy the WAR file generated in step 2 to the Tomcat instance installed in step 1.
4. Visit http://localhost:8080/helloworldsvg

No significant messages are written to the log files, and the error shown in the Event Viewer is as follows:

Faulting application name: Tomcat10125.exe, version: 1.4.0.0, time stamp: 0x664770c7
Faulting module name: ntdll.dll, version: 10.0.22621.3733, time stamp: 0x67ca8829
Exception code: 0xc0000409
Fault offset: 0x000000000006d915
Faulting process id: 0x0xCF60
Faulting application start time: 0x0x1DAD29C07116CFB
Faulting application path: C:\Program Files\Apache Software Foundation\Tomcat 10.1_Tomcat10125\bin\Tomcat10125.exe
Faulting module path: C:\WINDOWS\SYSTEM32\ntdll.dll
Report Id: 59f71851-e580-428b-84ce-d1ac220970f4
Faulting package full name: 
Faulting package-relative application ID:
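
For reference, the exception code in the event above can be looked up against the NTSTATUS names. The following is a minimal sketch, assuming a hand-written and deliberately incomplete lookup table; `NTSTATUS_NAMES` and `exception_name` are hypothetical names used only for illustration.

```python
import re

# Small, deliberately incomplete table of well-known NTSTATUS values.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def exception_name(event_text):
    """Extract the 'Exception code' field from Event Viewer text and name it."""
    match = re.search(r"Exception code:\s*0x([0-9a-fA-F]+)", event_text)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return NTSTATUS_NAMES.get(code, "unknown (0x%08X)" % code)


print(exception_name("Exception code: 0xc0000409"))  # STATUS_STACK_BUFFER_OVERRUN
```

Windows also raises 0xC0000409 for fast-fail terminations, so the name alone does not prove that a stack buffer was actually overrun.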
Comment 1 Christopher Schultz 2024-07-10 16:36:21 UTC
0xc0000409 is "stack buffer overrun" and ntdll.dll is in the Windows kernel.

So it's possible that procrun is handing bad data to the kernel, but that can't happen very often or we'd have reports of it all the time.

Does it fail every time? If so, we might be able to get to the bottom of it, especially if you can run a copy of procrun.exe with debug symbols and get a proper backtrace.

Did you get a full backtrace even if it's unreadable?

Are you using tcnative?

Have you run memtest86+ on your computer to make sure you don't have a hardware problem?
Comment 2 qooalt 2024-07-10 17:52:15 UTC
> Does it fail every time? If so, we might be able to get to the bottom of it,
> especially if you can run a copy of procrun.exe with debug symbols and get a
> proper backtrace.

- With the webapp attached, it does reproduce every time.

- I just built procrun.exe from source with debug information, replaced Tomcat10.exe with the binary just built, and surprisingly it does not crash (both the 1.4.0 tag and master), so I can't tell. I didn't find the build configuration you use, though, so I couldn't create an exact replica.

> Did you get a full backtrace even if it's unreadable?

- How can I get it?

> Are you using tcnative?

- Not that I know of. This is a default installation using the installer just downloaded from Tomcat's website, only changing the name of the service.

> Have you run memtest86+ on your computer to make sure you don't have a
> hardware problem?

- The problem is reproducible on different machines, including a VM, so I don't think it's related to a hardware malfunction.
Comment 3 qooalt 2024-07-11 07:02:51 UTC
(In reply to qooalt from comment #2)
> Does it fail every time? If so, we might be able to get to the bottom of it,
> especially if you can run a copy of procrun.exe with debug symbols and get a
> proper backtrace.
> 
> - With the webapp attached, it does reproduce every time.
> 
> - I just built procrun.exe from source with debug information, replaced
> Tomcat10.exe by the binary just built, and surprisingly it does not crash
> (both 1.4.0 tag and master),  so I can't tell. I didn't find the build
> configuration you use, so I couldn't recreate the exact replica, though.
> 
> 
> Did you get a full backtrace even if it's unreadable?
> 
> - How can I get it?
> 
> Are you using tcnative?
> 
> - Not that I know. This is a default installation using the installer just
> downloaded from Tomcat's website, only changing the name of the service.
> 
> Have you run memtest86+ on your computer to make sure you don't have a
> hardware problem?
> 
> - The problem is reproducible in different machines, including a VM, so I
> don't think it's related to a hardware malfunction.

Just a correction to the first point: I meant prunsrv.exe, not procrun.exe.
Comment 4 qooalt 2024-07-11 07:08:10 UTC
(In reply to Christopher Schultz from comment #1)
> 0xc0000409 is "stack buffer overrun" and ntdll.dll is in the Windows kernel.
> 
> So it's possible that procrun is handing bad data to the kernel, but it
> can't be very common that happens or we'd have reports of it all the time.
> 
> Does it fail every time? If so, we might be able to get to the bottom of it,
> especially if you can run a copy of procrun.exe with debug symbols and get a
> proper backtrace.
> 
> Did you get a full backtrace even if it's unreadable?
> 
> Are you using tcnative?
> 
> Have you run memtest86+ on your computer to make sure you don't have a
> hardware problem?

I just realized I didn't reply to your comment directly, apologies. Please see comment #2.
Comment 5 Christopher Schultz 2024-07-11 15:51:39 UTC
The information you provided does seem to point to a bug in commons-daemon. We only package it, we don't develop it. We may need to re-assign this bug report to commons-daemon.

In the meantime... have you tried to build 1.3.8 from source and use that? The original crash reports the version of Tomcat10125.exe is 1.4.0.0 so I suspect it's ... not 1.3.8. Can you confirm?

If you build WITHOUT debug symbols... does it crash?
Comment 6 qooalt 2024-07-11 16:26:20 UTC
(In reply to Christopher Schultz from comment #5)
> The information you provided does seem to point to a bug in commons-daemon.
> We only package it, we don't develop it. We may need to re-assign this bug
> report to commons-daemon.

Yes, that seems clear now. When I reported the bug I hadn't yet looked at how you packaged it.

> 
> In the meantime... have you tried to build 1.3.8 from source and use that?

No, I built 1.4.0

> The original crash reports the version of Tomcat10125.exe is 1.4.0.0 so I
> suspect it's ... not 1.3.8. Can you confirm?

It's not 1.3.8

> If you build WITHOUT debug symbols... does it crash?

Yes, just checked this morning. I'd bet this issue is related to this commit https://github.com/apache/commons-daemon/commit/fed36896cbac1c7b524a047f4e973228d2d41ab7
Comment 7 qooalt 2024-07-11 16:28:09 UTC
BTW, I have requested access to the ASF Commons Daemon JIRA project and am waiting for approval, so I can't report issues there yet.
Comment 8 Christopher Schultz 2024-07-12 14:43:27 UTC
So that commit seems to create the problem, but only with non-debug builds?

That's quite good information to report. Thanks for your responsiveness and especially your ability to actually build and try something on Windows. :)
Comment 9 qooalt 2024-07-12 18:11:09 UTC
(In reply to Christopher Schultz from comment #8)
> So that commit seems to create the problem, but only with non-debug builds?

Yes. If you look at that commit, the Control Flow Guard flag is only applied to release builds, which would explain why my debug build does not crash.
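
One way to check whether a given prunsrv build was linked with Control Flow Guard is to read the DllCharacteristics field of its PE header and test the IMAGE_DLLCHARACTERISTICS_GUARD_CF bit (0x4000). The following is a minimal sketch using only the Python standard library; `has_control_flow_guard` and `check_file` are hypothetical helper names, and the check only shows that the image is marked as CFG-aware, not that CFG is what triggers the crash.

```python
import struct

# Bit in the PE optional header's DllCharacteristics field.
IMAGE_DLLCHARACTERISTICS_GUARD_CF = 0x4000


def has_control_flow_guard(image):
    """Return True if the PE image bytes have the GUARD_CF flag set."""
    if image[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # Skip the 4-byte signature and the 20-byte COFF file header.
    optional_header = pe_offset + 4 + 20
    # DllCharacteristics sits at offset 70 in both PE32 and PE32+ headers.
    (flags,) = struct.unpack_from("<H", image, optional_header + 70)
    return bool(flags & IMAGE_DLLCHARACTERISTICS_GUARD_CF)


def check_file(path):
    """Read an executable from disk and report whether it is CFG-aware."""
    with open(path, "rb") as handle:
        return has_control_flow_guard(handle.read())
```

Calling `check_file` on the installed service executable (for example the Tomcat10125.exe path from the event log) and on a locally built debug binary would show whether the two differ in this flag.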


> That's quite good information to report. Thanks for your responsiveness and
> especially your ability to actually build and try something on Windows. :)

Unfortunately, I still haven't been granted access to the Commons Daemon JIRA project, so I can't report it there.
Comment 10 Mark Thomas 2024-07-15 17:17:49 UTC
Your Jira account was approved ~3 hours after you requested it.
Comment 11 qooalt 2024-07-16 06:42:49 UTC
You are right, I missed it.
Comment 12 Mark Thomas 2024-07-16 07:53:47 UTC
Thanks for creating the DAEMON issue and for all the debugging you have done.

I'll leave this issue open to track updating Tomcat to a fixed version of DAEMON.
Comment 13 qooalt 2024-07-16 14:09:42 UTC
(In reply to Mark Thomas from comment #12)
> Thanks for creating the DAEMON issue and for all the debugging you have done.
> 
> I'll leave this issue open to track updating Tomcat to a fixed version of
> DAEMON.

Thank you
Comment 14 Mark Thomas 2024-07-16 16:46:14 UTC
Testing shows that this appears to be an issue specific to the IBM JRE.