Tomcat 9.0.14 shutdown.sh produces stack trace, non 0 exit code. startup.sh not releasing port 8005? Tomcat 9.0.13 and all previous versions start and stop normally.

Dec 28, 2018 9:57:32 PM org.apache.catalina.startup.Catalina stopServer
SEVERE: Error stopping Catalina
java.net.ConnectException: Connection refused (Connection refused)
    at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399)
    at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242)
    at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:224)
    at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403)
    at java.base/java.net.Socket.connect(Socket.java:591)
    at java.base/java.net.Socket.connect(Socket.java:540)
    at java.base/java.net.Socket.<init>(Socket.java:436)
    at java.base/java.net.Socket.<init>(Socket.java:213)
    at org.apache.catalina.startup.Catalina.stopServer(Catalina.java:513)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.apache.catalina.startup.Bootstrap.stopServer(Bootstrap.java:403)
    at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:497)

Environment:
Server version name: Apache Tomcat/9.0.14
OS Name: Linux
OS Version: 4.15.0-43-generic
Architecture: amd64
JVM Version: 11.0.1+13
Tomcat Native library [1.2.19] using APR version [1.6.5]

How to reproduce:
Download and install Tomcat 9.0.14. Start it using startup.sh. Stop it using shutdown.sh.

Stack trace observed on Linux and MacOS. There are no user webapps involved.
The shutdown process is unable to connect to a running Tomcat server over the expected port. This is likely to be a configuration issue. Bugzilla is not a support forum. Please post to the Tomcat users' mailing list, instead.
Tomcat 9.0.14 runs as expected using a transplanted 9.0.13 catalina.sh. It starts, stops and restarts using systemd (systemctl start|stop|restart tomcat).

When the server is running netstat shows:

tcp6  0  0  :::8080         :::*  LISTEN  1038/java
tcp6  0  0  127.0.0.1:8005  :::*  LISTEN  1038/java
tcp6  0  0  :::8009         :::*  LISTEN  1038/java
unix  2  [ ]  STREAM  CONNECTED  21567  1038/java

OTOH, using the 9.0.14 version of catalina.sh, and after a fresh system reboot, Tomcat starts normally (similar netstat results) until the first stop. Any subsequent start results in a start/stop loop where the server is running only momentarily during each 10s restart loop.

Configuration is default. There are no user webapps nor changes to the default conf/server.xml file.
(In reply to Steve Demy from comment #2) > Tomcat 9.0.14 runs as expected using a transplanted 9.0.13 catalina.sh. It > starts, stops and restarts using systemd (systemctl start|stop|restart > tomcat). So if you take catalina.sh from 9.0.13 it works fine but with catalina.sh 9.0.14 you experience this issue? All other factors are the same?
I will look into this issue if no one else has already started
On the users' mailing list, I asked Steve to try the catalina.sh from 9.0.13 on his 9.0.14 install, but there has been no reply for a few days.
(In reply to Christopher Schultz from comment #5) > On the users' mailing list, I asked Steve to try the catalina.sh from 9.0.13 > on his 9.0.14 install, but there has been no reply for a few days. He wrote this in comment #2, possibly replying in the "wrong" thread: > Tomcat 9.0.14 runs as expected using a transplanted 9.0.13 catalina.sh At first glance it looks like a regression from BZ-53930, which made quite a few changes to catalina.sh, but I will dig deeper and update with my findings.
Possibly related to BZ-63063
Yes, BZ-63063 is exactly the same issue, opened for the Tomcat 8.5 branch.

In Tomcat 9.0, catalina.sh contains the same faulty code at lines 479 and 489:

2\>\&1 \& echo \$! \>\"$catalina_pid_file\" \; \} $catalina_out_command "&"

It needs to be delimited:

2\>\&1 \; echo \$! \>\"$catalina_pid_file\" \; \} $catalina_out_command "&"

or joined with a logical AND:

2\>\&1 \&\& echo \$! \>\"$catalina_pid_file\" \; \} $catalina_out_command "&"
I can not reproduce the issue on Ubuntu 18.04 with Java 1.8.0_192 Tomcat shuts down immediately when calling shutdown.sh
The issue isn't obvious but is observable on Ubuntu. I agree that "\&" -> to "\&\&" is the correct fix. I'll get that implemented and back-ported.
*** Bug 63063 has been marked as a duplicate of this bug. ***
Thanks for the report and to Patrik S. for the analysis.

This has been fixed in:
- trunk for 9.0.15 onwards
- 8.5.x for 8.5.38 onwards
- 7.0.x for 7.0.93 onwards (note the bug was never present in a released version of 7.0.x)
(In reply to Mark Thomas from comment #10) > The issue isn't obvious but is observable on Ubuntu. > > I agree that "\&" -> to "\&\&" is the correct fix. I'll get that implemented > and back-ported. This fix seems wrong. $! does not work after &&, only after &. Thus, the correct PID cannot be obtained. Unfortunately, && and & mean something completely different. You can see this if you call "true & echo $!" versus "true && echo $!" several times. The first one will always give different PIDs while the second one will always give the same PID (may also be empty if nothing else was run in the background before). If I understand you correctly, you want to have the PID of nohup and you want to write no PID file if nohup fails? I think this is non-trivial, because in one case, the process is still running while in the other case it isn't. All workarounds that came to my mind are very ugly, so I cannot provide a patch. For me, the old catalina.sh works smoother than the new one.
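The difference between "&" and "&&" described in the comment above can be checked with a small POSIX shell sketch. The variable names are illustrative only and do not come from catalina.sh.

```shell
#!/bin/sh
# Start one known background job so that $! has a defined value.
sleep 1 &
first_bg=$!

# '&' starts a new background job, so $! changes to that job's PID.
true & after_amp=$!

# '&&' runs 'true' in the foreground. No new background job is started,
# so $! still holds the PID recorded on the previous line.
true && after_and=$!

echo "first background job: $first_bg"
echo "after 'true &':       $after_amp"
echo "after 'true &&':      $after_and"
wait
```

The last two values should be identical, which is why "&& echo $!" cannot record the PID of the command on its left.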
This fix seems to break the shell script /etc/init.d/tomcat I use to manage Tomcat as a service with systemd (generated from init.d): https://git.io/fhQv4

Output follows:

$ sudo service tomcat start
Job for tomcat.service failed because the control process exited with error code.
See "systemctl status tomcat.service" and "journalctl -xe" for details.

$ systemctl status tomcat.service
● tomcat.service - LSB: Start Tomcat.
   Loaded: loaded (/etc/init.d/tomcat; generated; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sun 2019-02-10 21:11:07 CET; 7s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 4938 ExecStop=/etc/init.d/tomcat stop (code=exited, status=0/SUCCESS)
  Process: 5066 ExecStart=/etc/init.d/tomcat start (code=exited, status=1/FAILURE)
   CGroup: /system.slice/tomcat.service
           ├─5094 /bin/sh /opt/tomcat/bin/catalina.sh start
           └─5095 /usr/lib/jvm/java-8-oracle/bin/java -Djava.util.logging.config.file=/opt/tomcat/conf/logging.properties

Feb 10 21:11:02 raspberrypi systemd[1]: Starting LSB: Start Tomcat....
Feb 10 21:11:07 raspberrypi tomcat[5066]: Starting Tomcat servlet engine: tomcat failed!
Feb 10 21:11:07 raspberrypi systemd[1]: tomcat.service: Control process exited, code=exited status=1
Feb 10 21:11:07 raspberrypi systemd[1]: Failed to start LSB: Start Tomcat..
Feb 10 21:11:07 raspberrypi systemd[1]: tomcat.service: Unit entered failed state.
Feb 10 21:11:07 raspberrypi systemd[1]: tomcat.service: Failed with result 'exit-code'.

$ journalctl -xe
Feb 10 21:11:02 raspberrypi systemd[1]: Starting LSB: Start Tomcat....
-- Subject: Unit tomcat.service has begun start-up
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit tomcat.service has begun starting up.
Feb 10 21:11:07 raspberrypi tomcat[5066]: Starting Tomcat servlet engine: tomcat failed!
Feb 10 21:11:07 raspberrypi systemd[1]: tomcat.service: Control process exited, code=exited status=1
Feb 10 21:11:07 raspberrypi systemd[1]: Failed to start LSB: Start Tomcat..
-- Subject: Unit tomcat.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit tomcat.service has failed.
--
-- The result is failed.
Feb 10 21:11:07 raspberrypi systemd[1]: tomcat.service: Unit entered failed state.
Feb 10 21:11:07 raspberrypi systemd[1]: tomcat.service: Failed with result 'exit-code'.

That /etc/init.d/tomcat script is a slightly modified version of the one accompanying the Tomcat 8 package distributed with Debian APT.

It's worth noting that the server eventually starts despite the error, but I'm not able to manage it as a service afterwards.

Now I'm using the old catalina.sh from v9.0.14 that works fine, but I'm considering whether it is better to modify my script in order to support the new catalina.sh or not.

Thanks for your great work.
(In reply to norad from comment #14) This is the same problem I observed. > Now I'm using the old catalina.sh from v9.0.14 that works fine, but I'm > considering whether it is better to modify my script in order to support the > new catalina.sh or not. As far as I can see, the fix is just wrong as no PID file is written. I think you should not modify anything. This should be fixed in catalina.sh. In my Ubuntu PPA, I now use the old catalina.sh and it is working.
Sigh. Sorry for the hassle. Sticking with catalina.sh from 9.0.14 is the way to go if you are seeing issues. We should be able to get this fixed for the next release. I don't know yet what that fix will look like.
Another report of this issue on the users@ list. https://markmail.org/message/x4rzdpsjzoi2pbdd Quoting: --- I'm observing this on RHEL 6 and 7. RHEL6$ /bin/sh --version GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu) RHEL7$ /bin/sh --version GNU bash, version 4.2.46(2)-release (x86_64-redhat-linux-gnu) [...] The 8.5.37 version is storing a PID value, but it is not the correct PID. The 8.5.38 version does not even create the file. I can also see this same behavior on Ubuntu 18.04.1 LTS. ---
For a reference: POSIX-2018 documentation on Shell Command Language: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html The $! parameter is defined in section "2.5.2 Special Parameters" of that document. "$!" returns a PID of a background command, but '&&' separates foreground commands, and it will wait until the first command completes. Thus no PID file is created. I also wonder how nohup command fits into all this (triggered via $USE_NOHUP and represented by $_NOHUP here).
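The effect on the PID file can be reproduced without Tomcat. In the sketch below, "sleep 2" is a hypothetical stand-in for the long-running Java process, and the two file names are invented for the illustration. It assumes mktemp is available, which is common but not required by POSIX.

```shell
#!/bin/sh
workdir=$(mktemp -d)
pid_and="$workdir/with_and.pid"   # hypothetical PID file for the '&&' form
pid_amp="$workdir/with_amp.pid"   # hypothetical PID file for the '&' form

# '&&' form: echo only runs after 'sleep 2' has finished successfully.
{ sleep 2 && echo $! > "$pid_and" ; } &

# '&' form: 'sleep 2' goes to the background inside the group and echo runs at once.
{ sleep 2 & echo $! > "$pid_amp" ; } &

sleep 1   # the stand-in "server" is still running at this point
if [ -e "$pid_and" ]; then and_state=present; else and_state=missing; fi
if [ -e "$pid_amp" ]; then amp_state=present; else amp_state=missing; fi

echo "'&&' PID file while the server runs: $and_state"
echo "'&'  PID file while the server runs: $amp_state"
wait
rm -rf "$workdir"
```

With "&&" the file only appears once the foreground command has exited, so a running server never has a PID file.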
An idea: use a named pipe (a FIFO special file) to solve the original bug 53930. 1. mkfifo command is documented in POSIX. It is a standard feature. http://pubs.opengroup.org/onlinepubs/9699919799/utilities/mkfifo.html 2. No need to change syntax. Redirection to a FIFO file is the same as to a regular CATALINA_OUT file. You just need to create it beforehand. Articles: https://www.linuxjournal.com/article/2156 https://en.wikipedia.org/wiki/Named_pipe
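A minimal sketch of the named pipe idea, under stated assumptions: "cat" stands in for whatever command should consume the log output, "echo" stands in for Tomcat writing to CATALINA_OUT, and all paths are hypothetical. It is not taken from catalina.sh.

```shell
#!/bin/sh
workdir=$(mktemp -d)
fifo="$workdir/catalina.out"   # hypothetical path standing in for CATALINA_OUT
sink="$workdir/rotated.log"    # where the reader command puts the data

# Create the FIFO before anything writes to it.
mkfifo "$fifo"

# Reader: blocks until a writer opens the FIFO, then copies what it receives.
cat "$fifo" > "$sink" &
reader_pid=$!

# Writer: the redirection syntax is the same as for a regular file.
echo "server output line" >> "$fifo"

# The reader sees end-of-file once the last writer has closed the FIFO.
wait "$reader_pid"
received=$(cat "$sink")
echo "reader received: $received"
rm -rf "$workdir"
```

Note that the reader is started first here. A writer that opens the FIFO with no reader present blocks until one appears.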
(In reply to Konstantin Kolinko from comment #19)
> An idea: use a named pipe (a FIFO special file) to solve the original bug
> 53930.

If I am understanding this correctly that would mean:

- reverting r1848046 and r1850829 for 9.0.x
- reverting r1848048 and r1850830 for 8.5.x
- reverting r1848049 and r1850831 for 7.0.x

Re-opening bug 53930 and resolving it as WONTFIX - suggesting that a named pipe is used.

This is consistent with the work-around of reverting to an earlier version of catalina.sh.

Assuming my understanding is correct, do we view this as sufficient to warrant a new 9.0.x and 8.5.x release?
> If I am understanding this correctly that would mean: > > - reverting r1848046 and r1850829 for 9.0.x > - reverting r1848048 and r1850830 for 8.5.x > - reverting r1848049 and r1850831 for 7.0.x That should at least fix the actual trouble. > Re-opening bug 53930 and resolving it as WONTFIX - suggesting that a named > pipe is used. I'm not an expert on named pipes. The documentation of mkfifo says: "However, it has to be open at both ends simultaneously before you can proceed to do any input or output operations on it. Opening a FIFO for reading normally blocks until some other process opens the same FIFO for writing, and vice versa." I don't know whether this is ok or not. I don't need the new feature from bug 53930 at all. > Assuming my understanding is correct, do we view this as sufficient to > warrant a new 9.0.x and 8.5.x release? Currently, I don't need a new version (I just use the old catalina.sh). But it might still be a good idea, because other people might run into this issue. I don't know how much work a new release really is. Best regards, Thomas
Do we actually understand, what would be wrong with the line

2\>\&1 \& echo \$! \>\"$catalina_pid_file\" \; \} $catalina_out_command "&"

(single escaped ampersand)?

As far as I understand from this ticket, the issue was about not being able to shut down. Does this happen only in combination with systemd? I cannot reproduce it locally with a normal shell start/stop.

Whoever tests this: did you set CATALINA_PID to some reasonable value, e.g. to logs/catalina.pid?
(In reply to Rainer Jung from comment #22) > Do we actually understand, what would be wrong with the line > > 2\>\&1 \& echo \$! \>\"$catalina_pid_file\" \; \} $catalina_out_command "&" > > (single escaped ampersand)? I don't really know what the problem is, but it might be that this writes a PID even if nohup fails. But this is just a guess. And this problem should also appear with the old version before bug 53930. For me, this line does not really look wrong. Still, I don't like the process timing in this line. I'm asking myself whether we could omit the "&" at the end of the line. The first \& should already put the process into the background, shouldn't it? Best regards, Thomas
(In reply to Rainer Jung from comment #22) > Do we actually understand, what would be wrong with the line > > 2\>\&1 \& echo \$! \>\"$catalina_pid_file\" \; \} $catalina_out_command "&" > > (single escaped ampersand)? If I add set -x to catalina.sh and then execute the full command line that results I see: bash: syntax error near unexpected token `}' My guess is that systemd sees this error and assumes Tomcat failed to start properly. I've now got a test environment set up for this with systemd and 9.0.x HEAD. I'm going to work on a fix but if anyone with better bash foo than me wants to make a suggestion, I'm happy to try it.
(In reply to Mark Thomas from comment #24) > bash: syntax error near unexpected token `}' I think this is just incorrect escaping of the command and could be a red herring.
Got it.

With the single & the pid of the process that is put in the background is not the pid of the Tomcat process. If I specify $CATALINA_PID and then configure systemd to use the same file with PIDFile everything starts working.

That final "&" prevents the eval command being written to the console when using catalina.sh start

Assuming the above is correct I think we have two options:

1. Revert the "&" -> "&&" change. systemd users (and possibly others) will be required to use a PID file and to tell systemd where to find that file.

2. Revert both changes. systemd users will have to take no action. Users wanting to redirect to a command will have to use a named pipe.

I'm leaning towards 2 on the grounds this has the least impact on the smallest number of users. This assumes that the number of systemd users is greater than the number of users wanting to redirect to a command.
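One way to see how the PID mismatch can arise is the sketch below. It only mimics the shape of the command (a group with an inner background job, piped to another command, the whole thing backgrounded). "sleep 1" is a hypothetical stand-in for the Java process and "cat" for the command that receives the output.

```shell
#!/bin/sh
workdir=$(mktemp -d)
inner_file="$workdir/inner.pid"   # plays the role of the PID file

# The inner '&' backgrounds the stand-in server and records its PID.
# The outer '&' backgrounds the whole pipeline.
{ sleep 1 & echo $! > "$inner_file" ; } | cat > /dev/null &
outer_pid=$!   # PID the calling shell sees for the background job

# The file is written before the group exits, so it exists once wait returns.
wait "$outer_pid"
inner_pid=$(cat "$inner_file")

echo "PID seen by the caller:   $outer_pid"
echo "PID of the inner process: $inner_pid"
rm -rf "$workdir"
```

The two PIDs belong to different processes, so a supervisor that tracks the backgrounded job without reading the PID file is not tracking the inner process.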
Excellent analysis!

I *think* the "&" at the end of the line is only useful, if a pipe command is actually being used. So one could put it into the construction of the string variable to which we pipe, just as we prefix the command with "|" we could suffix it with "&".

Nevertheless systemd will probably have trouble identifying the backgrounded Java process. When using systemd in our own distribution, we include

Type=forking
...
PIDFile=/path/to/my/logs/catalina.pid
ExecStart=/path/to/catalina_home/bin/catalina.sh start
ExecStop=/path/to/catalina_home/bin/catalina.sh stop

in our service description file (or whatever CATALINA_PID is). So it might make sense to better support systemd by providing a template for this. Using the PIDFile, systemd should be able to correctly detect whether Tomcat is running or not (once we are writing the correct pid to it).

As an example, the full file is the following, where all XXX params are patched before copying the file to systemd during running a custom "service-install" script target.

[Unit]
RequiresMountsFor=XXXKPDT_CATALINA_HOMEXXX XXXKPDT_CATALINA_BASEXXX
SourcePath=XXXKPDT_CATALINA_BASEXXX/bin/tomcat-SERVICE.systemd
After=network.target nss-lookup.target

[Service]
Type=forking
Restart=on-failure
# Disable OOM killer for this service
OOMScoreAdjust=-1000
Environment=CATALINA_HOME=XXXKPDT_CATALINA_HOMEXXX
Environment=CATALINA_BASE=XXXKPDT_CATALINA_BASEXXX
Environment=TC_USER=XXXKPDT_TC_USERXXX
Environment=JAVA_HOME=XXXKPDT_JAVA_HOMEXXX
User=XXXKPDT_TC_USERXXX
PIDFile=XXXKPDT_CATALINA_PIDXXX
ExecStart=XXXKPDT_CATALINA_HOMEXXX/bin/catalina.sh start
ExecStop=XXXKPDT_CATALINA_HOMEXXX/bin/catalina.sh stop

[Install]
WantedBy=multi-user.target
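The PIDFile= entry only helps if catalina.sh writes its PID to the same path, which requires CATALINA_PID to be set. One place to do that is bin/setenv.sh, which catalina.sh sources if it exists. The fragment below is a sketch: the logs/catalina.pid location follows the suggestion made earlier in this thread, and the fallback to the current directory is an assumption added so the fragment also runs on its own.

```shell
# bin/setenv.sh (sketch)
# Assumption: CATALINA_BASE is normally set by catalina.sh before this file
# is sourced. The fallback below is only for running the fragment stand-alone.
CATALINA_BASE="${CATALINA_BASE:-$PWD}"

# Must match the PIDFile= path in the systemd unit.
CATALINA_PID="$CATALINA_BASE/logs/catalina.pid"
export CATALINA_PID
```

An Environment=CATALINA_PID=... line in the unit file would be an alternative way to set the same variable.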
(In reply to Rainer Jung from comment #27) > Excellent analysis! > > I *think* the "&" at the end of the line is only useful, if a pipe command > is actually being used. So one could put it into the construction of the > string variable to which we pipe, just as we prefix the command with "|" we > could suffix it with "&". > Interesting. A good idea. (In reply to Mark Thomas from comment #26) > > That final "&" prevents the eval command being written to the console when > using catalina.sh start So it could be >/dev/null ? (In reply to Mark Thomas from comment #26) > Assuming the above is correct I think we have two options: > > 1. Revert the "&" -> "&&" change. systemd users (and possibly others) will > be required to use a PID file and to tell systemd where to find that file. > 1. I wonder how OP's configuration with systemd was written. Is it some home-grown configuration (and can be fixed in place), or it has to be fixed somewhere upstream. Personally, I always use a PID file. Thank you Rainer Jung for sharing your configuration. > PIDFile=XXXKPDT_CATALINA_PIDXXX If the PID file is created by catalina.sh, I guess one has to set CATALINA_PID somewhere (e.g. with 'Environment=' line). 2. I am not sure how the '&' recipe plays with nohup. It might work. I just fear that nohup might create a nohup.out file. (It sends "nohup java" process into background without redirecting its output. The nohup.out file is created if nohup thinks that its output is a terminal. Is it able to detect redirection of a compound command?) Documentation: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/nohup.html (In reply to Mark Thomas from comment #20) > (In reply to Konstantin Kolinko from comment #19) > > An idea: use a named pipe (a FIFO special file) to solve the original bug > > 53930. 
>
> If I am understanding this correctly that would mean:
>
> - reverting r1848046 and r1850829 for 9.0.x
> - reverting r1848048 and r1850830 for 8.5.x
> - reverting r1848049 and r1850831 for 7.0.x
>
> Re-opening bug 53930 and resolving it as WONTFIX - suggesting that a named
> pipe is used.

I am not sure whether configuring a pipe externally will work. It is just that if one uses the pipe, we can keep the original "eval" line, and the overall patch is less intrusive.

Support for setting up the pipe could be in catalina.sh. It needs some thought and some testing.

1) I wonder how 'touch "$CATALINA_OUT"' command will interact with a pipe. Maybe it has to be skipped if the file exists.

2) catalina.sh has code that checks existing PID file to prevent double starts. If the code that manages the pipe is placed after the checks, it can benefit from them.

3) I wonder whether it is better to start the process reading from pipe before or after Tomcat. I guess that with a usual (non-named) pipe it is started after Tomcat. I've read that a process writing to a pipe will hang if there is no one reading.

My preference is to revert and to start planning a new patch from there.
(In reply to Konstantin Kolinko from comment #28) > My preference is to revert and to start planning a new patch from there. +1 We can re-open bug 53930 and discuss options there. Calling touch on a named pipe just updates the last modified date. Yes, a process will hang if writing to a pipe and the reading process stops but I don't see much difference between the named pipe approach and the unnamed pipe approach if the user wants output piped to a command. Both will hang if the destination doesn't read the data fast enough.
Fixed in:
- trunk for 9.0.17 onwards
- 8.5.x for 8.5.39 onwards
- 7.0.x for 7.0.93 onwards

Note: The regression never made it into a 7.0.x release
(In reply to Mark Thomas from comment #26) (for the record) > Assuming the above is correct I think we have two options: > > 1. Revert the "&" -> "&&" change. systemd users (and possibly others) will > be required to use a PID file and to tell systemd where to find that file. > > 2. Revert both changes. systemd users will have to take no action. Users > wanting to redirect to a command will have to use a named pipe. +1 on option 2 pid-files are anathema to systemd, so let's do them a favor. Also, logging huge amounts of data to an unrotatable file is something users have to swallow if they want to dump huge amounts of data to it. Finally, using a named-pipe is possible with the original startup script, so we aren't taking anything away from anyone. > I'm leaning towards 2 on the grounds this has the least impact on > the smallest number of users. This assumes that the number of > systemd users is greater than the number of users wanting to > redirect to a command. +1
*** Bug 63183 has been marked as a duplicate of this bug. ***
*** Bug 63244 has been marked as a duplicate of this bug. ***