We have set CATALINA_PID in setenv.sh. After the machine is rebooted, the PID file is still there and Tomcat fails to start. The error message is:

"Existing PID file found during start. Tomcat appears to still be running with PID 3387. Start aborted."

On checking, a different process has the same PID, 3387. The following logic in catalina.sh is at fault:

  if [ ! -z "$CATALINA_PID" ]; then
    if [ -f "$CATALINA_PID" ]; then
      if [ -s "$CATALINA_PID" ]; then
        echo "Existing PID file found during start."
        if [ -r "$CATALINA_PID" ]; then
          PID=`cat "$CATALINA_PID"`
          ps -p $PID >/dev/null 2>&1
          if [ $? -eq 0 ] ; then
            echo "Tomcat appears to still be running with PID $PID. Start aborted."
            exit 1
          else

Tomcat should not conclude that it is still alive just because some process has that PID. The script should start Tomcat anyway.
Either you integrate your Tomcat stop (and probably start) into your system shutdown/startup (rc scripts or whatever mechanism your system uses), or you rely on doing it by hand. In the latter case, Tomcat gets no notice of the system shutdown and cannot react to it.

Finding out whether the process found after reboot is actually a Tomcat process or something else is not the task of the start script. Integrating such a check would be error-prone and hard to maintain across platforms.

If you start by hand and get the cited error, you need to check the other process (as you did) and, if it is something else and Tomcat is not running, remove the old PID file.

We could probably improve the message

"Existing PID file found during start. Tomcat appears to still be running with PID 3387. Start aborted."

to something like

"Existing PID file found during start. Tomcat appears to still be running with PID 3387. Start aborted. If the process with PID 3387 is not a Tomcat process, remove the PID file NAME_OF_PID_FILE_HERE and try again."
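The manual check described above can be sketched as a small shell function. This is a hypothetical helper, not part of catalina.sh; the name check_pid_file and its one-word verdicts are assumptions made for illustration. It relies on the same "ps -p" liveness test as the start script, so it can only say that some process has the PID, not that the process is Tomcat.

```shell
# Hypothetical helper (not part of catalina.sh).
# Prints one word describing the state of a PID file:
#   missing - the file does not exist
#   empty   - the file exists but has no content
#   alive   - some process currently has the recorded PID
#   stale   - no process has the recorded PID
check_pid_file() {
    pid_file="$1"
    if [ ! -f "$pid_file" ]; then
        echo "missing"
        return 0
    fi
    if [ ! -s "$pid_file" ]; then
        echo "empty"
        return 0
    fi
    pid=$(cat "$pid_file")
    # Same test as the start script: is there any process with this PID?
    if ps -p "$pid" >/dev/null 2>&1; then
        echo "alive"
    else
        echo "stale"
    fi
}
```

If check_pid_file "$CATALINA_PID" prints "alive", inspect the process by hand, for example with ps -f -p "$(cat "$CATALINA_PID")", and remove the PID file only if it turns out not to be Tomcat.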
I added the message "If this process is not a Tomcat process, remove the PID file and try again." to the output. The name of the PID file is already being output earlier during the script run. Added to trunk in r1672272, tc 8 in r1672273 (will be part of 8.0.22), tc 7 in r1672274 (will be part of 7.0.62) and proposed for TC 6.
Also added "ps" output for the process with the PID in r1672284 (trunk), r1672285 (tc8) and r1672286 (tc7).
Thanks Rainer.

If the machine is shut down by a power-off, the rc script may not get a chance to run.

After using ps to check whether the PID is alive, could you also extract the path of the process and compare it with the Tomcat home path? If the paths match, the process should be Tomcat; otherwise it is some other process, and the script could remove the PID file and continue to start.

Sometimes the Tomcat process is just hung. Could you provide a force-start option? Even if the Tomcat process is there, just kill it and start anyway.
(In reply to jiaoyk from comment #4)
> Thanks Rainer.
>
> If the machine is shut down by a power-off, the rc script may not get a
> chance to run.

OK, but that's a really exceptional case. Then you might have the same problem with lots of unix daemons. Either they ignore the PID file, or they don't start.

> After using ps to check whether the PID is alive, could you also extract the
> path of the process and compare it with the Tomcat home path? If the paths
> match, the process should be Tomcat; otherwise it is some other process, and
> the script could remove the PID file and continue to start.

I doubt that this is possible in a way that is both platform-independent and maintainable. The script is used on lots of platforms, like various Linuxes, BSD, Solaris, Cygwin, OS X, and probably AIX, HP-UX, etc. Some of these platforms do not provide the full process command via "ps"; they truncate it. IMHO there is no platform-independent way to retrieve all process args, e.g. the -Dcatalina.base=... that the script sets.

I don't plan to invest more into this, because it happens very rarely and the solution would be fragile. Automatic problem resolution needs to be robust, otherwise it triggers more problems than it solves. If anyone would like to tackle this, patches are welcome, but they must be multi-platform.

> Sometimes the Tomcat process is just hung. Could you provide a force-start
> option? Even if the Tomcat process is there, just kill it and start anyway.

We could support the existing "-force" for "start" as well and let -force ignore any PID file problems. An existing other process is only one such problem. There are more cases where the script currently aborts. Do you think all these cases should be ignored with -force? Please have a look at "abort" in bin/catalina.sh.
(In reply to Rainer Jung from comment #5)
> (In reply to jiaoyk from comment #4)
> > Thanks Rainer.
> >
> > If the machine is shut down by a power-off, the rc script may not get a
> > chance to run.
>
> OK, but that's a really exceptional case. Then you might have the same
> problem with lots of unix daemons. Either they ignore the PID file, or they
> don't start.

Or they

a) Put the PID file in ephemeral storage (ramdisk)
b) Put the PID file in /tmp, which should be emptied on boot
c) Otherwise arrange to have their PID files removed on boot
Thanks Rainer and Christopher.

It's not worth setting up a ramdisk just to store the PID file. If the PID file is stored in /tmp, it may be removed by someone else.

The script should work even in the worst case, however rare. It should not depend on the previous state; it should be stateless. The key point is that catalina.sh mistakes the wrong process for the Tomcat process.

Do we really need a PID file to save the PID? If verifying the home path is a problem across platforms, maybe we could first get the Tomcat home path and then use it to grep the right PID from the output of ps (assuming the key commands in the following function exist on the various Unix/Linux-like OSes). Maybe we could get the Tomcat PID with a function like the following?

  function get_tomcat_pid() {
    # Collapse repeated slashes, then prefer the fully resolved path.
    declare NORMALIZED=$(echo "$CATALINA_HOME" | tr -s / /)
    declare NORMALIZED_PATH=$(readlink -f "$CATALINA_HOME")
    if [ "$NORMALIZED" != "$NORMALIZED_PATH" ]; then
      NORMALIZED=$NORMALIZED_PATH
    fi
    if [ -z "$NORMALIZED" ]; then
      return 1
    fi
    # Look for processes whose command line mentions the Tomcat home.
    declare pid=$(ps -ef | grep -F -- "$NORMALIZED" | grep -v grep | awk '{print $2}')
    if [ -z "$pid" ]; then
      return 1
    fi
    echo $pid
  }

Thanks for supporting the force start.

The "abort" cases where the PID file can't be removed or written should remain aborts; they look like permission problems. Maybe the user is running the process as the wrong user.

The "abort" cases such as "PID file found but no matching process was found. Stop aborted." and "$CATALINA_PID was set but the specified file does not exist." should be warnings.
(In reply to Rainer Jung from comment #3)
> Also added "ps" output for the process with the PID in r1672284 (trunk),
> r1672285 (tc8) and r1672286 (tc7).

Backported to Tomcat 6 in r1678326 and will be in 6.0.44 onwards.
This issue is as fixed as it is going to get. Using /tmp is the way to go. The OS will set appropriate permissions so only root and the user Tomcat is running as can delete the file.
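As a sketch of the /tmp suggestion, setenv.sh could point CATALINA_PID at a file under /tmp. The file name tomcat.pid is only an illustration, and this assumes that /tmp is cleared at boot and has the sticky bit set on your system, which is worth verifying.

```shell
# bin/setenv.sh (sketch; the file name is hypothetical)
# Keep the PID file under /tmp so that a reboot is expected to clear it.
CATALINA_PID="/tmp/tomcat.pid"
```

Because catalina.sh sources setenv.sh, a plain assignment like this should be enough for the start and stop logic to see the value.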