|Summary:||If mod_jk cannot resolve host name of single worker, all workers are destroyed|
|Product:||Tomcat Connectors||Reporter:||Jerrold Poh <jerrold.poh>|
|Component:||Common||Assignee:||Tomcat Developers Mailing List <dev>|
Description Jerrold Poh 2007-07-10 17:13:28 UTC
In the workers definition file (specified by the JkWorkersFile property in httpd.conf) there is a list of workers, each mapped to the host it forwards to. For example:

worker.list=foo, bar, baz
worker.foo.port=16013
worker.foo.host=foo_host
worker.foo.type=ajp13
worker.bar.port=16013
worker.bar.host=bar_host
worker.bar.type=ajp13
worker.baz.port=16013
worker.baz.host=baz_host
worker.baz.type=ajp13

If one of the specified hosts cannot be resolved (let's say in this case it is baz_host), the following is printed in the mod_jk.log file:

ajp_validate::jk_ajp_common.c (2001): worker baz contact is 'baz_host:23013'
ajp_validate::jk_ajp_common.c (2010): can't resolve tomcat address baz_host
ajp_validate::jk_ajp_common.c (2013): invalid host and port baz_host 23013
ajp_destroy::jk_ajp_common.c (2215): up to 0 endpoints to close
wc_create_worker::jk_worker.c (161): validate failed for baz
build_worker_map::jk_worker.c (259): failed to create worker baz
close_workers::jk_worker.c (215): close_workers will destroy worker foo
ajp_destroy::jk_ajp_common.c (2215): up to 1 endpoints to close
close_workers::jk_worker.c (215): close_workers will destroy worker bar
ajp_destroy::jk_ajp_common.c (2215): up to 1 endpoints to close

This means that all workers are now destroyed instead of just that single worker, which seems a bit overkill.
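As a workaround on the administrator's side, the host names in the workers file could be checked before httpd is started. The following is only a minimal Python sketch under stated assumptions: it is not part of mod_jk, the function names `worker_hosts` and `unresolvable_workers` are made up for illustration, and it only understands plain `worker.<name>.host=<host>` lines.

```python
import socket


def worker_hosts(properties_text):
    """Return {worker_name: host} for every worker.<name>.host line."""
    hosts = {}
    for raw in properties_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        parts = key.split(".")
        if len(parts) == 3 and parts[0] == "worker" and parts[2] == "host":
            hosts[parts[1]] = value
    return hosts


def unresolvable_workers(properties_text, resolve=socket.gethostbyname):
    """Return the sorted names of workers whose host does not resolve.

    `resolve` is injectable (an assumption of this sketch) so the check
    can be exercised without real DNS lookups.
    """
    failed = []
    for name, host in worker_hosts(properties_text).items():
        try:
            resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return sorted(failed)
```

Calling `unresolvable_workers()` on the contents of the workers file and refusing to start httpd when the result is non-empty would surface a host such as baz_host before any worker is created.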
Comment 1 Jerrold Poh 2007-07-10 17:14:54 UTC
Forgot to add, version of mod_jk is 1.2.23
Comment 2 Rainer Jung 2007-07-12 07:22:34 UTC
This behaviour is intended. If the name of a target worker is not resolvable during startup, we prefer not to start up at all. So the destruction of the workers is just a side effect of aborting the Apache startup. If we could resolve the names once (and finish startup), we will not resolve them again during normal operation. In our experience, logging a failed resolution for a worker during startup and continuing startup will in many cases leave administrators unaware of the problem.
Comment 3 Tim Whittington 2007-07-15 02:26:27 UTC
I think the point here is that startup isn't aborted (the log excerpt doesn't show this). What happens is that after the workers are shut down, Apache startup continues, leaving mod_jk in an inconsistent state. A request for a URI that should map to a worker then enters mod_jk, goes through the URI -> Worker mapping process, and then fails to find the worker identified. So what is happening is precisely the situation that we're trying to avoid - starting Apache with a broken configuration. This is with Apache 2.0.52 on CentOS 4, so it's a pretty standard setup.
Comment 4 Mladen Turk 2007-07-15 02:57:30 UTC
Yes, we should treat that the same way as a misconfiguration or an invalid directive, because that is what it actually is, and refuse to load mod_jk in that case. BTW, httpd itself won't start if you provide an invalid IP address or a port that is already occupied, for example, so disabling mod_jk or even the entire httpd is a legitimate thing to do.
Comment 5 Mladen Turk 2007-07-15 03:12:25 UTC
On second thought, mod_jk still needs to load. The error.log entry and probably console output should be enough. It should behave in the same way as, for example, configuring the wrong path for the Directory or VirtualHost root. Log that and continue. Httpd will return 404 if that resource is requested. So I think we are fine with what we have right now. The worker (entire balancer) is disabled, as well as its mappings.
Comment 6 Rainer Jung 2007-07-16 06:28:07 UTC
On which web server was this behaviour observed? We actually implemented both variants:

Apache 2.x: any validation failure will make wc_open() return JK_FALSE, and Apache will log an error and *not* start.
Apache 1.3: only logs an error, but does start up. Worker initialization, even for the good ones, might not be done!
IIS: Not sure.
Netscape: The code looks like we only log a line, so it seems to be the same as for Apache 1.3.

I would prefer not to start up. It is a problem that can be detected during startup. As such it differs from typos in context URLs. Since we can detect the problem during startup, we should tell people what's wrong and not do the startup. I think that Apache 2.x does it OK. We don't have it in Apache 1.3 primarily for historic reasons, because the init hook does not allow return values. We can use our jk_error_exit() function nevertheless. I'll patch it for Apache 1.3 if no one objects. Mladen? Anyone else?
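The two policies debated in this thread can be contrasted with a toy model. This is a hedged Python sketch, not mod_jk source: `open_workers`, `validate` and `strict` are hypothetical names. With `strict=True` it mirrors the abort-on-failure behaviour described for Apache 2.x (already created workers are torn down and startup is refused); with `strict=False` it shows the alternative the reporter asked for, where only the failing worker is dropped.

```python
def open_workers(names, validate, strict):
    """Toy model of worker start-up; returns (workers, log).

    strict=True  : any validation failure aborts start-up (workers is None).
    strict=False : a failing worker is logged and skipped.
    """
    created, log = [], []
    for name in names:
        if validate(name):
            created.append(name)
            continue
        log.append("validate failed for " + name)
        if strict:
            # Tear down what was already created and refuse to start.
            for done in created:
                log.append("destroying worker " + done)
            return None, log
    return created, log
```

In the strict case the destruction of foo and bar is a consequence of refusing to start, which matches the log excerpt in the description. The open question in the thread is whether the server then really stops.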
Comment 7 Mladen Turk 2007-07-16 10:20:46 UTC
No objections. Not sure whether for IIS we should or could force the entire service to shut down. However, we can return 404 from the filter if init failed.
Comment 8 Rainer Jung 2007-07-17 05:59:36 UTC
Apache 1.3: fixed in r556916 (don't start up if mod_jk has an initialization error)
IIS: fixed in r556836 (return HTTP status 500 if mod_jk has an initialization error)
Checked Netscape and Apache httpd 2.x: both already do not start up in case of an initialization error.
Changes will be released as part of 1.2.24.
Comment 9 Rainer Jung 2008-01-01 16:32:29 UTC
Move a couple of fixed JK issues from resolved to closed.