Bug 52659

Summary: Shared memory becomes corrupted
Product: Tomcat Connectors Reporter: Alex Samad <alex.samad>
Component: isapiAssignee: Tomcat Developers Mailing List <dev>
Status: RESOLVED FIXED    
Severity: blocker    
Priority: P1    
Version: 1.2.32   
Target Milestone: ---   
Hardware: PC   
OS: All   

Description Alex Samad 2012-02-14 01:37:29 UTC
Hi

my platform
IIS7.5 on W2008R2 sp1 (thre are 2 setup in a NLB config)
1.2.32 64 connector

Connecting to 6.2Final Jboss 

We have a client programme that opens a connection and uses keep alive to keep the connection open.  We typically have 200-300 connections open at one time.
Our application pool on IIS was setup in a web garden mode, so for use that mean there was 4 process ids for the virtual web site the plugin was connected to.

So when we had a manual recycle happen all hell broke lose, the client were getting 500's back from the server. for any page that was related to the virtual web site... it would fail at the filter entry point. looks like the connector had become corrupt and was failing everything.

So from some investigating we fix it by moving to pessimistic locking .... and changing to non overlapping recycles.

we fixed opp locking is really for single processes + multi threading not multi processes and multithreads!

I am not 100% sure why the over lapping recycle failed and I believe corrupted the shm memory but i believe it has something to do with IIS restarting the application pool and I presume with the shared memory block.

We saw that the error became exasperated when there were more clients attempting to connect.  if there was only 20-40 clients, no problem it worked... but 200 and it went haywire.

one interesting test we did was to reboot one node of the NLB cluster, and when it restared it went straight into 500 mode, because 200 client were trying to connect.


Can you some how make the option for the connection to default to pessimistic mode instead of op.. This i think might be the problem with the over lapping recycle

Alex
Comment 1 Alex Samad 2012-03-05 05:27:12 UTC
I have changed OS platform. and changed to critical...

Done more testing.

setup 

1 client -> 1 IIS 7.5 + nod_jk -> jboss server

IIS7 app pool is setup for 4 process (web garden) and overlapping recycle

we have the client making 300 - 500 connections to the IIS7.5

whilst this is going on we recycle the application pool. 

modjk becomes confused and i believe corrupts the shm area all requests get 500 error.

Whilst leaving the application on, we restart IIS7.5 so there are 500 connection attempted to be made and when IIS restarts we go straight back into 500 errors.

we have to block in the inbound connections and then start IIS and wait a few seconds and then let the connections in (allow them through the firewall)

so sign i see of this.

the ajp logs has errors for uri's not being allocation.. and then stopping (presumably after the config file has been read).

I believe we should change locking for the windows platform to pessi

also why is the extension taking request when it hasn't read the config file yet !!!
Comment 2 Alex Samad 2012-03-06 05:35:40 UTC
this is an email i sent to the ML (has some extra info, maybe a better description)


Hi

This is my prod setup

Client -> Internet -> IIS7.5/NLB (2 node cluster) with the 1.2.32 tomcat connector -> 2 node tomcat cluster.

We had done a fair bit of testing on this setup enough to move into production.  But we ran into a problem ... when IIS recycle its application pool. Luckily we had set this to manually happen after hours

So I setup a test rig to see if I could simulate the problem.  The test rig is

Client -> 1 node of IIS7.5  + 1.2.32 tomcat connector -> 2 node of development tomcat :)

The client works by creating a connection and keeping it open and making lots of requests.  

We have jerry rigged a client to make multiple of these as the error occurs when the system is under load.

So the test scenario is 
start client
login >>> this creates 1 connection 
strat stress test >>> this creates 500 connection and does a little bit of polling

My initial setup is 
4 process worker threads for the application pool
Allow overlapping recycle >>>> my understanding is this means the 4 threads are not all terminated at once, but one at a time as a new thread is restarted

I use isapi_redirect.properties file
============================
extension_uri=/jakarta/isapi_redirect.dll
log_file=C:\ThisNow\Local\dev.ThisNow.com\Logs\ajpconfisapi_redirect.log
log_level=warn
log_filesize=20M
worker_file=C:\ThisNow\Shared\dev.ThisNow.com\ajpconfig\workers.properties
worker_mount_file=C:\ThisNow\Shared\dev.ThisNow.com\ajpconfig\uriworkermap.properties
rewrite_rule_file=C:\ThisNow\Shared\dev.ThisNow.com\ajpconfig\\rewrite.properties


C:\ThisNow\Shared\dev.ThisNow.com\ajpconfig\workers.properties
====================================================
worker.maintain=60
worker.list=jbclb
worker.jbclb.type=lb
worker.jbclb.balance_workers=worker1, worker2
worker.jbclb.sticky_session=true
worker.jbclb.method=Request
worker.jbclb.lock=pessimistic

# Template for all worker threads
worker.template.type=ajp13
worker.template.port=8009
worker.template.ping_mode=A
worker.template.ping_timeout=30000
worker.template.connection_pool_size=1000
worker.template.connection_pool_minsize=100
worker.template.socket_keepalive=true
worker.template.lock=pessimistic
worker.template.max_packet_size=65536

# worker 1
worker.worker1.host=10.172.204.30
worker.worker1.route=node1
worker.worker1.reference=worker.template

# worker 1
worker.worker2.route=node2
worker.worker2.host=10.172.204.31
worker.worker2.reference=worker.template

worker.list=jkstatus
# Define a 'jkstatus' worker using status
worker.jkstatus.type=status



C:\ThisNow\Shared\dev.ThisNow.com\ajpconfig\uriworkermap.properties
========================================================= 
/dev.ThisNow.com/ThisNowLive|/*=jbclb
/dev.ThisNow.com/wls|/*=jbclb
/dev.ThisNow.com/jkmanager|/*=jkstatus
/10.172.204.33/jkmanager|/*=jkstatus
/ThisNowodevrp01/jkmanager|/*=jkstatus




C:\ThisNow\Shared\dev.ThisNow.com\ajpconfig\rewrite.properties
=================================================== 
# rewrite rules
# Convert single node 
~/jb./(.*)=/$1
/web-debts-1.0.0-nossl/=/web-debts-1.0.0/


I can see there are 500 connection to IIS and ~500 connection through to tomcat. Check with perfmon, netstat...

When I recycle the application pool. IIS starts to send 500's back to the server.... it complain about the tomcat connector returning an error errorcode 2147942401, internal server error Incorrect Function 0c80070001

What I can see from the network side of things. Is the client is trying to recreate all 500 connections at once.

What I can see in my tomcat connector logs... some times there is 4 for each process and sometimes 2 or 3, if its crashed into this 500 error loop


Examples of what I see (2 log files)
Ajp file 1
===
[Tue Mar 06 14:46:46.572 2012] [5872:5028] [error] ajp_worker_factory::jk_ajp_common.c (2997): allocating ajp worker record from shared memory 
[Tue Mar 06 14:46:46.572 2012] [5872:5028] [error] wc_create_worker::jk_worker.c (151): factory for ajp13 failed for worker2 
m shared memory 
[Tue Mar 06 14:46:46.572 2012] [5872:5028] [error] validate::jk_lb_worker.c (1615): Failed creating worker worker2 
worker2 
[Tue Mar 06 14:46:46.587 2012] [2888:5708] [error] ajp_worker_factory::jk_ajp_common.c (2997): allocating ajp worker record from shared memory 
[Tue Mar 06 14:46:46.603 2012] [5872:5028] [error] validate::jk_lb_worker.c (1663): NULL parameters 
[Tue Mar 06 14:46:46.603 2012] [5872:5028] [error] wc_create_worker::jk_worker.c (163): validate failed for jbclb 
 worker2 
[Tue Mar 06 14:46:46.619 2012] [5872:5028] [error] build_worker_map::jk_worker.c (262): failed to create worker jbclb 
[Tue Mar 06 14:46:46.619 2012] [5872:5028] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.619 2012] [5872:5028] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.634 2012] [5872:5028] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.634 2012] [5872:5028] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.634 2012] [1536:1332] [error] validate::jk_lb_worker.c (1663): NULL parameters 
d not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.634 2012] [5872:5028] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.634 2012] [5872:5028] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.634 2012] [1536:1332] [error] wc_create_worker::jk_worker.c (163): validate failed for jbclb 
[Tue Mar 06 14:46:46.634 2012] [5872:5028] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.634 2012] [5872:5028] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.650 2012] [5872:5028] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.650 2012] [1536:1332] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.650 2012] [1536:1332] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.665 2012] [1536:1332] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.665 2012] [1536:1332] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.665 2012] [1536:1332] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.665 2012] [2888:5708] [error] build_worker_map::jk_worker.c (262): failed to create worker jbclb 
with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.665 2012] [1536:1332] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.665 2012] [2888:5708] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.681 2012] [2888:5708] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
 
[Tue Mar 06 14:46:46.681 2012] [2888:5708] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.681 2012] [2888:5708] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.681 2012] [2888:5708] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.681 2012] [2888:5708] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.681 2012] [2888:5708] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.681 2012] [2888:5708] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.681 2012] [2888:5708] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.697 2012] [2888:5708] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing.  

Ajp log file 2
============== 
[Tue Mar 06 14:46:46.650 2012] [5872:3624] [warn] init_jk::jk_isapi_plugin.c (2738): The optimal shared memory size can now be determined automatically. 
[Tue Mar 06 14:46:46.650 2012] [5872:3624] [warn] init_jk::jk_isapi_plugin.c (2740): You can remove the shm_size attribute if you want to use the optimal size. 
[Tue Mar 06 14:46:46.665 2012] [5872:3624] [error] wc_create_worker::jk_worker.c (151): factory for lb failed for jbclb 
[Tue Mar 06 14:46:46.665 2012] [5872:3624] [error] build_worker_map::jk_worker.c (262): failed to create worker jbclb 
[Tue Mar 06 14:46:46.665 2012] [5872:3624] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.665 2012] [5872:3624] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.665 2012] [5872:3624] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.665 2012] [5872:3624] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.665 2012] [5872:3624] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.665 2012] [5872:3624] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.681 2012] [5872:3624] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.681 2012] [5872:3624] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.681 2012] [5872:3624] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.681 2012] [5872:3624] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.697 2012] [5872:2836] [warn] init_jk::jk_isapi_plugin.c (2738): The optimal shared memory size can now be determined automatically. 
[Tue Mar 06 14:46:46.697 2012] [2888:4792] [warn] init_jk::jk_isapi_plugin.c (2738): The optimal shared memory size can now be determined automatically. 
[Tue Mar 06 14:46:46.697 2012] [5872:2836] [warn] init_jk::jk_isapi_plugin.c (2740): You can remove the shm_size attribute if you want to use the optimal size. 
[Tue Mar 06 14:46:46.712 2012] [2888:4792] [warn] init_jk::jk_isapi_plugin.c (2740): You can remove the shm_size attribute if you want to use the optimal size. 
[Tue Mar 06 14:46:46.712 2012] [2888:4792] [error] wc_create_worker::jk_worker.c (151): factory for lb failed for jbclb 
w be determined automatically. 
[Tue Mar 06 14:46:46.728 2012] [2888:4792] [error] build_worker_map::jk_worker.c (262): failed to create worker jbclb 
te if you want to use the optimal size. 
[Tue Mar 06 14:46:46.728 2012] [2888:4792] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.728 2012] [5872:2836] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.728 2012] [2888:4792] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.728 2012] [1536:1492] [error] build_worker_map::jk_worker.c (262): failed to create worker jbclb 
[Tue Mar 06 14:46:46.728 2012] [2888:4792] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.728 2012] [1536:1492] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.744 2012] [5872:2836] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.744 2012] [2888:4792] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.744 2012] [1536:1492] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.744 2012] [5872:2836] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.744 2012] [2888:4792] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.744 2012] [1536:1492] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.744 2012] [5872:2836] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.744 2012] [2888:4792] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.744 2012] [5872:2836] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.744 2012] [2888:4792] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.759 2012] [1536:1492] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.759 2012] [5872:2836] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.759 2012] [2888:4792] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.759 2012] [1536:1492] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.759 2012] [5872:2836] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.759 2012] [2888:4792] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.759 2012] [1536:1492] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.759 2012] [5872:2836] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.759 2012] [1536:1492] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
 
[Tue Mar 06 14:46:46.759 2012] [5872:2836] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.775 2012] [1536:1492] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.775 2012] [1536:1492] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.775 2012] [5872:3380] [warn] init_jk::jk_isapi_plugin.c (2738): The optimal shared memory size can now be determined automatically. 
[Tue Mar 06 14:46:46.790 2012] [5872:3380] [warn] init_jk::jk_isapi_plugin.c (2740): You can remove the shm_size attribute if you want to use the optimal size. 
[Tue Mar 06 14:46:46.790 2012] [1536:1272] [warn] init_jk::jk_isapi_plugin.c (2740): You can remove the shm_size attribute if you want to use the optimal size. 
[Tue Mar 06 14:46:46.790 2012] [2888:1548] [warn] init_jk::jk_isapi_plugin.c (2740): You can remove the shm_size attribute if you want to use the optimal size. 
[Tue Mar 06 14:46:46.790 2012] [1536:1272] [error] wc_create_worker::jk_worker.c (151): factory for lb failed for jbclb 
[Tue Mar 06 14:46:46.790 2012] [2888:1548] [error] wc_create_worker::jk_worker.c (151): factory for lb failed for jbclb 
[Tue Mar 06 14:46:46.806 2012] [1536:1272] [error] build_worker_map::jk_worker.c (262): failed to create worker jbclb 
with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.806 2012] [2888:1548] [error] build_worker_map::jk_worker.c (262): failed to create worker jbclb 
[Tue Mar 06 14:46:46.806 2012] [5872:3380] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.806 2012] [1536:1272] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.806 2012] [2888:1548] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.806 2012] [5872:3380] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.806 2012] [1536:1272] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.806 2012] [2888:1548] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.822 2012] [5872:3380] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.822 2012] [2888:1548] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.822 2012] [1536:1272] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
 
[Tue Mar 06 14:46:46.822 2012] [2888:1548] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.822 2012] [5872:3380] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.822 2012] [2888:1548] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.837 2012] [1536:1272] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.837 2012] [2888:1548] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.837 2012] [1536:1272] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.837 2012] [2888:1548] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.837 2012] [1536:1272] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.837 2012] [2888:1548] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
 
[Tue Mar 06 14:46:46.853 2012] [1536:1272] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.853 2012] [5872:3380] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.853 2012] [1536:1272] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.853 2012] [2888:1548] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.869 2012] [2888:3536] [warn] init_jk::jk_isapi_plugin.c (2738): The optimal shared memory size can now be determined automatically. 
[Tue Mar 06 14:46:46.869 2012] [2888:3536] [warn] init_jk::jk_isapi_plugin.c (2740): You can remove the shm_size attribute if you want to use the optimal size. 
[Tue Mar 06 14:46:46.869 2012] [1536:5072] [warn] init_jk::jk_isapi_plugin.c (2738): The optimal shared memory size can now be determined automatically. 
[Tue Mar 06 14:46:46.884 2012] [1536:5072] [warn] init_jk::jk_isapi_plugin.c (2740): You can remove the shm_size attribute if you want to use the optimal size. 
[Tue Mar 06 14:46:46.884 2012] [2888:3536] [error] build_worker_map::jk_worker.c (262): failed to create worker jbclb 
[Tue Mar 06 14:46:46.884 2012] [1536:5072] [error] wc_create_worker::jk_worker.c (151): factory for lb failed for jbclb 
[Tue Mar 06 14:46:46.884 2012] [1536:5072] [error] build_worker_map::jk_worker.c (262): failed to create worker jbclb 
with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.884 2012] [1536:5072] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.884 2012] [2888:3536] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.884 2012] [1536:5072] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.900 2012] [2888:3536] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.900 2012] [1536:5072] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.900 2012] [2888:3536] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.900 2012] [2888:3536] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.900 2012] [2888:3536] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.915 2012] [2888:3536] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.915 2012] [2888:3536] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jbclb' in uri map post processing. 
[Tue Mar 06 14:46:46.915 2012] [2888:3536] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.915 2012] [1536:5072] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing. 
[Tue Mar 06 14:46:46.915 2012] [2888:3536] [error] uri_worker_map_ext::jk_uri_worker_map.c (506): Could not find worker with name 'jkstatus' in uri map post processing.  



During normal startup I don't see anything logged at all.

My initial testing was to move from overlapping recycle to non-overlapping recycle. This seems to have solved it for 99% of the time. But if I get enough connection trying to hit the server at 1 time, in my case > ~450 it will cause this corruption some of the times :)

My best solution is to go to single process thread and non over lapping (although I would have thought non overlapping shouldn't have an effect, but maybe that is how IIS decides to kill first or afterwards!)


So my conclusion / presumption is there is an issue with the shared memory locking and also what seems to be the initialisation code sequencing.

Under load, IIS will first start 1 process and then if enough request another and another depending on load.  So what I see is 4 process starting at about the same time and the shared memory is getting corrupted and from there it turns to poo.

I can presume it get corrupted for 2 reasons.  Firstly the connector logs faults unable to map a URI and then stop warning about that, I can presume that, the process in question hasn't reads its config and then eventual does !

If it hasn't read its config and it processing uri's and presumable touching the SHM, it is doing so under an optimistic locking method... That in windows land using in process locking methods, which don't work across processes ! (this is different to linux/unix world). 

So I am thinking 2 changes need to be made. The connector needs to default to pessimistic locking (maybe the default for windows).... It also depends on how to fix the init code.  If you can guarantee that the config file will be processed before the Shared memory is touched, then maybe its okay to leave the default to optimistic.

The second bug, the connector needs to be fully initialised before it starts to process any URI's... This is not good ... but I think it a problem with Web Gardens (this is the name given by MS for more than 1 process in the application pool)
When it was architected I can guess this wouldn't have been the setup.  I could presume (haven’t looked at the init stuff yet), but if it find the SHM memory already created, it thinks it's ready to go !


I don't mind digging around the code and supplying patches, but I am more used to the linux world and well , I have tried to build the connector in VS 2010 and .....

Does somebody have a tar  ball of a tree what I can just point to and compile :) (or maybe some instruction on how to get it to build from the svn tree in VS2010)
Comment 3 Mladen Turk 2012-03-20 04:55:20 UTC
Fixed with the 1.2.34 by ensuring that ...
1. We have a process mutex for serializing initialization
2. Parameters are updated from shared memory if shared memory is explicitly updated

Note however that recycle and isapi_redirector are conceptually wrong.
isapi_redirector is a proxy, has its own connection pool and connection pool management. You should disable IIS worker recycle in production to have optimal performance, because the entire recycle concept is meant to handle wrongly written .NET applications having memory leaks. This is not the case with isapi_redirect.

Next you should also limit the number if isapi_redirect instances per physical IIS instance and preferably have just one. You can use uriworkerap.properties to handle mapping for multiple vhosts.