Bug 64051

Summary: mod_jk set_session_cookie not sending new cookie after node failover for sticky session
Product: Tomcat Connectors
Reporter: Mohsen <dotin.insurance>
Component: mod_jk
Assignee: Tomcat Developers Mailing List <dev>
Status: RESOLVED FIXED
Severity: normal
CC: dotin.insurance
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Attachments: Proposed patch

Description Mohsen 2020-01-05 11:27:03 UTC
Hi

Sorry, I couldn't find mod_jk in the components list.

I have enabled set_session_cookie so that mod_jk adds a custom cookie for sticky session management. It works fine until I stop a specific node. The balancer then routes me to a new node, but it does not put the new node's name in the cookie.

This is what the documentation says:

"Especially after a node failover we will send a new cookie to switch stickyness to the new node."

But it is not working.

This is the mod_jk log trace:

[Sun Jan 05 14:02:35.126 2020] [25699:140105601013504] [debug] service::jk_lb_worker.c (1270): service sticky_session=1 id='.node1'
[Sun Jan 05 14:02:35.126 2020] [25699:140105601013504] [trace] get_most_suitable_worker::jk_lb_worker.c (1037): enter
[Sun Jan 05 14:02:35.126 2020] [25699:140105601013504] [debug] get_most_suitable_worker::jk_lb_worker.c (1078): searching worker for partial sessionid .node1
[Sun Jan 05 14:02:35.126 2020] [25699:140105601013504] [debug] get_most_suitable_worker::jk_lb_worker.c (1086): searching worker for session route node1
[Sun Jan 05 14:02:35.126 2020] [25699:140105601013504] [debug] get_most_suitable_worker::jk_lb_worker.c (1136): found best worker node2 (node2) using method 'Busyness'
[Sun Jan 05 14:02:35.126 2020] [25699:140105601013504] [trace] get_most_suitable_worker::jk_lb_worker.c (1139): exit
[Sun Jan 05 14:02:35.126 2020] [25699:140105601013504] [debug] service::jk_lb_worker.c (1315): service worker=node2 route=node2 failover=false



I read the source of "jk_lb_worker.c" and I think s->sticky is being set to the wrong value, so when the node fails over the new cookie is not sent to the client.

Hope to hear from you soon.

Sincerely,

Mohsen
Comment 1 Christophe JAILLET 2020-01-05 13:03:23 UTC
mod_jk is not part of httpd itself.
Reassigning it to "Tomcat Connectors" -> "mod_jk".
Comment 2 Mohsen 2020-01-05 13:11:40 UTC
Sorry, my bad. I will repost it there.
Comment 3 Mohsen 2020-01-05 13:21:44 UTC
*** Bug 64052 has been marked as a duplicate of this bug. ***
Comment 4 Mohsen 2020-01-07 06:26:44 UTC
Any help?
Comment 5 Mohsen 2020-01-11 09:04:28 UTC
Please help me with this problem.
Comment 6 Christopher Schultz 2020-01-12 17:46:03 UTC
Please post your configuration for both mod_jk and Tomcat. All we need is the Jk* directives in httpd.conf (or similar), your worker configuration from workers.properties (if not configured via Jk* directives) and your <Connector> in conf/server.xml from Tomcat.

Please remember to remove any secrets from those configurations before posting.
Comment 7 Mohsen 2020-01-13 06:46:17 UTC
Thank you for the reply. Yesterday I figured something out. When I stop a worker, the new cookie is not sent with the response, but when I remove the worker from the worker list (at least two workers must remain in the list), the cookie is sent.

This is my workers.properties:

workers.tomcat_home=$CATALINA_HOME
workers.java_home=$JAVA_HOME

worker.list=status,balancer


worker.node1.port=8009
worker.node1.host=192.168.1.2
worker.node1.type=ajp13
worker.node1.lbfactor=1
worker.node1.sticky_session=1

worker.node2.port=8009
worker.node2.host=192.168.1.3
worker.node2.type=ajp13
worker.node2.lbfactor=1
worker.node2.sticky_session=1

worker.dummy.activation=S

worker.balancer.type=lb
worker.balancer.balance_workers=node1,node2
worker.balancer.sticky_session=1
worker.balancer.method=B
worker.balancer.session_cookie=AWN
worker.balancer.set_session_cookie=true 
worker.balancer.session_cookie_path=/myapp/

worker.status.type=status



I added the dummy worker because at least two workers must exist for the new session cookie to be sent when I remove node1 or node2. So when I remove node1, the config changes to this:
worker.dummy.activation=A
worker.balancer.balance_workers=dummy,node2

I wrote some bash scripts to do all of these steps automatically. When I want to update node1, I remove it from the workers, add the stopped dummy worker in its place, and update node1. I then do the same for node2, and once both are updated I remove dummy from the workers and everything works well.

This is also my server.xml in Tomcat:

<Connector port="8443" protocol="org.apache.coyote.http11.Http11NioProtocol"
           maxThreads="750" scheme="https" secure="true" SSLEnabled="true"
           maxPostSize="4194304" SSLProtocol="TLSv1.2"
           SSLCertificateKeyFile="conf/server.key" SSLCertificateFile="conf/server.cert"
           compression="on" compressionMinSize="2048"
           noCompressionUserAgents="gozilla, traviata"
           compressableMimeType="text/html,text/xml,text/css,text/javascript,application/x-javascript,application/javascript">
</Connector>


<Engine name="Catalina" defaultHost="localhost" jvmRoute="node1">
Comment 8 Mohsen 2020-01-13 06:49:56 UTC
Sorry, I made a mistake. The dummy node is always stopped:

worker.dummy.activation=S
worker.balancer.balance_workers=dummy,node2
Comment 9 Mohsen 2020-01-13 06:56:06 UTC
This is also the mod_jk module config:

<IfModule jk_module>
 JkWorkersFile "conf/workers.properties"
 JkLogFile "logs/jk_module.log"
 JkLogLevel info
 JkLogStampFormat "[%a %b %d %H:%M:%S %Y] "
 JkOptions +ForwardKeySize +ForwardURICompat -ForwardDirectories
 JkRequestLogFormat "%w %R %V %U %T "
</IfModule>
Comment 10 Christopher Schultz 2020-01-14 14:09:38 UTC
(In reply to Mohsen from comment #7)

> worker.node1.sticky_session=1

This is not a valid configuration directive for a node. Only for a load-balancer worker. It's not harmful to have it here, though.

> worker.balancer.sticky_session=1

Note that sticky_session=true is the default. I'm not sure what happens if you use the value "1" instead of "true". Consider changing it to "true" to match the documentation.
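As a rough sketch of the two points above, reusing the worker names and addresses from comment #7, the sticky-session setting would appear only on the load-balancer worker and be spelled "true". This is a tidied fragment for illustration, not a tested configuration:

```
# Member workers: no sticky_session here, it only applies to lb workers
worker.node1.type=ajp13
worker.node1.host=192.168.1.2
worker.node1.port=8009
worker.node1.lbfactor=1

worker.node2.type=ajp13
worker.node2.host=192.168.1.3
worker.node2.port=8009
worker.node2.lbfactor=1

# Load-balancer worker: sticky_session spelled as in the documentation
worker.balancer.type=lb
worker.balancer.balance_workers=node1,node2
worker.balancer.sticky_session=true
```

Since true is the default, the sticky_session line could also be dropped entirely.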

> worker.balancer.method=B
> worker.balancer.session_cookie=AWN
> worker.balancer.set_session_cookie=true 
> worker.balancer.session_cookie_path=/myapp/

Are you sure you need set_session_cookie=true and the associated directives? Is "/myapp/" your actual application's context-path, or did you change it to remove private information from your config?

> I added dummy worker because at least should exists 2 workers to get new
> session cookie when I remove node1 or node2. So when I remove node1 the
> config would change to this:
> worker.dummy.activation=A
> worker.balancer.balance_workers=dummy,node2

You should set the activation of "dummy" to "D" (disabled) and use the "redirect" directive from other nodes to configure a hot-standby. If you re-configure mod_jk e.g. by performing "apachectl graceful" then you may lose a lot of metrics that mod_jk has been storing about its connections. It's better to configure it to automatically fail-over.
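A minimal sketch of such a hot-standby layout follows, assuming the "dummy" worker is pointed at a real Tomcat instance. The host shown for dummy (192.0.2.10) is a placeholder and not taken from the reporter's setup; the other values are copied from comment #7.

```
worker.node1.type=ajp13
worker.node1.host=192.168.1.2
worker.node1.port=8009
worker.node1.lbfactor=1
# Preferred failover target if node1 is in error state
worker.node1.redirect=dummy

worker.node2.type=ajp13
worker.node2.host=192.168.1.3
worker.node2.port=8009
worker.node2.lbfactor=1
# Preferred failover target if node2 is in error state
worker.node2.redirect=dummy

# Hot standby: disabled, so it is not used for normal load balancing
worker.dummy.type=ajp13
# Placeholder address (assumption, replace with the real standby host)
worker.dummy.host=192.0.2.10
worker.dummy.port=8009
worker.dummy.activation=D

worker.balancer.type=lb
worker.balancer.balance_workers=node1,node2,dummy
```

With a layout like this, all three workers stay in balance_workers permanently, so a failed or stopped node should not require editing workers.properties and reloading httpd.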

> I wrote some bash scripts to do all of these steps automatically. So when I
> want to update node1 I remove it from workers, add stopped dummy to workers
> and update the node1 and do the same for ndoe2 and after all updated I
> remove dummy from workers and everything goes well.

Okay. This is to simulate planned downtime for a node?

> This is my server.xml in tomcat also:
> 
> <Connector port="8443" protocol="org.apache.coyote.http11.Http11NioProtocol"

No, it's not. You are using mod_jk, so you'll need to be using one of the AJP protocol connectors.

> <Engine name="Catalina" defaultHost="localhost" jvmRoute="node1">

Good: you have a jvmRoute set. Although with set_session_cookie=true and a separate cookie, this doesn't do anything.

Can you also please post your JkMount directives?
Comment 11 Mohsen 2020-01-18 06:51:12 UTC
Thank you for the response. Yes, /myapp/ is a placeholder used for privacy reasons. These are my JkMount directives:

JKMountCopy On
JKMount /myapp balancer
JKMount /myapp/* balancer
JkMount /status status


Both "1" and "true" are accepted values, and it works now.

Yes, as I said, dummy is always stopped; showing it as active was a mistake.

Everything works well unless I stop a node from jk-status without reloading Apache. From the /status page I can stop a node. When I do, my requests are routed to another node, but the AWN cookie is not sent with the response. When I instead remove a worker from the worker list, the AWN cookie is sent with the response (at least two workers must remain in the list in this case). To reload Apache I just send a "kill -1" signal to the Apache pid.
Comment 12 Christopher Schultz 2020-01-20 16:56:00 UTC
You still need to post your correct <Connector> configuration.
Comment 13 Mohsen 2020-01-21 07:38:27 UTC
This is the correct connector:

    <Connector port="8009" protocol="AJP/1.3" redirectPort="8443" />
Comment 14 Mark Thomas 2020-01-29 22:51:50 UTC
I can recreate this with a simple 3 node cluster (no status worker, no standby worker) and a JSP that shows the current node (host name) without creating a session.

The session cookie is created on the first request and then mod_jk keeps it sticky to whichever node is used. If that node is then stopped, mod_jk fails over to one of the other nodes but the session cookie is not updated.

I have a patch for this that fixes the problem for me. It does indeed appear that s->sticky was being set to true when it should not have been. I'll attach the patch shortly.
Comment 15 Mark Thomas 2020-01-29 22:52:33 UTC
Created attachment 36988
Proposed patch

The patch is very simple. If you are able to test it and provide feedback that would be very helpful.
Comment 16 Christopher Schultz 2020-01-29 23:25:50 UTC
(In reply to Mark Thomas from comment #15)
> Proposed patch

Best. Patch. Ever.
Comment 17 Mark Thomas 2020-02-10 19:33:09 UTC
The patch worked for me in local testing so I am going to mark this as resolved.

The fix will be in 1.2.24 onwards.
Comment 18 Mark Thomas 2020-02-11 15:58:15 UTC
Correction: 1.2.47 onwards.