Bug 48350

Summary: Deadlock on distributed testing with 2 clients
Product: JMeter - Now in Github
Reporter: Philippe Mouawad <p.mouawad>
Component: Main
Assignee: JMeter issues mailing list <issues>
Status: RESOLVED FIXED
Severity: normal
CC: p.mouawad
Priority: P2
Version: Nightly (Please specify date)
Target Milestone: ---
Hardware: PC
OS: Mac OS X 10.4
Attachments:
- Thread Dump of Deadlock
- Thread Dump of Deadlocked Threads
- Test scenario
- Patch to deadlock
- Cleaned up Test
- Deadlock Scenario
- Patch that solves issue with -G option
- Simpler test case - does not need Tomcat
- Patch to deadlock

Description Philippe Mouawad 2009-12-08 11:35:34 UTC
Created attachment 24680 [details]
Thread Dump of Deadlock

Hello,
I encountered this bug on the trunk version of JMeter.
A deadlock occurs during remote testing and nothing happens (no output file is generated).

I have a Controller that contacts 2 Windows JMeter Servers to run a distributed test.
A stack trace is attached.

Version 2.4.20091009 

Philippe
http://www.ubik-ingenierie.com
Comment 1 Philippe Mouawad 2010-03-06 12:50:45 UTC
Created attachment 25091 [details]
Thread Dump of Deadlocked Threads

Sorry for my previous attachment; it didn't show the deadlock.
Here is the correct one, which shows the deadlock in the controller when serializing the HashTree.

Philippe
http://www.ubik-ingenierie.com
Comment 2 Philippe Mouawad 2010-03-06 12:51:32 UTC
Created attachment 25092 [details]
Test scenario

This scenario illustrates the bug.
Comment 3 Philippe Mouawad 2010-03-06 12:54:20 UTC
To reproduce deadlock do the following:
- Start 2 servers on same machine or 2 different machines
- Start controller


The deadlock occurs very frequently (4 out of 5 runs).
I reproduced it on:
- Mac OS X (controller and servers on the same machine)
- Linux (controller and servers on the same machine)
- Windows / Mac (controller on Mac, servers on Windows)


Philippe
http://www.ubik-ingenierie.com
Comment 4 Philippe Mouawad 2010-03-06 13:25:06 UTC
Created attachment 25093 [details]
Patch to deadlock

Patch that avoids the deadlock by synchronizing only the sending of the configuration to each remote server.
Should have a very low performance impact.
Tested successfully.

Philippe Mouawad
http://www.ubik-ingenierie.com
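
A minimal sketch of the approach described in this comment, not the attached patch itself: every sender thread takes one static lock before sending the configuration, so only one send runs at a time. The `RemoteStub` interface, the `SerializedConfigSend` class and the timing stub are hypothetical stand-ins invented for illustration; only the `remote.configure(testTree, host)` call shape comes from this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the remote engine stub; not JMeter's actual interface. */
interface RemoteStub {
    void configure(Object testTree, String host) throws Exception;
}

final class SerializedConfigSend {
    // Static, so it is shared by all sender threads even though each host has its own engine.
    private static final Object LOCK = new Object();

    /** Sends the configuration to one host while holding the shared lock. */
    static void send(RemoteStub remote, Object testTree, String host) throws Exception {
        synchronized (LOCK) {
            remote.configure(testTree, host);
        }
    }

    /** Starts one sender thread per host; returns the peak number of overlapping configure() calls. */
    static int maxConcurrentSends(List<String> hosts) {
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        RemoteStub stub = (tree, h) -> {
            peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
            Thread.sleep(20); // simulate the time spent serializing the tree
            inFlight.decrementAndGet();
        };
        List<Thread> senders = new ArrayList<>();
        for (String host : hosts) {
            Thread t = new Thread(() -> {
                try {
                    send(stub, new Object(), host);
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
            });
            senders.add(t);
            t.start();
        }
        try {
            for (Thread t : senders) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrent sends: "
                + maxConcurrentSends(List.of("server1", "server2")));
    }
}
```

Because the lock is held only around the send, the test run itself is not serialized, which is consistent with the low performance impact expected above.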
Comment 5 Sebb 2010-03-06 15:17:59 UTC
Thanks very much for the thread dump and patch.
It's not obvious why the deadlock occurs, because each remote host has its own ClientJMeterEngine.

Do you have a simpler test case that shows the problem?

Also, do you use the -G options at all?

Seems to me that this could perhaps also cause a deadlock, as the ClientJMeterEngine code for this is fairly similar.

Perhaps you could try a test with your patch plus a dummy -G option, e.g. just add

-Gdummy=1

to the client startup.

If this also causes deadlocks, then the remote.setProperties() call probably needs to be moved into the synch. block as well.

Could you also please try:

remote.configure(testTree, host);
instead of
remote.configure(test, host);

without the synch. block?

Might also be worth trying synchronising on (testTree) with the above change.
Comment 6 Sebb 2010-03-06 15:19:04 UTC
By the way, which version of JMeter are you using?

Also, which OS?

These are listed at the start of jmeter.log.
Comment 7 Philippe Mouawad 2010-03-06 15:32:15 UTC
The issue occurs on the trunk version of JMeter, on which I applied the patch.
It also occurred on 2.4.

Concerning the OS, I thought I had answered that. I tested on:
- Mac OS X as Controller, 1 server on Windows XP, 1 server on Windows 2000
- Linux Red Hat as Controller, 2 servers on Linux
- Mac OS X as Controller, 2 servers on Mac OS X


Concerning the test case, you can use Tomcat 6 as the server to run it; I created it specially to submit the bug so that you can reproduce it.
I think the bug occurs when the scenario is complex, like the one I created.

No -G option at all.

The deadlock occurs only on the master controller, which sends its config to the 2 other servers.
As I understand it, this send is done by 2 threads, one for each remote server.

I will try to do what you are asking for as soon as possible.

Cordially.
Philippe
http://www.ubik-ingenierie.com
Comment 8 Philippe Mouawad 2010-03-06 16:07:37 UTC
Created attachment 25095 [details]
Cleaned up Test
Comment 9 Philippe Mouawad 2010-03-06 16:14:31 UTC
Hello,
I tested:
remote.configure(testTree, host);
instead of
remote.configure(test, host);

Same behaviour, because testTree is the same reference as test, so the same object is locked.

Philippe.
Comment 10 Philippe Mouawad 2010-03-06 16:15:11 UTC
Forget my previous comment
Comment 11 Philippe Mouawad 2010-03-06 16:21:03 UTC
Created attachment 25096 [details]
Deadlock Scenario
Comment 12 Philippe Mouawad 2010-03-06 16:25:51 UTC
Hello Sebb,
I just tested removing synchronized with 
remote.configure(testTree, host);
instead of
remote.configure(test, host);


The deadlock does not occur. My understanding of your modification is that it cleans up the HashTree a bit, removing Swing elements and others that are useless for the test.
But the change is more significant than my hack, so more tests may be required than my small ones.
I will try it on my real-life scenario.

Philippe
http://www.Ubik-ingenierie.com
Comment 13 Philippe Mouawad 2010-03-06 16:35:20 UTC
Created attachment 25097 [details]
Patch that solves issue with -G option
Comment 14 Philippe Mouawad 2010-03-06 16:37:01 UTC
Hello Sebb,
I tested with -Gdummy=1, and the deadlock occurs with your change and mine.
So I applied your change:
    remote.configure(testTree, host);

And added the following change for the properties:
    Properties clonedProperties = null;
    synchronized (LOCK) {
        clonedProperties = (Properties) savep.clone();
    }
    remote.setProperties(clonedProperties);

I cloned savep so that each thread has its own copy.

I tested and everything is OK.

Cordially
Philippe Mouawad
http://www.ubik-ingenierie.com
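
The cloning step above can be sketched in isolation as follows. This is only an illustration: the `PropertiesSnapshot` class and its `snapshot` helper are hypothetical names, and the `LOCK` object here is a local stand-in for the one in the snippet above. `Properties.clone()` is the standard JDK method; it copies the entries, so a thread that changes its copy does not affect the shared object.

```java
import java.util.Properties;

final class PropertiesSnapshot {
    // Stand-in for the static LOCK used in the snippet above (an assumption of this sketch).
    private static final Object LOCK = new Object();

    /** Returns a private copy of the shared properties, taken while holding LOCK. */
    static Properties snapshot(Properties shared) {
        synchronized (LOCK) {
            return (Properties) shared.clone();
        }
    }

    public static void main(String[] args) {
        Properties savep = new Properties();
        savep.setProperty("dummy", "1");

        Properties copy = snapshot(savep);
        copy.setProperty("dummy", "2"); // changes only the per-thread copy

        System.out.println("shared=" + savep.getProperty("dummy")
                + " copy=" + copy.getProperty("dummy"));
    }
}
```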
Comment 15 Sebb 2010-03-06 16:56:48 UTC
I've been able to reproduce the problem with a slightly trimmed version of the test plan.

Using 

remote.configure(testTree, host);
instead of
remote.configure(test, host);

does not stop the deadlocks.

Nor does synchronizing on testTree, however using a static lock object does work for me.

The -G option does not seem to cause deadlocks.

I'm not yet sure why there can be a deadlock; the code seems to create a separate copy of the test tree for each client. Perhaps one of the nested data structures is not being copied. If so, then fixing that should prevent the deadlock.

However this may be very tricky to find; in the meantime your original suggested patch would prevent the deadlocks.

I don't think it's necessary to synch. for the -G option.
Comment 16 Sebb 2010-03-06 18:58:54 UTC
Created attachment 25099 [details]
Simpler test case - does not need Tomcat
Comment 17 Philippe Mouawad 2010-03-06 19:10:48 UTC
Created attachment 25100 [details]
Patch to deadlock

Here is a patch that:
- synchronizes the sending of the configuration to avoid the deadlock
- clones the properties to avoid locking issues (and removes the synchronization block, since the Properties object is only read)


You are right: there are 2 different references to the test HashTree, but somewhere they share the same object.


Philippe
http://wwww.ubik-ingenierie.com
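
To illustrate how two distinct trees can still share an object, here is a generic sketch using plain JDK collections. It is not based on JMeter's HashTree implementation, and the thread does not identify which nested object is actually shared; the `SharedNestedState` class and its helpers are hypothetical. A copy that duplicates only the top level keeps pointing at the same nested objects, so two threads working on "separate" copies can still contend for the same monitor.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SharedNestedState {
    /** Copies only the top-level map; the nested lists are still the originals. */
    static Map<String, List<String>> shallowCopy(Map<String, List<String>> tree) {
        return new HashMap<>(tree);
    }

    /** Copies the map and every nested list, so nothing is shared. */
    static Map<String, List<String>> deepCopy(Map<String, List<String>> tree) {
        Map<String, List<String>> copy = new HashMap<>();
        for (Map.Entry<String, List<String>> e : tree.entrySet()) {
            copy.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        return copy;
    }

    /** True if both trees hold the very same nested object under the given key. */
    static boolean sharesNested(Map<String, List<String>> a, Map<String, List<String>> b, String key) {
        return a.get(key) == b.get(key);
    }

    public static void main(String[] args) {
        Map<String, List<String>> test = new HashMap<>();
        test.put("samplers", new ArrayList<>(List.of("HTTP Request")));

        System.out.println("shallow copy shares nested list: "
                + sharesNested(test, shallowCopy(test), "samplers"));
        System.out.println("deep copy shares nested list: "
                + sharesNested(test, deepCopy(test), "samplers"));
    }
}
```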
Comment 18 Philippe Mouawad 2010-03-06 19:22:20 UTC
(In reply to comment #15)
> I've been able to reproduce the problem with a slightly trimmed version of the
> test plan.
> 
> Using 
> 
> remote.configure(testTree, host);
> instead of
> remote.configure(test, host);
> 
> does not stop the deadlocks.
> 
> Nor does synchronizing on testTree, however using a static lock object does
> work for me.
> 
> The -G option does not seem to cause deadlocks.
> 
> I'm not yet sure why there can be a deadlock; the code seems to create a
> separate copy of the test tree for each client. Perhaps one of the nested data
> structures is not being copied. If so, then fixing that should prevent the
> deadlock.
> 
> However this may be very tricky to find; in the meantime your original
> suggested patch would prevent the deadlocks.
> 
> I don't think it's necessary to synch. for the -G option.

Hello,
From my investigations:
- synchronizing on testTree doesn't work because there are 2 different instances of test
- the following change doesn't work either, because the assignment HashTree testTree = test makes test and testTree the same object:
> remote.configure(testTree, host);
> instead of
> remote.configure(test, host);

Philippe.
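
Both points can be checked with a small experiment. The sketch below is an illustration with hypothetical names (`LockIdentityDemo`, `canEnterWhileHeld`), using plain `Object` instances in place of HashTree: an alias shares the monitor of the original, whereas two distinct instances have independent monitors and therefore give no mutual exclusion between the threads.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class LockIdentityDemo {
    /**
     * Holds the monitor of 'held' and reports whether another thread manages to
     * enter synchronized(wanted) within 500 ms in the meantime.
     */
    static boolean canEnterWhileHeld(Object held, Object wanted) {
        CountDownLatch entered = new CountDownLatch(1);
        Thread other = new Thread(() -> {
            synchronized (wanted) {
                entered.countDown();
            }
        });
        try {
            boolean result;
            synchronized (held) {
                other.start();
                result = entered.await(500, TimeUnit.MILLISECONDS);
            }
            other.join(); // once 'held' is released the other thread can always finish
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object test = new Object();
        Object testTree = test;        // alias: same object, same monitor
        Object otherCopy = new Object(); // distinct instance: independent monitor

        System.out.println("enter via alias while held: " + canEnterWhileHeld(test, testTree));
        System.out.println("enter via distinct instance while held: " + canEnterWhileHeld(test, otherCopy));
    }
}
```

This is why a single static lock object, shared by all threads, is needed to serialize the sends.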
Comment 19 Sebb 2010-03-06 21:39:28 UTC
Applied patch:

URL: http://svn.apache.org/viewvc?rev=919850&view=rev
Log:
Bug 48350 - Deadlock on distributed testing with 2 clients

Modified:
   jakarta/jmeter/trunk/src/core/org/apache/jmeter/engine/ClientJMeterEngine.java
   jakarta/jmeter/trunk/xdocs/changes.xml

As far as I can make out, there's no need to clone the properties.
Comment 20 Sebb 2010-03-06 21:39:58 UTC
Thanks again for the report and fix.
Comment 21 The ASF infrastructure team 2022-09-24 20:37:44 UTC
This issue has been migrated to GitHub: https://github.com/apache/jmeter/issues/2319