Bug 63565 - If Controller: GC issue with JMeter during the endurance run when using with "Interpret Condition as Variable Expression?" unchecked (highlight in documentation issue)
Summary: If Controller: GC issue with JMeter during the endurance run when using with ...
Status: RESOLVED FIXED
Alias: None
Product: JMeter
Classification: Unclassified
Component: Main (show other bugs)
Version: 2.13
Hardware: PC Solaris
: P2 normal (vote)
Target Milestone: JMETER_5.2
Assignee: JMeter issues mailing list
URL:
Keywords: FixedInTrunk
Depends on:
Blocks:
 
Reported: 2019-07-16 11:17 UTC by Mohamed Ibrahim
Modified: 2019-07-28 08:19 UTC (History)
2 users (show)



Attachments
JmeterGCLog (884.20 KB, application/x-gzip)
2019-07-16 11:17 UTC, Mohamed Ibrahim
Details
GCGraph (310.40 KB, image/png)
2019-07-16 11:22 UTC, Mohamed Ibrahim
Details
jmeter log (12.35 KB, text/plain)
2019-07-16 13:17 UTC, Mohamed Ibrahim
Details
Contents (17.75 KB, text/plain)
2019-07-16 13:18 UTC, Mohamed Ibrahim
Details
Right_JmeterLog (14.53 KB, application/x-gzip)
2019-07-16 14:13 UTC, Mohamed Ibrahim
Details
JMX Schematic View (6.51 KB, text/html)
2019-07-17 08:40 UTC, Mohamed Ibrahim
Details
Histogram (35.21 KB, application/x-gzip)
2019-07-17 11:31 UTC, Mohamed Ibrahim
Details
Jmeter Log (14.47 KB, application/x-gzip)
2019-07-17 15:29 UTC, Mohamed Ibrahim
Details
JMX test plan (27.63 KB, application/xml)
2019-07-17 22:16 UTC, Mohamed Ibrahim
Details
Fix try1 (27.96 KB, application/xml)
2019-07-18 07:06 UTC, UbikLoadPack support
Details
Latest GC graph after Jexl change (230.56 KB, image/png)
2019-07-23 12:45 UTC, Mohamed Ibrahim
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Mohamed Ibrahim 2019-07-16 11:17:40 UTC
Created attachment 36656 [details]
JmeterGCLog

Hi,

We have been testing couchbase DB performance with Jmeter-5.1 and we are experiencing the GC issue(YGC). we didn't hit with with a memory leak and our Tenured space is completely flat throughout the run.

Machine Spec:
 GNU/Linux x86_64 (64 bit)
 RAM ~16 GB
 Disk 60 GB
 Processor 6 Core

Jmeter Info:
 Version: 5.1
 Heap Size: 6 GB
 GC: G1 (Default) 

1. Run was conducted for 4 hours
2. 1 Load generator was used to push data to Couchbase database
3. The 95th percentile RT is 1ms.
4. Our Objects are short lived and none of it gets promoted to Tenured space. 
5. We see a increased time consumption of Young GC time during the endurance run
6. Exactly after 2.30 hours of run, The G1 reallocates the memory size within the heap. i.e, it shrinks the Young generation and Increases the Tenure space 
7. From there on we see a spike in our RT. We don't see any issue with Couchbase side but with the Jmeter on clearing the objects time. Its exponentially growing along with the run
8. We have executed one more round of test by enabling the verbose gc log. What we found is that most of the time were spent in the Ref Proc.

Attaching the screenshot as well as the gc log for your reference.
Comment 1 Mohamed Ibrahim 2019-07-16 11:22:14 UTC
Created attachment 36657 [details]
GCGraph

Adding the GC graph for the issue.
Comment 2 Philippe Mouawad 2019-07-16 12:26:42 UTC
Are you using a 3rd party plugin ?
Can you share:

- jmeter.log
- content (list as text) of lib and lib/ext folders
- how you launch your process : ps -eaf|grep java

Thank you
Comment 3 Mohamed Ibrahim 2019-07-16 13:17:56 UTC
Created attachment 36658 [details]
jmeter log
Comment 4 Mohamed Ibrahim 2019-07-16 13:18:53 UTC
Created attachment 36659 [details]
Contents

adding the long listing of contents within the lib directory
Comment 5 Mohamed Ibrahim 2019-07-16 13:20:38 UTC
Hi Philippe,

I have attached the jmeter.log and the contents list in the attachment. please find it.


and for the process,

295907272  807   773 99 03:58 pts/0    07:43:09 java -server -XX:+HeapDumpOnOutOfMemoryError -Xms6144m -Xmx6144m -Xss512k -XX:MaxMetaspaceSize=256m -XX:+UseG1GC -XX:MaxGCPauseMillis=250 -XX:G1ReservePercent=20 -Djava.security.egd=file:/dev/urandom -Dplugin_dependency_paths=./apache-jmeter/bin/../di/lib/ -Dsearch_paths=./apache-jmeter/bin/../di/plugins/ -jar ./apache-jmeter/bin/ApacheJMeter.jar -n -t CBStringDocument.jmx -j logs/jmeter_lg017_2019-07-16_0358.log -l Enduranceresults/Endurance_lg017_2019-07-16_0358.jtl -q Endurance.properties
Comment 6 UbikLoadPack support 2019-07-16 13:23:55 UTC
Thanks , one last thing.
Which Sampler/Listener are you using in your test ?

If you can attach it that would help
Comment 7 Philippe Mouawad 2019-07-16 13:26:23 UTC
In your report you say you're using JMeter 5.1, but in jmeter.log you're using Version 3.2 r1790748 which is very old and contains fixed memory leaks.

Please upgrade your configuration to 5.1.1 and provide last information requested by UbikLoadPack.

Thanks
Comment 8 Mohamed Ibrahim 2019-07-16 13:38:02 UTC
We are using 5.1 and I am very sure about it. I guess it's also a bug with 5.1.


And we are using Java sampler for pushing the data to the couchbase. We have internally created the plugin for pushing the data using Java SDK couchbase client(v2.7.7)



Also, We have executed the same round of test by downgrading the Jmeter to Jmeter-3.3.4 .


1. We didn't observe the memory re-allocation within the heap after 2.30 hours (out of 4 hours) and its very stable with memory and TPS as well.

2. We could still observe the GC time is exponentially growing for that version of Jmeter as well. If you want I can share that information as well
Comment 9 Philippe Mouawad 2019-07-16 13:56:08 UTC
(In reply to Mohamed Ibrahim from comment #8)
> We are using 5.1 and I am very sure about it. I guess it's also a bug with
> 5.1.

Then please provide the correct jmeter.log
The one you attached has this:
2019-05-24 05:27:56,395 INFO o.a.j.JMeter: Version 3.2 r1790748
2019-05-24 05:27:56,395 INFO o.a.j.JMeter: java.version=1.8.0_05

> 
> 
> And we are using Java sampler for pushing the data to the couchbase. We have
> internally created the plugin for pushing the data using Java SDK couchbase
> client(v2.7.7)
> 
We would need your code as most probably leak is in this part of code.
To confirm it, once you have run the test and reached the limit, run 

jmeter/bin/heapdump.sh 

and share the hprof generated
> 
> 
> Also, We have executed the same round of test by downgrading the Jmeter to
> Jmeter-3.3.4 .
> 
> 
> 1. We didn't observe the memory re-allocation within the heap after 2.30
> hours (out of 4 hours) and its very stable with memory and TPS as well.
> 
> 2. We could still observe the GC time is exponentially growing for that
> version of Jmeter as well. If you want I can share that information as well
Comment 10 Mohamed Ibrahim 2019-07-16 14:12:43 UTC
Hi Phillipe,

Sorry It's my mistake.. I apologise, I found the right log. attached it to the attachments


Sharing the code is not an issue. Just a clarification needed from your side.
1. If our sampler code is leaking, It would have kept the objects for longer time. If you have gone through the gc log, It would be very clear. There is no much objects has been promoted to tenure space.

2. There would be increase in heap memory for storing referenced objects. We didn't observe that pattern here in our graph.

3. In the Young gen space of 3.x GB, Newly created objects were completely occupying the space and its getting cleared off every-time when it reaches the threshold. 

4. The problem here is for clearing the Objects in the Young gen is kept on increasing. If you again have a look at gc log, there is a section called Other which has a section called Ref proc. That is kept on increasing all over the time during the run.


I will share you the code as well at the earliest. Thanks!
Comment 11 Mohamed Ibrahim 2019-07-16 14:13:34 UTC
Created attachment 36660 [details]
Right_JmeterLog
Comment 12 Felix Schumacher 2019-07-16 14:19:55 UTC
As hprof files tend to be rather large and might contain confidential information, you could opt for a class histogram for first diagnosis.

You can use jcmd to optain such a histogram. See https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/tooldescr006.html or google for more information.
Comment 13 Philippe Mouawad 2019-07-16 18:34:12 UTC
(In reply to Mohamed Ibrahim from comment #10)
> Hi Phillipe,
> 
> Sorry It's my mistake.. I apologise, I found the right log. attached it to
> the attachments
> 
> 
> Sharing the code is not an issue. Just a clarification needed from your side.
> 1. If our sampler code is leaking, It would have kept the objects for longer
> time. If you have gone through the gc log, It would be very clear. There is
> no much objects has been promoted to tenure space.
> 
> 2. There would be increase in heap memory for storing referenced objects. We
> didn't observe that pattern here in our graph.
> 
> 3. In the Young gen space of 3.x GB, Newly created objects were completely
> occupying the space and its getting cleared off every-time when it reaches
> the threshold. 
> 
> 4. The problem here is for clearing the Objects in the Young gen is kept on
> increasing. If you again have a look at gc log, there is a section called
> Other which has a section called Ref proc. That is kept on increasing all
> over the time during the run.
> 
> 
> I will share you the code as well at the earliest. Thanks!

I didn't go through gc logs yet, I am waiting for :
- code
- JMX plan if possible or at least run Tools > Schematic view
- hprof
Comment 14 Mohamed Ibrahim 2019-07-17 08:40:20 UTC
Created attachment 36662 [details]
JMX Schematic View

Added Schematic view of the JMX
Comment 15 Felix Schumacher 2019-07-17 09:52:53 UTC
Can you try to use jexl3 for the condition?

Use '${__jexl3(${getStatus} == "204",)}' instead of '"${getStatus}" == "204"'.

Does it make a difference?
Comment 16 Mohamed Ibrahim 2019-07-17 11:31:13 UTC
Created attachment 36663 [details]
Histogram

was not able to print live histo alone.. Tried -F option as well but seems line -F options was not supported with -histo:live
Comment 17 Mohamed Ibrahim 2019-07-17 15:29:01 UTC
Created attachment 36664 [details]
Jmeter Log
Comment 18 Felix Schumacher 2019-07-17 16:08:23 UTC
(In reply to Mohamed Ibrahim from comment #16)
> Created attachment 36663 [details]
> Histogram
> 
> was not able to print live histo alone.. Tried -F option as well but seems
> line -F options was not supported with -histo:live

There are many char and byte arrays and a lot of jdk.nashorn.internal.runtime.CompiledFunction instances. Have you had a chance to try out the jexl3 code I gave you in an earlier reply?
Comment 19 Mohamed Ibrahim 2019-07-17 22:11:45 UTC
Felix,

Didn't tried with jexl but we have done with groovy to evaluate the function. The behaviour remains the same
Comment 20 Mohamed Ibrahim 2019-07-17 22:16:23 UTC
Created attachment 36667 [details]
JMX test plan

We are not authorised to share the Java Jmx file. Instead we have created the same using JSR223 sampler. We observe the same behaviour with this test plan as well.

You need to add the couchbase Java-SDK-client-2.7.7(https://mvnrepository.com/artifact/com.couchbase.client/java-client/2.7.7) to run it.

Hope this helps !!
Comment 21 UbikLoadPack support 2019-07-18 07:06:50 UTC
Created attachment 36668 [details]
Fix try1

Hello,
Can you try the attached test plan.
If it still shows same issue, then please make a hprof file using jmap and share it.

Thank you
Comment 22 Mohamed Ibrahim 2019-07-18 11:24:27 UTC
Hello Philippe,

I don't see any changes made to the script but the IfController condition with jexl3 (${__jexl3("${getStatus}" == "204",)})

Anyways, I will execute a round of test keeping this changes and come back with the outcome.

Thanks
Mohamed Ibrahim
Comment 23 Philippe Mouawad 2019-07-18 11:29:09 UTC
(In reply to Mohamed Ibrahim from comment #22)
> Hello Philippe,
> 
> I don't see any changes made to the script but the IfController condition
> with jexl3 (${__jexl3("${getStatus}" == "204",)})
> 
> Anyways, I will execute a round of test keeping this changes and come back
> with the outcome.
> 
> Thanks
> Mohamed Ibrahim

And checkbox checked « interpret condition as variable expression « 
Please do you test with this plan and if issue occurs  generate hprof.

Thanks
Comment 24 Mohamed Ibrahim 2019-07-18 12:56:25 UTC
Sure, I will execute a run. If i see the issue, I will generate heapdump and come back
Comment 25 Philippe Mouawad 2019-07-23 07:00:49 UTC
(In reply to Mohamed Ibrahim from comment #24)
> Sure, I will execute a run. If i see the issue, I will generate heapdump and
> come back

Any update ?

Thanks
Comment 26 Mohamed Ibrahim 2019-07-23 12:44:12 UTC
Hello Felix/Phillippe,

I have performed multiple round of endurance run with the jexl change in the script.

And I dint't observe any YGC time issue and there were no drop in TPS. Attached the latest graph screenshot which shows the normal behaviour.


Thanks a lot for all your help.. 


Thanks,
Mohamed Ibrahim
Comment 27 Mohamed Ibrahim 2019-07-23 12:45:21 UTC
Created attachment 36680 [details]
Latest GC graph after Jexl change

Latest GC graph after Jexl change in script
Comment 28 Philippe Mouawad 2019-07-23 19:05:29 UTC
(In reply to Mohamed Ibrahim from comment #27)
> Created attachment 36680 [details]
> Latest GC graph after Jexl change
> 
> Latest GC graph after Jexl change in script

Thanks for feedback.

@Felix, do you think we can improve something here ?
Comment 29 Philippe Mouawad 2019-07-28 08:19:12 UTC
This is an automated email from the ASF dual-hosted git repository.

pmouawad pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/jmeter.git


The following commit(s) were added to refs/heads/master by this push:
     new 0f8a9e2  Bug 63565 - If Controller: GC issue with JMeter during the endurance run when using with "Interpret Condition as Variable Expression?" unchecked (highlight in documentation issue)
0f8a9e2 is described below

commit 0f8a9e2e8b4a58907467ca2f6a38a9be65a35fc3
Author: pmouawad <p.mouawad@ubik-ingenierie.com>
AuthorDate: Sun Jul 28 10:18:25 2019 +0200

    Bug 63565 - If Controller: GC issue with JMeter during the endurance run
    when using with "Interpret Condition as Variable Expression?" unchecked
    (highlight in documentation issue)

    This resolves Bug 63565
     https://bz.apache.org/bugzilla/show_bug.cgi?id=63565
---
 xdocs/usermanual/component_reference.xml | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)