Bug 53748 - timer for the access log sampler
Summary: timer for the access log sampler
Status: NEEDINFO
Alias: None
Product: JMeter
Classification: Unclassified
Component: HTTP
Version: 2.7
Hardware: All All
Importance: P4 enhancement
Target Milestone: ---
Assignee: JMeter issues mailing list
URL:
Keywords: PatchAvailable
Depends on:
Blocks:
 
Reported: 2012-08-20 20:39 UTC by ejaenv
Modified: 2019-05-15 13:13 UTC



Attachments
contains the code of access log timer, plan generator, and tests (153.79 KB, patch)
2012-08-20 20:39 UTC, ejaenv
design details and screenshots (498.51 KB, application/octet-stream)
2012-08-21 02:45 UTC, ejaenv
writtable documentation (550.85 KB, application/vnd.oasis.opendocument.presentation)
2013-06-15 13:54 UTC, ejaenv
Updated patch (168.77 KB, patch)
2013-06-15 20:57 UTC, Philippe Mouawad

Description ejaenv 2012-08-20 20:39:34 UTC
Created attachment 29255 [details]
contains the code of access log timer, plan generator, and tests

The current access log sampler has two drawbacks: a) the request rate doesn't reproduce that of the access log, as the sampler doesn't consider the delays between requests, and b) the concurrency isn't realistic either, as there is no distinction of which IP sends each request.

This enhancement adds two components: a timer for the access log sampler, and a generator that creates a test plan for the access log, with one thread group for each IP.

With these components, replaying an access log is now much more realistic, as samples are sent respecting the delays in the access log, and each IP is represented by a different thread group, so the real concurrency of the access log is replayed.

With the generator you can also generate a test plan for a specific time range of the access log, in case you don't want to replay the whole log.
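
To illustrate the idea (this is not the code in the attached patch), here is a minimal Python sketch of how per-IP delays could be derived from a log. It assumes Common Log Format lines in chronological order; `parse_line`, `delays_per_ip` and the optional `start`/`end` range are hypothetical names introduced only for this example.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumption: Common Log Format, client IP first, timestamp in square brackets.
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\]')
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_line(line):
    """Return (ip, timestamp) for one log line, or None if it doesn't match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    return match.group(1), datetime.strptime(match.group(2), TIME_FORMAT)


def delays_per_ip(lines, start=None, end=None):
    """Map each IP to the delay in seconds to wait before each of its requests.

    The first delay of an IP is measured from the earliest retained entry,
    so the IPs keep their relative offsets; entries outside [start, end]
    are skipped.
    """
    times = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ip, when = parsed
        if start is not None and when < start:
            continue
        if end is not None and when > end:
            continue
        times[ip].append(when)
    if not times:
        return {}
    origin = min(stamps[0] for stamps in times.values())
    result = {}
    for ip, stamps in times.items():
        previous = [origin] + stamps[:-1]
        result[ip] = [(cur - prev).total_seconds()
                      for prev, cur in zip(previous, stamps)]
    return result
```

Each key of the returned dictionary would correspond to one thread group in the generated plan, and each list to the waits a timer would apply before that IP's successive requests.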

I attach the patch here, and I will send a PDF with the design details and some screenshots to the dev mailing list.
Comment 1 ejaenv 2012-08-21 02:45:14 UTC
Created attachment 29256 [details]
design details and screenshots

I attach the PDF here.
Comment 2 Philippe Mouawad 2013-05-05 14:43:39 UTC
Hello,
Thanks for your contribution and sorry for the very late feedback.

A few questions regarding the Access Log Timer:
- Why read lines from the access log into memory instead of reading them from the file?
- There is something that's not clear to me: does the Access Log Timer assume that the access log file has been split into as many files as there are IPs, and that only the log entries for the Thread Group under which the Timer is located are passed to it? If not, the delay computation seems odd to me. Could you clarify?
- Could you submit the PDF in a writable format?


Thanks
Comment 3 ejaenv 2013-05-08 06:40:11 UTC
Hi Philippe! I'll answer in a few days. Cheers
Comment 5 ejaenv 2013-06-15 13:54:38 UTC
Created attachment 30434 [details]
writtable documentation
Comment 6 ejaenv 2013-06-15 13:57:33 UTC
Hi Philippe,

Sorry for my late reply too!

Thanks for your questions. I hope to clarify them:

- Why read lines from the access log into memory instead of reading them from the file?

This is needed to keep the delays at runtime as close as possible to those in the log file. Reading the lines from disk caused delay deviations because of the higher disk latency.

- There is something that's not clear to me: does the Access Log Timer assume that the access log file has been split into as many files as there are IPs, and that only the log entries for the Thread Group under which the Timer is located are passed to it? If not, the delay computation seems odd to me. Could you clarify?

Yes, the generator (1) splits the log file in the way you describe, and also generates the JMeter test plan file. Keep in mind that this computation is done before running JMeter, in a preparation phase.

(1) The generator is currently implemented as a JUnit test, but the idea is to have a command-line tool.
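
As a rough sketch of that preparation phase (in Python for brevity, not the generator from the patch), the split could look like the following. The function name `split_by_ip`, the `access_<ip>.log` file naming, and the assumption that the client IP is the first whitespace-separated field are all made up for this example; generating the test plan itself is left out.

```python
from collections import defaultdict
from pathlib import Path


def split_by_ip(log_path, out_dir):
    """Write one file per client IP and return {ip: path of that IP's file}."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    lines_by_ip = defaultdict(list)
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            if line.strip():
                # Assumption: the client IP is the first field of the line.
                lines_by_ip[line.split(maxsplit=1)[0]].append(line)
    paths = {}
    for ip, lines in lines_by_ip.items():
        # ':' (IPv6) is replaced so the file name is valid on any filesystem.
        path = out_dir / f"access_{ip.replace(':', '_')}.log"
        path.write_text("".join(lines), encoding="utf-8")
        paths[ip] = path
    return paths
```

Each resulting file would then be the input of the sampler and timer placed under the thread group for that IP.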

- Could you submit the PDF into a writable format ?

I attach it.


Best regards,
/Enric
Comment 7 Philippe Mouawad 2013-06-15 20:30:44 UTC
(In reply to ejaenv from comment #6)
> Hi Philippe,
> 
> Sorry me too! 
> 
> Thanks for your questions. I hope to clarify them:
> 
> - Why read lines from access log into memory instead of reading them from
> file ?
> 
> This is needed to keep at runtime the delays as similar as in the log file.
> Reading them from disc caused delay deviations due to disc major latency.
> 
In this case, I think it would be better to parse the time while reading and store only the time instead of the whole sampler.
I also wonder whether it would be better to read from the file, to decrease memory usage.

> - There is something that's not clear for me, does Access Log TImeer suppose
> that access log file has been split into as many file as IPs and that  only
> the logs concerning the Thread Group under which Timer is located are passed
> to it ? because if not, it seems to me delay computation is weird ? could
> you clarify ?
> 
> yes, the generator (1) splits the log file in the way you say, and also
> generates the jmeter plan file. Keep in mind that this computation is done
> before running jmeter, in a preparation phase.
> 
> (1) the generator is implemented actually as a junit test, but the idea is
> to have a command tool.
> 
> - Could you submit the PDF into a writable format ?
> 
> I attach it.
> 
> 
> Best regards,
> /Enric


Could you join the dev mailing list? I started a discussion there titled "AccessLogSampler & Bug 53748".
Comment 8 Philippe Mouawad 2013-06-15 20:57:16 UTC
Created attachment 30440 [details]
Updated patch

Attached is the updated patch with the following:
- Some code cleanup to follow naming conventions, among other things
- AccessLogTimer now reads only the times into memory instead of the samples' content
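
For readers who don't want to open the patch, the "times only" idea can be sketched as follows. This is a Python illustration under assumptions, not the actual AccessLogTimer code: the class name `AccessLogDelays`, its `delay()` method, and the Common Log Format timestamp layout are hypothetical.

```python
import re
from datetime import datetime

# Assumption: the timestamp is the first [...] group, in Common Log Format.
TIMESTAMP_RE = re.compile(r'\[([^\]]+)\]')
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


class AccessLogDelays:
    """Keeps only one integer per log entry (a delay in ms), not the entry."""

    def __init__(self, lines):
        self._delays = []
        previous = None
        for line in lines:  # 'lines' may be an open file, read line by line
            match = TIMESTAMP_RE.search(line)
            if match is None:
                continue
            current = datetime.strptime(match.group(1), TIME_FORMAT)
            if previous is None:
                self._delays.append(0)  # nothing to wait for before the first request
            else:
                elapsed_ms = int((current - previous).total_seconds() * 1000)
                self._delays.append(max(0, elapsed_ms))  # guard against out-of-order entries
            previous = current
        self._position = 0

    def delay(self):
        """Return the next delay in ms, or 0 once the log is exhausted."""
        if self._position >= len(self._delays):
            return 0
        value = self._delays[self._position]
        self._position += 1
        return value
```

Passing an open file as `lines` streams the log once, so memory holds only the list of delays while the timing at runtime does not depend on disk reads.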
Comment 9 Sebb 2013-07-04 12:39:42 UTC
I'm not sure why the access log needs to be rewritten; surely the sampler can just read through the file until it finds the correct entry? Each sampler would need to know the IP address for which it is responsible.

If the access log does need to be reformatted, it should be done as a separate stage before starting the test proper, and the output should probably be reformatted to make subsequent parsing easier.

I don't think this should be added to JMeter trunk without further analysis.
Maybe create an SVN branch so the feature can be tested further.
If it looks OK it can later be merged with trunk.
Comment 10 ejaenv 2013-07-04 13:38:45 UTC
Yes, this is a beta version and the design certainly needs to be discussed. As it is now:

> If the access log does need to be reformatted, it should be done as a separate
> stage before starting the test proper, and the output should probably be
> reformatted to make subsequent parsing easier.

Yes, this is done in a separate stage.

> I'm not sure why the access log needs to be rewritten, surely the sampler can
> just read through the file until it finds the correct entry? Each sampler
> would need to know the IP address for which it is responsible.

In the current plugin design, the log is preprocessed:

1. to know in advance how many thread groups the test plan will have (one per IP)

2. to know how many requests each IP will send, and close the thread group when it has finished.

Rewriting the access log is done for efficiency, to avoid having each sampler read the whole log file. Apart from this, yes, it's possible for a sampler to work directly with the original file.


>I don't think this should be added to JMeter trunk without further analysis.
>Maybe create an SVN branch so the feature can be tested further.
>If it looks OK it can later be merged with trunk.

I agree! A design without preprocessing is probably better, where a master sampler reads the log and dynamically creates slave thread groups on demand. This would also solve the file handler problem you pointed out earlier.
Comment 11 Sebb 2013-07-04 15:14:06 UTC
Separate Thread Groups are not essential to ensure concurrency; separate threads are sufficient.

Also, it's not necessary to know in advance how many samples there are in each batch; a test element can cause its thread to stop when it has no more data to process.

I think the only advance knowledge that may be needed is how many different IPs there are, so sufficient threads can be started.
Comment 12 ejaenv 2013-07-04 16:13:51 UTC
But knowing the number of IPs is not necessary, as threads can be created on demand whenever the sampler reads a new IP.

What is needed is the number of requests per IP, so a thread can finish once it has sent all its requests.

The preprocessing could be done by the sampler itself before it starts creating threads.
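
Both figures discussed so far, the number of distinct IPs and the number of requests per IP, can be obtained in a single pass over the log. A minimal Python sketch, assuming the client IP is the first whitespace-separated field (the function name `summarize` is hypothetical):

```python
from collections import Counter


def summarize(lines):
    """Return (number of distinct IPs, {ip: number of requests})."""
    counts = Counter(
        line.split(maxsplit=1)[0]  # assumption: IP is the first field
        for line in lines
        if line.strip()
    )
    return len(counts), dict(counts)
```

The first value would tell how many threads to start, and the second how many requests each of them has to send before finishing.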
Comment 13 Sebb 2013-07-04 16:31:46 UTC
(In reply to ejaenv from comment #12)
> but knowing the number of IPs is not necessary as threads can be created in
> demand whenever the sampler reads a new IP.

Thread creation is relatively expensive, so not ideal to create one as part of the main test.
 
> what It's needed is the number of requests per IP so a thread can finalize
> once it has sent all its requests.

A thread can stop itself when it has nothing more to do.
 
> The preprocessing could be done by the sampler itself before it starts
> creating threads.

Samplers don't normally create threads in JMeter (except for downloading resources).
Comment 14 ejaenv 2013-07-04 19:21:39 UTC
So do you propose to create and start all the threads at the very start of the test?

How can you finalize the threads without knowing the number of requests to send? Are you proposing that all the threads live for the whole test?
Comment 15 Sebb 2013-07-04 21:25:38 UTC
(In reply to ejaenv from comment #14)
> so do you propose to create and start all the threads at the very test start?

Ideally, yes.

> how can you finalize the threads without knowing the number of requests to
> send? 

As already stated (twice), a thread can stop itself when it has no more samples to process.

> are you proposing that all the threads live the whole test?

No.
Comment 16 ejaenv 2013-07-05 05:04:49 UTC
>> how can you finalize the threads without knowing the number of requests to
>> send? 

> As already stated (twice), a thread can stop itself when it has no more samples
> to process.

How can the thread stop itself? I am assuming that the threads (samplers) don't read the log themselves, to avoid the file handler problem, but that there is a queue that feeds the samplers (as you pointed out). Someone has to notify them to stop, or tell them h
Comment 17 Sebb 2013-07-05 08:29:05 UTC
(In reply to ejaenv from comment #16)
>
> How can the thread stop itself? I am assuming that threads (samplers) don't
> read themselves the log to avoid the file handler problem, but that there is
> a queue that feed the samplers (as you pointed). Someone has to notify them
> to stop, or tell them h

For example, the queue can contain a special marker entry that denotes EOF.
This technique is used in AsynchSampleSender - see FINAL_EVENT.
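
The marker technique can be sketched like this. It is a Python illustration, not JMeter code: the `EOF` object, `worker` and `replay` are hypothetical names, and using one queue per IP is an assumption of this example. All threads are started up front and each one stops by itself when it reads the marker.

```python
import queue
import threading

EOF = object()  # hypothetical marker entry meaning "no more samples"


def worker(requests, sent):
    """Consume requests until the EOF marker arrives, then stop on its own."""
    while True:
        item = requests.get()
        if item is EOF:
            break
        sent.append(item)  # a real sampler would send the request here


def replay(entries):
    """entries: iterable of (ip, request). Returns {ip: requests it handled}."""
    entries = list(entries)
    queues, results, threads = {}, {}, []
    # Start one thread per IP before feeding any request.
    for ip, _ in entries:
        if ip not in queues:
            queues[ip] = queue.Queue()
            results[ip] = []
            thread = threading.Thread(target=worker, args=(queues[ip], results[ip]))
            thread.start()
            threads.append(thread)
    for ip, request in entries:
        queues[ip].put(request)
    # End of log: every queue gets the marker, so every thread terminates.
    for ip_queue in queues.values():
        ip_queue.put(EOF)
    for thread in threads:
        thread.join()
    return results
```

No thread needs to know its number of requests in advance; the producer only has to append the marker once the input is exhausted.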
Comment 18 ejaenv 2013-07-05 09:24:08 UTC
Sorry, I don't follow you. Suppose a 1-hour log where an IP sends just 1 request at time 0. Do you mean the thread should stop 1 hour later (at EOF), or right after sending the request? (I am fine with either option. For the second case, I just propose that, before requests start being sent, someone reads the whole log and configures the thread to stop after 1 request in this example.)
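
To make the second option concrete, here is a small Python sketch of it, under the assumption that a pre-pass over the log is acceptable. It only shows where a per-IP stop marker would be placed; `EOF` and `dispatch_plan` are hypothetical names.

```python
EOF = object()  # hypothetical per-thread stop marker


def dispatch_plan(entries):
    """entries: iterable of (ip, request), in log order.

    Returns the sequence of (ip, item) to feed to the per-IP queues, where
    the EOF marker for an IP comes right after that IP's last request.
    """
    entries = list(entries)
    # Pre-pass: remember the index of the last entry of every IP.
    last_index = {ip: index for index, (ip, _) in enumerate(entries)}
    plan = []
    for index, (ip, request) in enumerate(entries):
        plan.append((ip, request))
        if last_index[ip] == index:
            plan.append((ip, EOF))
    return plan
```

With the 1-hour log above, the IP that sends a single request at time 0 gets its marker immediately after that request, so its thread can stop at once instead of waiting for the end of the log.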