Bug 36755 - Characters in cyrillic need to be saved under unicode
Status: RESOLVED FIXED
Product: JMeter
Classification: Unclassified
Component: Main
Version: 2.1
Hardware: PC Windows XP
Importance: P2 major
Target Milestone: ---
Assigned To: JMeter issues mailing list
Reported: 2005-09-21 09:24 UTC by Horen Kirazyan
Modified: 2007-04-30 15:14 UTC (History)

Attachments
Suggested patch to load and save files in UTF-8 encoding (15.67 KB, patch)
2007-04-04 02:51 UTC, Alf Hogemark
Consistent and improved closing of file streams (13.05 KB, patch)
2007-04-30 06:09 UTC, Alf Hogemark

Description Horen Kirazyan 2005-09-21 09:24:38 UTC
I have the following problem: the values of most of the parameters I send with
the HTTP request are Cyrillic. Is there a way to save them as Unicode in the
test suite, so that the suite uses the Unicode encoding for them when it is
run? When they are not Unicode, the system being tested doesn't recognise
them. The only workaround is to paste the strings in every time I run the
test, from an independent program that encodes them as Unicode; I paste the
Unicode text, for example "Подпиши и изпрати", into the value...
Comment 1 Alf Hogemark 2007-02-27 04:30:23 UTC
The bug report does not say whether the problem is happening with an HTTP GET
request or an HTTP POST request.

If I use Unicode characters as the value of a parameter in an HTTP GET request,
and check the "encode" checkbox for that parameter, things seem to work fine
for me.
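
As a rough illustration of what encoding a parameter value means here, the
sketch below percent-encodes the phrase from the original report using its
UTF-8 bytes. It relies only on the standard java.net.URLEncoder and
URLDecoder classes. The class name EncodeParameterSketch is made up, and
whether JMeter's "encode" checkbox performs exactly this call is an
assumption. The phrase is written with Unicode escapes so that the source
file itself stays plain ASCII.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class EncodeParameterSketch {
    // Percent-encode a parameter value from the UTF-8 bytes of its characters.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to exist on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Reverse the encoding, again interpreting the bytes as UTF-8.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The Cyrillic example phrase from the bug description, as escapes.
        String value = "\u041F\u043E\u0434\u043F\u0438\u0448\u0438 "
                + "\u0438 \u0438\u0437\u043F\u0440\u0430\u0442\u0438";
        String encoded = encode(value);
        System.out.println(encoded);
        System.out.println("round trip ok: " + decode(encoded).equals(value));
    }
}
```

The encoded form contains only ASCII characters, so it is not affected by the
encoding that the test plan file is saved in.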

If you want to use Unicode characters when posting, then I think you need to
wait until bug 41705 is fixed. And if you want your test data in a CSV file,
you need to wait for bug 41704.

I tested with UTF-8 samples from http://www.columbia.edu/kermit/utf8.html
Comment 2 Alf Hogemark 2007-04-04 02:51:25 UTC
Created attachment 19912
Suggested patch to load and save files in UTF-8 encoding

The problem happens when saving the test plan. Currently, JMeter uses the
JRE's default file encoding to load and save files.

JMeter should rather use UTF-8, or another encoding specified in the
saveservice.properties file.
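
A minimal sketch of why the platform default matters, assuming a default
such as ISO-8859-1 that cannot represent Cyrillic (the reporter's actual
default encoding is not stated in this report). The class name
DefaultEncodingSketch is hypothetical; only standard JDK calls are used.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultEncodingSketch {
    // Encode the text to bytes and decode it again with the same charset,
    // which is what a save followed by a load amounts to.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // Cyrillic example phrase from the bug description, as escapes.
        String value = "\u041F\u043E\u0434\u043F\u0438\u0448\u0438 "
                + "\u0438 \u0438\u0437\u043F\u0440\u0430\u0442\u0438";

        System.out.println("platform default: " + Charset.defaultCharset());

        // ISO-8859-1 has no Cyrillic letters, so they are replaced on encode.
        System.out.println("ISO-8859-1 keeps text: "
                + roundTrip(value, StandardCharsets.ISO_8859_1).equals(value));

        // UTF-8 can represent every Unicode character.
        System.out.println("UTF-8 keeps text: "
                + roundTrip(value, StandardCharsets.UTF_8).equals(value));
    }
}
```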

This suggested patch adds methods in SaveService to load and save using
Input/OutputStream instead of InputReader/OutputWriter. SaveService then takes
care of instantiating the InputReader and OutputWriter using the specified file
encoding, which by default is UTF-8 in the saveservice.properties file.

The xml header "<?xml version="1.0" encoding="UTF-8"?>" is written at the start
of the files. Note that the xstream parser ignores this, see
http://xstream.codehaus.org/faq.html#XML. But it is useful if you open the
files in other editors / browsers.
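
The approach described above can be sketched roughly as follows. This is not
the code from the patch: the class EncodedSaveSketch, its methods and the
property key "sketch.file.encoding" are all hypothetical, and the reader and
writer are the standard java.io InputStreamReader and OutputStreamWriter
(presumably what the comment calls InputReader and OutputWriter). The caller
hands over raw streams, and the save/load code picks the encoding, writes the
XML header, and closes what it opened.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Properties;

public class EncodedSaveSketch {
    // Hypothetical property key; the real key name is not given in this report.
    static final String ENCODING_KEY = "sketch.file.encoding";

    // Use the configured encoding, falling back to UTF-8 when none is set.
    static String encodingFrom(Properties props) {
        return props.getProperty(ENCODING_KEY, "UTF-8");
    }

    // Write the XML header followed by the body, using the given encoding.
    static void save(String xmlBody, OutputStream out, String encoding) {
        try (Writer writer = new OutputStreamWriter(out, encoding)) {
            writer.write("<?xml version=\"1.0\" encoding=\"" + encoding + "\"?>\n");
            writer.write(xmlBody);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read the whole stream back as text, using the given encoding.
    static String load(InputStream in, String encoding) {
        try (Reader reader = new InputStreamReader(in, encoding)) {
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[4096];
            int count;
            while ((count = reader.read(buffer)) != -1) {
                text.append(buffer, 0, count);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        java.io.ByteArrayOutputStream out = new java.io.ByteArrayOutputStream();
        String encoding = encodingFrom(new Properties());
        save("<value>\u041F\u043E\u0434\u043F\u0438\u0448\u0438</value>", out, encoding);
        String loaded = load(new java.io.ByteArrayInputStream(out.toByteArray()), encoding);
        System.out.println(loaded.length() + " characters loaded back");
    }
}
```

Because the encoding is chosen in one place, the saved bytes no longer depend
on the default encoding of the machine that wrote the file.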

Note that I am closing the InputReader / OutputWriter in the SaveService
methods; this was not done previously. I have a habit of always closing things
I open.

I've looked at the places in the code where SaveService is used, and in about
half of the cases the stream sent in to SaveService is not closed by the
calling code. I think I will go through the different classes that use
SaveService, and make sure streams are closed in the same manner in all places.
But that will be another patch.
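
One way a caller can make sure its stream is always closed, even when the
save fails part way through, is sketched below. The names CloseStreamSketch,
TrackingStream and write are invented for the illustration and are not taken
from JMeter or from the patch; write merely stands in for a save method that
may throw.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class CloseStreamSketch {
    // In-memory stream that records whether close() was called on it.
    static class TrackingStream extends ByteArrayOutputStream {
        boolean closed = false;

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Hypothetical stand-in for a save method that can fail after writing.
    static void write(OutputStream out, byte[] data, boolean fail) throws IOException {
        out.write(data);
        if (fail) {
            throw new IOException("simulated failure while saving");
        }
    }

    // The caller owns the stream, so the caller closes it in every case.
    static boolean saveAndClose(TrackingStream stream, byte[] data, boolean fail) {
        try (OutputStream out = stream) {
            write(out, data, fail);
        } catch (IOException e) {
            // Real code would log or rethrow; the sketch only checks closing.
        }
        return stream.closed;
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3};
        System.out.println("closed after success: "
                + saveAndClose(new TrackingStream(), data, false));
        System.out.println("closed after failure: "
                + saveAndClose(new TrackingStream(), data, true));
    }
}
```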

I haven't changed anything in org.apache.jmeter.reporters.ResultCollector; that
class is still using methods which I have marked as deprecated in SaveService.
I will wait and see if this patch suits you, and if it does, I might look into
making the ResultCollector use more code from SaveService, and also not use
the deprecated methods. But it seems a bit complex to change the
ResultCollector.
The ResultCollector is currently hardcoded to use UTF-8.

Note that a lot of unit tests in TestSaveService fail, because the "xml
header" is added to the start of the files, which means that the file length
grows a little bit. If this patch is accepted, the test files should be changed
to contain the "xml header", and then the unit tests will pass.

With this patch, JMeter saves files as UTF-8, which means that I can use UTF-8
characters as parameter values, for example by copying and pasting some Sanskrit
characters from http://www.columbia.edu/kermit/utf8.html
So I think this patch solves this bug.
Comment 3 Alf Hogemark 2007-04-22 08:38:58 UTC
(In reply to comment #2)

Have you had time to look at the patch?
I'm wondering if perhaps the "xml header" is added in a few places where it
shouldn't be added.

I think this is one of the last remaining issues before one could claim that
HTTP testing using "any" encoding is close to perfect. So it would be good to
get this bug solved before the next release of JMeter. With your help, I think
we should be able to solve it.
Comment 4 Horen Kirazyan 2007-04-22 10:09:13 UTC
I haven't been able to look at it yet, because I have plenty of work and no
time :( I hope I'll soon have a little free time to try it :)
Comment 5 Sebb 2007-04-24 13:58:17 UTC
I've added the code to SVN in r532077; it will be in any nightly builds later
than that.

Please report whether this solves the original problem.
Thanks.
Comment 6 Alf Hogemark 2007-04-30 06:09:50 UTC
Created attachment 20073
Consistent and improved closing of file streams

I have tested the code in svn r532077, and I am now able to put Cyrillic
characters into the test plan, save the test plan, quit JMeter, start JMeter,
load the test plan, and the characters look exactly as they did when I saved
the test plan.

So I think this bug can be marked as "fixed".

I am now attaching a patch which makes the closing of file streams more
consistent in JMeter, and also makes sure streams are closed in some cases
where they were not being closed.
Comment 7 Sebb 2007-04-30 15:14:31 UTC
Thanks for the streams patch - but in future it would be better to open a new 
Bugzilla issue for a new problem.