In my thread group, I use SOAP/XML-RPC requests against my server, and a Regular Expression Extractor to build the next request from the server response. This works fine when there are no special characters, but when the server response contains a UTF-8 special character, the extraction doesn't return the String I expect. For example, if the server response contains a String like "<City>Vézelay</City>", I would like the Regular Expression Extractor to return the String "Vézelay", but it gives me the String "Vézelay". In hexadecimal, the special character is C3 00 A9 00, and it becomes C3 B3 C2 A9 after extraction. It seems the character is being encoded a second time, in UTF-8 or in some other format. Note: bug 27032 (which is in the resolved state) has some similarities.
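The symptom in this report is consistent with UTF-8 bytes being decoded with a single-byte charset somewhere along the way. A minimal sketch of that mechanism follows, assuming ISO-8859-1 is the wrong charset involved (the report does not say which one is); the class and method names are hypothetical and are not part of JMeter.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Decode UTF-8 bytes with the wrong charset (ISO-8859-1 is an assumption),
    // the way a reader or String constructor with a mismatched charset would.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Render bytes as space-separated upper-case hex, for inspection.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String city = "V\u00e9zelay"; // "Vézelay", written with an escape to stay ASCII-safe
        String mangled = misdecode(city);
        // prints 56 C3 A9 7A 65 6C 61 79
        System.out.println(hex(city.getBytes(StandardCharsets.UTF_8)));
        // prints 56 C3 83 C2 A9 7A 65 6C 61 79
        System.out.println(hex(mangled.getBytes(StandardCharsets.UTF_8)));
    }
}
```

In this sketch the two-byte character turns into four bytes once the mangled String is written out as UTF-8 again, which is the kind of growth described in the report.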
Does the response get saved correctly if you configure a listener to "Save Response Data"? Or does this also cause the data to be mangled?
Looking at WebServiceSampler, it currently gets a BufferedReader from Apache SOAP, so it's unlikely the sampler is the problem.
    SOAPTransport st = msg.getSOAPTransport();
    RESULT.setDataType(SampleResult.TEXT);
    BufferedReader br = null;
    // check to see if SOAPTransport is not nul and receive is
    // also not null. hopefully this will improve the error
    // reporting. 5/13/05 peter lin
    if (st != null && st.receive() != null) {
        br = st.receive();
        if (this.getPropertyAsBoolean(READ_RESPONSE)) {
            StringBuffer buf = new StringBuffer();
            String line;
            while ((line = br.readLine()) != null) {
                buf.append(line);
            }
            RESULT.sampleEnd();
            // set the response
            RESULT.setResponseData(buf.toString().getBytes());
If Apache SOAP doesn't create a reader using the correct encoding, it "could" cause the problem you see. I don't know Apache SOAP well enough to say with any certainty that this is the case. It could also be a limitation of the assertion.
peter
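One detail in the excerpt is that getBytes() is called without a charset, so the bytes stored in the result depend on the platform default encoding. The sketch below shows the same read loop with an explicit charset instead. It is only an illustration under stated assumptions: the class and the readAll helper are hypothetical, a StringReader stands in for the SOAP transport, and nothing here is taken from the actual sampler.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitEncodingSketch {
    // Hypothetical helper: drain the reader line by line, as the excerpt does,
    // then encode with a caller-supplied charset rather than the platform default.
    static byte[] readAll(BufferedReader br, Charset charset) {
        StringBuilder buf = new StringBuilder();
        try {
            String line;
            while ((line = br.readLine()) != null) {
                buf.append(line); // like the excerpt, this drops line terminators
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toString().getBytes(charset);
    }

    public static void main(String[] args) {
        // A StringReader stands in for the transport's reader (assumption).
        BufferedReader br = new BufferedReader(new StringReader("<City>V\u00e9zelay</City>"));
        byte[] data = readAll(br, StandardCharsets.UTF_8);
        // 20 ASCII characters plus one two-byte character: prints 21
        System.out.println(data.length);
    }
}
```

Whichever charset is chosen here has to match the encoding later declared on the result, otherwise the consumer decodes the bytes incorrectly.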
(In reply to comment #1)
> Does the response get saved correctly if you configure a listener to "Save
> Response Data"?
>
> Or does this also cause the data to be mangled?
Yes, if I save the response data to a file, the data is not transformed. It is saved correctly.
(In reply to comment #2)
> If apache soap doesn't create a reader using the correct encoding, it "could"
> cause the problem you see. I don't know apache soap well enough to say with any
> certainty that is the case. It could also be a limitation of the assertion.
>
> peter
I don't think SOAP is using the wrong encoding for its reader. If it were, the server response I read in JMeter (when I write the response data to a file) would be badly encoded too, just like the String I get from the Regular Expression Extractor. Maybe there is a way to specify the String encoding when I use a Regular Expression Extractor.
It's possible you're right and the assertion isn't handling the encoding correctly. Perhaps sebb or mike will know better. JMeter uses oro-matcher, so it could be that we need to set the encoding? If I have time tonight I'll take a look at the oro-matcher API. peter
OK, I think I've finally found where the bug is. It's in the class RegexExtractor of the package org.apache.jmeter.extractor. In the process() method, a PatternMatcherInput is created from the response data of the previous result. At this point, the response data is converted to a String without specifying any encoding. To correct this bug, I've replaced the line
    input = new PatternMatcherInput(useHeaders()
            ? context.getPreviousResult().getResponseHeaders()
            : new String(context.getPreviousResult().getResponseData()));
by
    try {
        input = new PatternMatcherInput(useHeaders()
                ? context.getPreviousResult().getResponseHeaders()
                : new String(context.getPreviousResult().getResponseData(),
                        context.getPreviousResult().getDataEncoding()));
    } catch (UnsupportedEncodingException e2) {
        input = new PatternMatcherInput(useHeaders()
                ? context.getPreviousResult().getResponseHeaders()
                : new String(context.getPreviousResult().getResponseData()));
    }
I don't know if it's THE right way to correct it, but now the Strings I extract are encoded correctly. I'm not familiar with how source code changes are made on a Jakarta project. Can someone do this? Am I authorized to do it myself?
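The idea behind this change can be tried outside JMeter. The following is a minimal sketch, not the committed fix: it uses java.util.regex in place of ORO's PatternMatcherInput, and the class, the method names and the null check on the encoding are assumptions added for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DecodeThenExtract {
    // Decode with the declared encoding; fall back to the platform default
    // when the encoding is missing or unsupported, as the patch above does.
    static String decode(byte[] data, String encoding) {
        if (encoding != null) { // null guard is an addition, not in the patch
            try {
                return new String(data, encoding);
            } catch (UnsupportedEncodingException e) {
                // fall through to the platform default below
            }
        }
        return new String(data);
    }

    // Hypothetical extractor: returns the first <City> value, or null if none.
    static String extractCity(byte[] data, String encoding) {
        Matcher m = Pattern.compile("<City>(.*?)</City>").matcher(decode(data, encoding));
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        byte[] body = "<City>V\u00e9zelay</City>".getBytes(StandardCharsets.UTF_8);
        // Decoded with the matching encoding, the city name survives intact.
        System.out.println("V\u00e9zelay".equals(extractCity(body, "UTF-8")));
        // Decoded with a mismatched encoding, the mangled form is extracted.
        System.out.println("V\u00c3\u00a9zelay".equals(extractCity(body, "ISO-8859-1")));
    }
}
```

Both lines print true, which suggests that the regular expression itself is not at fault: the result depends only on how the bytes were turned into a String before matching.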
Fixed in 2.1 branch code. Will be in 2.1.2
I am using JMeter 2.11 and I am seeing this same issue with the Regular Expression Extractor.
Could you tell us exactly what you have done, what you expected and what you have seen? Maybe your server is sending no encoding, or the wrong one?
Closed, as there was no feedback from the user and, looking at the current code, the encoding is used correctly.
This issue has been migrated to GitHub: https://github.com/apache/jmeter/issues/1615