Bug 24557

Summary: Request encoding doesn't work.
Product: Tomcat 4
Reporter: Kamil Nagrodzki <knagrodzki>
Component: Connector:Coyote HTTP/1.1
Assignee: Tomcat Developers Mailing List <dev>
Status: CLOSED INVALID    
Severity: major    
Priority: P3    
Version: 4.1.29   
Target Milestone: ---   
Hardware: All   
OS: All   

Description Kamil Nagrodzki 2003-11-10 11:00:14 UTC
Servlets using method
HttpServletRequest.setCharacterEncoding("iso-8859-2");
can't decode Polish characters.

The same servlets work on Tomcat 4.1.27.
Comment 1 Pierre Dittgen 2003-11-10 14:01:23 UTC
I have the same problem using Cocoon on Jakarta Tomcat 4.1.29: the charset in the HTTP
header is set to ISO-8859-1, so my UTF-8 pages are read as ISO-LATIN-1 by my
browser. It worked under Jakarta Tomcat 4.1.27.
Comment 2 Remy Maucherat 2003-11-10 14:14:18 UTC
There's no standard for HTTP header encoding, so any solution will not be
portable. If you want it to be relatively reliable, you should IMO URL encode
them (the usual %xx scheme).
Basically, setCharacterEncoding will only (reliably) apply to the POSTed
parameters. Tomcat 5.0 has a connector global setting to configure URI encoding
(after %xx decoding, which is obviously still mandatory).
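
As a rough illustration of the %xx scheme mentioned above, here is a minimal sketch using the JDK's java.net.URLEncoder and java.net.URLDecoder. The class name PercentEncodingDemo and its helper methods are made up for this example and are not part of Tomcat. It shows that the same %xx sequence only decodes back to the original text when the decoder uses the charset the encoder used.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class; not part of Tomcat or the servlet API.
public class PercentEncodingDemo {

    // Percent-encode s using the bytes of the named charset.
    public static String encode(String s, String charset) {
        try {
            return URLEncoder.encode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    // Reverse the %xx escapes and interpret the bytes in the named charset.
    public static String decode(String s, String charset) {
        try {
            return URLDecoder.decode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // Unicode escapes for the Polish word used elsewhere in this report,
        // so the result does not depend on the source file encoding.
        String zolc = "\u017C\u00F3\u0142\u0107";

        String encoded = encode(zolc, "UTF-8");
        System.out.println(encoded); // %C5%BC%C3%B3%C5%82%C4%87

        // Same charset on both sides: the round trip is lossless.
        System.out.println(decode(encoded, "UTF-8").equals(zolc)); // true

        // Decoding the UTF-8 bytes as ISO-8859-1 yields different characters.
        System.out.println(decode(encoded, "ISO-8859-1").equals(zolc)); // false
    }
}
```

The %xx escapes themselves are plain ASCII, so they survive transport; the open question is only which charset the server applies to the decoded bytes.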
Comment 3 Kamil Nagrodzki 2003-11-10 15:43:03 UTC
It doesn't work for forms and POST method!
Comment 4 Remy Maucherat 2003-11-15 07:46:14 UTC
*** Bug 24721 has been marked as a duplicate of this bug. ***
Comment 5 Remy Maucherat 2003-11-15 07:48:16 UTC
I have verified the proper encoding is used to decode the parameters, as long as
they are in the POST body.
Please do not reopen this report (or attach a ready to run test case
demonstrating the "issue").
Comment 6 Mirek Hankus 2003-11-18 09:42:24 UTC
But what about the GET method? It is not working, even though it is the recommended way
of doing things in the Servlet 2.3 spec.

http://developer.java.sun.com/developer/qow/archive/179/index.jsp

Here is a sample of code that worked fine in previous Tomcat versions, but not in 4.1.29:

        response.setContentType("text/html; charset=utf-8");
        PrintWriter out = response.getWriter();
        request.setCharacterEncoding("UTF-8");
        String zolc="żółć";
        out.println("Input string:"+zolc);
        out.println("<br>");
        out.println("Press <a href=\"?z="+URLEncoder.encode(zolc,"UTF-8")
            +"\">link to see how parameter is encoded</a><br>");
        out.println("Parameter z value is:");
        out.println(request.getParameter("z"));
        out.close();

Please contact me if you need to see how it works on 4.1.27 and 4.1.29 (I have
it installed on 2 servers).



Comment 7 Han Min 2003-11-26 11:37:48 UTC
The org.apache.coyote.tomcat4.CoyoteRequest class is missing the key code that
sets the encoding value before the call to
"parameters.handleQueryParameters()" in the method "parseRequestParameters()".
The method "handleQueryParameters()" in
org.apache.tomcat.util.http.Parameters uses the encoding value set by
"setQueryStringEncoding(String s)" to convert the byte array of the
request input stream into a normal string.

The corrected code is as follows:
 
    protected void parseRequestParameters() {
        requestParametersParsed = true;
        Parameters parameters = coyoteRequest.getParameters();
        String enc = coyoteRequest.getCharacterEncoding();
        if (enc != null) {
            parameters.setEncoding(enc);
            // Here set the query string encoding
            parameters.setQueryStringEncoding(enc);
        } else {
            parameters.setEncoding
                (org.apache.coyote.Constants.DEFAULT_CHARACTER_ENCODING);
            // Here set the query string encoding
            parameters.setQueryStringEncoding
                (org.apache.coyote.Constants.DEFAULT_CHARACTER_ENCODING);
        }

        parameters.handleQueryParameters();
        .........
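
To make the effect of the missing call easier to see outside of Tomcat, here is a self-contained sketch. QueryParamsSketch is a hypothetical stand-in and is not the real org.apache.tomcat.util.http.Parameters class; only the method names setQueryStringEncoding and handleQueryParameters are borrowed from the description above, and the ISO-8859-1 default is an assumption based on the servlet spec's default request encoding. The point is simply that the query string is decoded with whatever encoding was set before parsing.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a parameter parser; not Tomcat's Parameters class.
public class QueryParamsSketch {

    // Assumed default: ISO-8859-1, the servlet spec's default request encoding.
    private String queryStringEncoding = "ISO-8859-1";
    private final Map<String, String> values = new LinkedHashMap<String, String>();

    // Must be called before handleQueryParameters() to have any effect.
    public void setQueryStringEncoding(String enc) {
        this.queryStringEncoding = enc;
    }

    // Split the raw query string and decode names and values with the
    // encoding that is configured at the time of the call.
    public void handleQueryParameters(String rawQuery) {
        if (rawQuery == null || rawQuery.length() == 0) {
            return;
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            values.put(decode(name), decode(value));
        }
    }

    public String getParameter(String name) {
        return values.get(name);
    }

    private String decode(String s) {
        try {
            return URLDecoder.decode(s, queryStringEncoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(
                "Unsupported charset: " + queryStringEncoding, e);
        }
    }

    public static void main(String[] args) {
        String rawQuery = "z=%C5%BC%C3%B3%C5%82%C4%87"; // UTF-8 bytes, %xx-escaped
        String expected = "\u017C\u00F3\u0142\u0107";

        // Encoding set before parsing: the parameter decodes correctly.
        QueryParamsSketch withEncoding = new QueryParamsSketch();
        withEncoding.setQueryStringEncoding("UTF-8");
        withEncoding.handleQueryParameters(rawQuery);
        System.out.println(expected.equals(withEncoding.getParameter("z"))); // true

        // Encoding never set: the default is used and the value is garbled.
        QueryParamsSketch withoutEncoding = new QueryParamsSketch();
        withoutEncoding.handleQueryParameters(rawQuery);
        System.out.println(expected.equals(withoutEncoding.getParameter("z"))); // false
    }
}
```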