The summary states the problem: it does not work well with Cyrillic. The fix is in ParamSupport.java (common/core), line 123:

    try {
        if (encode) {
            parent.addParameter(
                URLEncoder.encode(name, pageContext.getResponse().getCharacterEncoding()),
                URLEncoder.encode(value, pageContext.getResponse().getCharacterEncoding()));
        } else
            parent.addParameter(name, value);
    } catch (java.io.UnsupportedEncodingException e) {
        throw new JspException(e.toString());
    }
Thanks for the bug report. The fix is more elaborate than the one suggested because URLEncoder.encode(String, String) was only introduced in J2SE 1.4, and JSTL 1.0 must also run on earlier releases of J2SE.
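One way to support both runtimes is to look up the two-argument method reflectively and fall back to the one-argument form when it is missing. The sketch below only illustrates that idea and is not the code that was committed; the class name CompatUrlEncoder and its methods are hypothetical, and it uses modern Java syntax (varargs reflection) that would need adjusting for an actual J2SE 1.3 build.

```java
import java.lang.reflect.Method;
import java.net.URLEncoder;

// Hypothetical helper, not the JSTL source: prefers the two-argument
// URLEncoder.encode (available since J2SE 1.4) and otherwise falls back
// to the one-argument form, which uses the platform default encoding.
class CompatUrlEncoder {
    private static final Method ENCODE_WITH_CHARSET = lookup();

    private static Method lookup() {
        try {
            return URLEncoder.class.getMethod("encode", String.class, String.class);
        } catch (NoSuchMethodException e) {
            return null; // running on a runtime older than J2SE 1.4
        }
    }

    // True when the charset-aware overload was found at class load time.
    static boolean charsetAware() {
        return ENCODE_WITH_CHARSET != null;
    }

    @SuppressWarnings("deprecation")
    static String encode(String s, String enc) {
        if (ENCODE_WITH_CHARSET != null) {
            try {
                return (String) ENCODE_WITH_CHARSET.invoke(null, s, enc);
            } catch (Exception e) {
                throw new IllegalArgumentException("cannot encode with " + enc + ": " + e);
            }
        }
        // Old behavior: the requested encoding is ignored on this path.
        return URLEncoder.encode(s);
    }
}
```

On a pre-1.4 runtime this sketch silently ignores the requested encoding, so non-ASCII parameters would still be encoded with the platform default there.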
From my understanding of the documentation for URLEncoder and the implementation notes of the HTML spec (http://www.w3.org/TR/html40/appendix/notes.html#non-ascii-chars), the parameter name and value should be encoded using UTF-8, not using the document encoding as the current fix does. Looking at the Tomcat 4.x sources (http://cvs.apache.org/viewcvs.cgi/jakarta-tomcat-4.0/catalina/src/share/org/apache/catalina/connector/HttpRequestBase.java), it seems it uses the document encoding for parameter decoding, but shouldn't there at least be an option to specify the parameter encoding to use?
That would be more correct once Tomcat parses such URLs correctly. For now, consider the following test (the file is test3.jsp):

    <c:out value="${param.param}"/>
    <p><a href='test3.jsp?param=<%= java.net.URLEncoder.encode("Привет", "UTF-8") %>'>Click</a>

It produces the URL test3.jsp?param=%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82, which Mozilla parses correctly (it is displayed correctly in the status bar) but Tomcat does not: c:out displays а�б�аИаВаЕб� instead of Привет. Note that test3.jsp also contains the following statements, which let me use Cyrillic correctly:

    <%@ page pageEncoding="windows-1251" %>
    <% if (request.getCharacterEncoding() == null) request.setCharacterEncoding(response.getCharacterEncoding()); %>
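The mismatch can be reproduced outside a container with plain java.net classes. The sketch below is only an illustration: the class name QueryEncodingDemo and the roundTrip helper are made up for this example, it assumes the windows-1251 charset is available in the JVM, and it does not claim to show which charset Tomcat actually applied. It encodes the value with one charset and decodes it with another; the value survives only when the two agree.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class: shows that a query parameter only round-trips
// when the encoder and the decoder use the same character encoding.
class QueryEncodingDemo {
    // The test word from the comment above, written with Unicode escapes
    // so the source file itself stays ASCII.
    static final String HELLO = "\u041F\u0440\u0438\u0432\u0435\u0442";

    static String encode(String value, String enc) {
        try {
            return URLEncoder.encode(value, enc);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unsupported encoding: " + enc);
        }
    }

    static String roundTrip(String value, String encodeWith, String decodeWith) {
        try {
            return URLDecoder.decode(encode(value, encodeWith), decodeWith);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unsupported encoding: " + decodeWith);
        }
    }

    public static void main(String[] args) {
        // Same query string as in the test page.
        System.out.println(encode(HELLO, "UTF-8"));
        // Matching charsets: the original value comes back.
        System.out.println(roundTrip(HELLO, "UTF-8", "UTF-8").equals(HELLO));
        // Mismatched charsets: the value is garbled.
        System.out.println(roundTrip(HELLO, "UTF-8", "windows-1251").equals(HELLO));
    }
}
```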
The way you fixed this bug will work on JDK 1.4 but will fall back to the old behavior on JDK 1.3, which in fact means the bug is not fixed. It would be better to include your own urlEncode implementation in JSTL and use it on JDK 1.3. The simplest way is to use the URLEncoder.encode source from 1.4.
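A self-contained encoder along these lines needs only String.getBytes(String), which predates J2SE 1.4. The following is a minimal sketch of the idea rather than the JDK or JSTL source; the class name SimpleUrlEncoder is hypothetical, and it is written with StringBuilder and a for-each loop for brevity, which an actual JDK 1.3 build would have to replace with StringBuffer and an indexed loop.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical stand-alone encoder for application/x-www-form-urlencoded
// values that takes the character encoding as a parameter.
class SimpleUrlEncoder {
    private static final String HEX = "0123456789ABCDEF";

    // Characters passed through unchanged (same set java.net.URLEncoder keeps).
    private static boolean isSafe(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9')
                || c == '-' || c == '_' || c == '.' || c == '*';
    }

    static String encode(String s, String enc) {
        StringBuilder out = new StringBuilder(s.length() * 3);
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (isSafe(c)) {
                out.append(c);
                i++;
            } else if (c == ' ') {
                out.append('+');
                i++;
            } else {
                // Collect the whole run of unsafe characters so that
                // surrogate pairs are converted to bytes together.
                int start = i;
                while (i < s.length() && !isSafe(s.charAt(i)) && s.charAt(i) != ' ') {
                    i++;
                }
                byte[] bytes;
                try {
                    bytes = s.substring(start, i).getBytes(enc);
                } catch (UnsupportedEncodingException e) {
                    throw new IllegalArgumentException("unsupported encoding: " + enc);
                }
                for (byte b : bytes) {
                    out.append('%')
                       .append(HEX.charAt((b >> 4) & 0xF))
                       .append(HEX.charAt(b & 0xF));
                }
            }
        }
        return out.toString();
    }
}
```

For example, SimpleUrlEncoder.encode("a b&c=d", "UTF-8") yields a+b%26c%3Dd, and the same Cyrillic value produces different escapes depending on whether UTF-8 or windows-1251 is passed as enc.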
*** Bug 19477 has been marked as a duplicate of this bug. ***
Stefan is right about the HTML spec. However, this part of the HTML spec was apparently produced too late to have an impact on reality. Browsers generally encode the query string using the character encoding of the page containing the form. Moreover, the JSP 2.0 spec also adopts this convention for internally generated query strings. It therefore seems wise to follow suit with what everyone else is doing. I've updated the code to do the encoding as follows: Util.URLEncode(name, enc) where the URLEncode method has been lifted from the Jasper2 source code (we now use the same code for both J2SE 1.3 and J2SE 1.4), and where enc is 'pageContext.getResponse().getCharacterEncoding()'.