Bug 28875

Summary: Multi-byte characters in default error page aren't printed out correctly.
Product: Tomcat 5 Reporter: Kan Ogawa <super-creek>
Component: CatalinaAssignee: Tomcat Developers Mailing List <dev>
Status: CLOSED FIXED    
Severity: normal CC: bien, super-creek
Priority: P1    
Version: Nightly Build   
Target Milestone: ---   
Hardware: All   
OS: All   
Attachments: The proposed patch to fix this bug in ErrorReportValve.
The ErrorReportValve class file that applied the patch.
A test case that returns HTTP 404 status simply, without the first patch.
A test case that returns HTTP 404 status simply, with the first patch.
The re-written patch to fix this problem in ErrorReportValve.

Description Kan Ogawa 2004-05-10 15:25:18 UTC
Hi, 

When working the latest Tomcat 5, any multi-byte characters in the default 
error page using ErrorReportValve are not printed out correctly.

As a result of detail investigation, I found that an error report had 
been written to the writer of response which didn't applied the 
suitable character encoding in ErrorReportValve.

So, I suggest a solution to ensure that every multi-byte character 
is normally printed out in this page, and send the following patch. 


Index: jakarta-tomcat-
catalina/catalina/src/share/org/apache/catalina/valves/ErrorReportValve.java
===================================================================
RCS file: /home/cvspublic/jakarta-tomcat-
catalina/catalina/src/share/org/apache/catalina/valves/ErrorReportValve.java,v
retrieving revision 1.16
diff -u -w -r1.16 ErrorReportValve.java
--- jakarta-tomcat-
catalina/catalina/src/share/org/apache/catalina/valves/ErrorReportValve.java
	8 May 2004 22:33:22 -0000	1.16
+++ jakarta-tomcat-
catalina/catalina/src/share/org/apache/catalina/valves/ErrorReportValve.java
	10 May 2004 10:40:52 -0000
@@ -308,19 +308,16 @@
 
         try {
 
-            Writer writer = response.getReporter();
-
-            if (writer != null) {
-
-                Locale locale = Locale.getDefault();
-
                 try {
-                    hres.setContentType("text/html");
-                    hres.setLocale(locale);
+                hres.setContentType("text/html; charset=utf-8");
                 } catch (Throwable t) {
                     if (debug >= 1)
                         log("status.setContentType", t);
                 }
+
+            Writer writer = response.getReporter();
+
+            if (writer != null) {
 
                 // If writer is null, it's an indication that the response has
                 // been hard committed already, which should never happen


I hope that this patch will be applied before next release.

Regards,
Comment 1 Kan Ogawa 2004-05-10 15:29:08 UTC
Created attachment 11496 [details]
The proposed patch to fix this bug in ErrorReportValve.
Comment 2 Kan Ogawa 2004-05-10 15:34:53 UTC
Created attachment 11497 [details]
The ErrorReportValve class file that applied the patch.
Comment 3 Jan Luehe 2004-05-10 19:38:20 UTC
Kan,

I thought revision 1.11 of CoyoteResponse.java already fixed this issue.

CVS log:

revision 1.11
date: 2004/01/20 19:39:00;  author: luehe;  state: Exp;  lines: +8 -5
Fix for Bugtraq 4655010: Method sendError() of class HttpServletResponse
                         does not send multi byte data

Use response encoding when generating error report, so that the following
code works as expected:

      response.setCharacterEncoding(<charset>);
      response.sendError(<errorCode>, <message>);


Here are the diffs for that revision:

Index: CoyoteResponse.java
===================================================================
RCS file:
/home/cvs/jakarta-tomcat-catalina/catalina/src/share/org/apache/coyote/tomcat5/CoyoteResponse.java,v
retrieving revision 1.10
retrieving revision 1.11
diff -u -r1.10 -r1.11
--- CoyoteResponse.java 15 Oct 2003 18:47:49 -0000      1.10
+++ CoyoteResponse.java 20 Jan 2004 19:39:00 -0000      1.11
@@ -1,7 +1,7 @@
 /*
- * $Header:
/home/cvs/jakarta-tomcat-catalina/catalina/src/share/org/apache/coyote/tomcat5/CoyoteResponse.java,v
1.10 2003/10/15 18:47:49 jfarcand Exp $
- * $Revision: 1.10 $
- * $Date: 2003/10/15 18:47:49 $
+ * $Header:
/home/cvs/jakarta-tomcat-catalina/catalina/src/share/org/apache/coyote/tomcat5/CoyoteResponse.java,v
1.11 2004/01/20 19:39:00 luehe Exp $
+ * $Revision: 1.11 $
+ * $Date: 2004/01/20 19:39:00 $
  *
  * ====================================================================
  *
@@ -107,7 +107,7 @@
  *
  * @author Remy Maucherat
  * @author Craig R. McClanahan
- * @version $Revision: 1.10 $ $Date: 2003/10/15 18:47:49 $
+ * @version $Revision: 1.11 $ $Date: 2004/01/20 19:39:00 $
  */
 
 public class CoyoteResponse
@@ -529,9 +529,12 @@
      * unexpected exception thrown during the servlet processing
      * (and only in that case), null will be returned if the response stream
      * has already been used.
+     *
+     * @exception IOException if an input/output error occurs
      */
-    public PrintWriter getReporter() {
+    public PrintWriter getReporter() throws IOException {
         if (outputBuffer.isNew()) {
+            outputBuffer.checkConverter();
             return writer;
         } else {
             return null;


Do you have any test case to prove that this is still an issue? If so, please
attach.

Thanks,

Jan
Comment 4 Kan Ogawa 2004-05-11 01:22:32 UTC
Jan,

I attach screenshot of a test case that returns HTTP 404 status simply.
Until my first proposed patch is applied, this issue always occurs.


Regards,
Comment 5 Kan Ogawa 2004-05-11 01:31:54 UTC
Created attachment 11507 [details]
A test case that returns HTTP 404 status simply, without the first patch.
Comment 6 Kan Ogawa 2004-05-11 01:34:57 UTC
Created attachment 11508 [details]
A test case that returns HTTP 404 status simply, with the first patch.
Comment 7 Kan Ogawa 2004-07-20 07:38:02 UTC
This problem isn't yet solved in Tomcat 5.0.27 release.

It doesn't occur in 4.1.30 release, but always happens in 5.0.x release.
Any multi-byte character is not correctly printed into response reporter, 
independently of the response locale setting.

Therefore, it is impossible to read every multi-byte error messages which is 
displayed in default error page.
(And many Japanese users are in trouble very much.)

According to current ErrorReportValve source code, I have re-written my first 
proposed patch (see the following diff code).
Please apply it.

Index: jakarta-tomcat-
catalina/catalina/src/share/org/apache/catalina/valves/ErrorReportValve.java
===================================================================
RCS file: /home/cvspublic/jakarta-tomcat-
catalina/catalina/src/share/org/apache/catalina/valves/ErrorReportValve.java,v
retrieving revision 1.21
diff -u -w -r1.21 ErrorReportValve.java
--- jakarta-tomcat-
catalina/catalina/src/share/org/apache/catalina/valves/ErrorReportValve.java
	27 Jun 2004 23:56:23 -0000	1.21
+++ jakarta-tomcat-
catalina/catalina/src/share/org/apache/catalina/valves/ErrorReportValve.java
	17 Jul 2004 20:59:20 -0000
@@ -284,19 +284,17 @@
 
         try {
 
-            Writer writer = response.getReporter();
-
-            if (writer != null) {
-
-                Locale locale = Locale.getDefault();
-
                 try {
                 	response.setContentType("text/html");
-                	response.setLocale(locale);
+                response.setCharacterEncoding("utf-8");
                 } catch (Throwable t) {
                     if (container.getLogger().isDebugEnabled())
                         container.getLogger().debug("status.setContentType", 
t);
                 }
+
+            Writer writer = response.getReporter();
+
+            if (writer != null) {
 
                 // If writer is null, it's an indication that the response has
                 // been hard committed already, which should never happen
Comment 8 Kan Ogawa 2004-07-21 04:31:23 UTC
Created attachment 12178 [details]
The re-written patch to fix this problem in ErrorReportValve.
Comment 9 Moriyoshi Koizumi 2004-07-21 16:19:19 UTC
I haven't been looking into this problem much and maybe I'm wrong somewhat.

Regarding the latest patch, I'm wondering if there is any reason to drop setLocale()
from that portion.

Invocation of either setCharacterEncoding() or setContentType() that precedes setLocale()
seems to prevent setLocale() from changing the character encoding for the request.
Comment 10 Kan Ogawa 2004-07-28 05:18:07 UTC
Thanks, Moriyoshi.

Pondering carefully, the root cause is only that response.getReporter() 
method has been invoked before content type and locale setter methods of 
response are called.
Once the response.getReporter() method is invoked, the setLocale() method and 
the setCharacterEncoding() method of the response will process nothing.

Therefore, I think that so far my submitted patches are basically no wrong.
When the response.getReporter() method is invoked after setting content type 
and 
appropriate character encoding into this response, this issue surely will be 
solved.


Regards,

Kan Ogawa
Comment 11 Yoav Shapira 2004-07-28 15:02:08 UTC
I've applied the re-written patch.  I think it looks OK, but I haven't re-built 
and re-run your test cases yet.  If you could do that when you get a chance, 
that'd be great.  Thanks for contributing the patch.
Comment 12 Kan Ogawa 2004-07-30 02:24:01 UTC
Yoav, 

I downloaded Tomcat 5 nightly distribution (2004-07-28 Build), and have proved 
that this problem was fixed in Japanese language environment.

Thanks,
Kan Ogawa
Comment 13 Yoav Shapira 2004-08-09 12:57:34 UTC
Good, thanks for checking.  I'm verifying this issue.
Comment 14 Yoav Shapira 2004-08-28 13:00:30 UTC
*** Bug 30691 has been marked as a duplicate of this bug. ***