I have the following text in a JSP with no encoding declared:
Copyright © 2003 by
The copyright symbol does not come out correctly. I checked the file, and the stored character code is 0xA9, which is correct for ISO-8859-1. I checked the JSP 2.0 spec, and the default encoding is still supposed to be ISO-8859-1. Yet the result is:
Copyright Â© 2003 by
This can be easily reproduced by creating an empty web application and putting just "Copyright © 2003 by" into a JSP file. Looking at the generated Java file, I see:
out.write("Copyright Â© 2003 by\r\n");
The reason for the seemingly extra character in out.write() is that the Java file Jasper generates defaults to UTF-8 encoding (the ISO-8859-1 byte 0xA9 becomes 0xC2 0xA9 in UTF-8). However, you cannot just set the Jasper option javaEncoding to ISO-8859-1, because you would then also need to set a javac option to use the same encoding, and currently you cannot change the javac source encoding. It looks like the only way for this to work is to use UTF-8 for both the page and the response. :(
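For anyone following along, here is a small standalone sketch of the byte arithmetic described above. It is not Jasper code; the class and method names (CopyrightEncodingDemo, toHex, misdecode) are made up for illustration, and it assumes a JDK that has java.nio.charset.StandardCharsets.

```java
import java.nio.charset.StandardCharsets;

public class CopyrightEncodingDemo {
    // Hypothetical helper: render bytes as lowercase hex, e.g. {0xC2, 0xA9} -> "c2a9".
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Encode the copyright sign as UTF-8, then (wrongly) decode those bytes
    // as ISO-8859-1. Each byte becomes its own character.
    static String misdecode() {
        byte[] utf8 = "\u00a9".getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String copyright = "\u00a9";
        // One byte in ISO-8859-1, two bytes in UTF-8.
        System.out.println(toHex(copyright.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println(toHex(copyright.getBytes(StandardCharsets.UTF_8)));
        // The misdecoded string has two characters: U+00C2 followed by U+00A9.
        String wrong = misdecode();
        for (int i = 0; i < wrong.length(); i++) {
            System.out.println(String.format("U+%04X", (int) wrong.charAt(i)));
        }
    }
}
```

The two-character result (U+00C2, U+00A9) is what shows up when the UTF-8 bytes are read back as ISO-8859-1.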
On second thought, what I said wasn't entirely correct. :( Jasper is correct to output the bytes 0xC2 0xA9, because the generated Java file uses a different encoding (UTF-8) from the page. However, during compilation these bytes should be translated back to the original 0xA9, and everything should be OK. The problem seems to be in the JDT compiler or its setup code. As a workaround, use Ant compilation with the JDK 1.4 javac.
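A rough model of why the compiler's source encoding matters, as a sketch only: it does not call JDT or javac, it just decodes the generated source bytes with a given charset (standing in for the compiler reading the .java file) and re-encodes the resulting string as ISO-8859-1 (standing in for the response). The names SourceEncodingRoundTrip, respond, and toHex are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class SourceEncodingRoundTrip {
    // Hypothetical helper: render bytes as lowercase hex.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Simplified stand-in for "compile, then write the response":
    // decode the source bytes the way the compiler would read them,
    // then encode the string literal as ISO-8859-1 for output.
    static byte[] respond(byte[] generatedSource, Charset compilerCharset) {
        String literal = new String(generatedSource, compilerCharset);
        return literal.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The generated Java file stores the copyright sign as UTF-8 (0xC2 0xA9).
        byte[] generated = "\u00a9".getBytes(StandardCharsets.UTF_8);

        // Compiler reads the file as UTF-8: the response carries the single byte 0xA9.
        System.out.println(toHex(respond(generated, StandardCharsets.UTF_8)));

        // Compiler reads the file as ISO-8859-1: both bytes survive into the response.
        System.out.println(toHex(respond(generated, StandardCharsets.ISO_8859_1)));
    }
}
```

Under this model, the correct output depends only on the compiler reading the generated file with the same encoding Jasper wrote it in.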
I have a fix for this that I will commit very shortly. I came across it while looking at another bug.
Patch committed and verified.