Bug 53353 - Malformed contentType attribute results in two charset values
Alias: None
Product: Tomcat 7
Classification: Unclassified
Component: Catalina
Version: 7.0.27
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Tomcat Developers Mailing List
Depends on:
Reported: 2012-06-03 17:55 UTC by Konstantin Kolinko
Modified: 2012-06-03 20:05 UTC


Description Konstantin Kolinko 2012-06-03 17:55:37 UTC
If the contentType attribute of a JSP page has a malformed value, Tomcat 7 can send two charset values in the Content-Type header.

To reproduce:
1. Create this simple JSP file, ROOT/test.jsp
Note that there is a typo: "UTF-8" instead of "charset=UTF-8". This is what triggers the issue.

<%@page pageEncoding="UTF-8" contentType="text/html; UTF-8" %>
Hello world!

2. Start Tomcat and access the page with Firefox

(I am using version 12 with the Live HTTP Headers add-on.) When the page loads, right-click -> Page Info and look for the value of Encoding. Then look for the value of the Content-Type header.

With current Tomcat 6.0:
Encoding: UTF-8
Content-Type header: text/html; UTF-8;charset=UTF-8

With current Tomcat 7.0 (7.0.23):
Encoding: ISO-8859-1
Content-Type header: text/html; UTF-8;charset=UTF-8;charset=ISO-8859-1


I think it is related to the new Content-Type header parser (fix for bug 52811: r1300154 + r1300155 + r1304275 + r1304895).

It is not a very convincing example, but it looks like it confirms the fears about backporting the fix for bug 52811 to 6.0.
Comment 1 Konstantin Kolinko 2012-06-03 18:03:42 UTC
> With current Tomcat 7.0 (7.0.23)

I meant 7.0.27. The issue is reproducible with 7.0.27 release and with current trunk of 7.0.x.
Comment 2 Mark Thomas 2012-06-03 19:29:24 UTC
At one level, this is just a case of "garbage in, garbage out", with current 7.0.x producing different garbage than 6.0.x for the same input. Granted, the 7.0.x garbage is likely to cause more problems for clients.

Digging a little deeper, it appears that Jasper is making the same error as the root cause of bug 52811, namely using contentType.indexOf("charset=") < 0. That is probably more forgivable in Jasper than it was in Tomcat.
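
To illustrate why a substring test like the one quoted above is fragile, here is a minimal sketch. It is not Jasper's actual code: the class name, the method name and the plain string concatenation are assumptions made for illustration. Only the `indexOf("charset=") < 0` condition comes from this report.

```java
// Hypothetical helper, not taken from Jasper or Tomcat.
final class CharsetSubstringCheck {

    /** Appends a charset parameter unless the literal text "charset=" already occurs. */
    static String addCharsetIfAbsent(String contentType, String charset) {
        if (contentType.indexOf("charset=") < 0) {
            // Assumption: the charset is appended by simple concatenation.
            return contentType + ";charset=" + charset;
        }
        return contentType;
    }

    public static void main(String[] args) {
        // The malformed value from the report: the stray "UTF-8" token is kept
        // and a charset parameter is appended after it.
        System.out.println(addCharsetIfAbsent("text/html; UTF-8", "UTF-8"));

        // Parameter names are case-insensitive, but indexOf is not, so a valid
        // "Charset=" parameter is missed and a second charset is appended.
        System.out.println(addCharsetIfAbsent("text/html; Charset=UTF-8", "ISO-8859-1"));
    }
}
```

With the malformed value from the description, this sketch yields `text/html; UTF-8;charset=UTF-8`, the same header reported for Tomcat 6.0.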

I'll see if I can configure the parser to handle parameters of the form "name" rather than "name=value". That should make Tomcat a little more robust against this sort of input. Since the input is invalid, the specs don't say how this should be handled so we have a little latitude here.
Comment 3 Mark Thomas 2012-06-03 20:05:06 UTC
Fixed in trunk and 7.0.x and will be included in 7.0.28 onwards.

Invalid parameters in the Content-Type header value will now be ignored. With 7.0.x, the resulting header for the input above is:

Content-Type header: text/html;charset=UTF-8
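
The "ignore invalid parameters" behaviour can be sketched roughly as follows. This is not the parser committed to Tomcat: the class and method names are hypothetical, and the sketch assumes that a parameter counts as invalid when it lacks a non-empty name and value around "=". It also splits naively on ";", so quoted values containing ";" are not handled.

```java
// Hypothetical sketch, not Tomcat's actual Content-Type parser.
final class LenientContentType {

    /** Rebuilds a Content-Type value, dropping parameters that are not of the form name=value. */
    static String normalize(String header) {
        // Limit -1 keeps trailing empty strings, so parts[0] always exists.
        String[] parts = header.split(";", -1);
        StringBuilder out = new StringBuilder(parts[0].trim());
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            // Skip "name", "=value", "name=" and empty segments.
            if (eq <= 0 || eq == param.length() - 1) {
                continue;
            }
            String name = param.substring(0, eq).trim();
            String value = param.substring(eq + 1).trim();
            out.append(';').append(name).append('=').append(value);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The header Jasper produced for the malformed page in this report.
        System.out.println(normalize("text/html; UTF-8;charset=UTF-8"));
    }
}
```

For the header from the description, `text/html; UTF-8;charset=UTF-8`, the sketch returns `text/html;charset=UTF-8`, matching the result shown above.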