Bug 52811 - HttpServletResponse.setContentType() parses the content type incorrectly
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 6
Classification: Unclassified
Component: Servlet & JSP API
Version: 6.0.29
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: default
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-03-02 12:56 UTC by Martin Havel
Modified: 2014-02-17 13:50 UTC
Description Martin Havel 2012-03-02 12:56:46 UTC
When building an HttpServletResponse, setContentType(type) is used to set both the content type and the character encoding.
If the type is, for example:

multipart/related;boundary=1_4F50BD36_CDF8C28;Start="<31671603.smil>";Type="application/smil;charset=UTF-8"

it is parsed and the content type is set to:

multipart/related;boundary=1_4F50BD36_CDF8C28;Start="<31671603.smil>";Type="application/smil

and character encoding to:

UTF-8

I believe this is incorrect. The content type should be:

multipart/related;boundary=1_4F50BD36_CDF8C28;Start="<31671603.smil>";Type="application/smil;charset=UTF-8"

and the character encoding should be set to the default (ISO-8859-1).
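
To illustrate the expected behavior, here is a minimal sketch of a quote-aware parameter scan. It is not Tomcat's parser; the class ContentTypeSketch and the methods splitTopLevel and topLevelCharset are hypothetical names made up for this example. The idea is that a ';' inside a double-quoted string does not start a new parameter, so the charset inside the quoted Type value is never treated as the response charset.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not the code used by Tomcat.
public class ContentTypeSketch {

    /** Splits on ';' characters that are not inside a double-quoted string. */
    static List<String> splitTopLevel(String value) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (inQuotes && c == '\\' && i + 1 < value.length()) {
                // A backslash escapes the next character inside a quoted string.
                current.append(c).append(value.charAt(++i));
            } else if (c == '"') {
                inQuotes = !inQuotes;
                current.append(c);
            } else if (c == ';' && !inQuotes) {
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }

    /** Returns the top-level charset parameter, or null if there is none. */
    static String topLevelCharset(String contentType) {
        List<String> parts = splitTopLevel(contentType);
        // parts.get(0) is the type/subtype, so parameters start at index 1.
        for (int i = 1; i < parts.size(); i++) {
            String param = parts.get(i).trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = param.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            if (name.equals("charset")) {
                String v = param.substring(eq + 1).trim();
                // Strip surrounding quotes from a quoted charset value.
                if (v.length() >= 2 && v.startsWith("\"") && v.endsWith("\"")) {
                    v = v.substring(1, v.length() - 1);
                }
                return v;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String reported = "multipart/related;boundary=1_4F50BD36_CDF8C28;"
                + "Start=\"<31671603.smil>\";Type=\"application/smil;charset=UTF-8\"";
        // No top-level charset here, so the default encoding should apply.
        System.out.println(topLevelCharset(reported));
        // An ordinary top-level charset parameter is still found.
        System.out.println(topLevelCharset("text/html; charset=UTF-8"));
    }
}
```

With this approach the reported value yields no charset at all, which matches the expectation that the content type stays intact and the encoding falls back to the default.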
Comment 1 Mark Thomas 2012-03-13 14:48:09 UTC
That was fun. Lots of sneaky edge cases parsing that little lot. Ended up implementing a new HTTP header parser.

Fixed in trunk and 7.0.x and will be included in 7.0.27 onwards.

I have proposed the fix for 6.0.x.
Comment 2 hari 2012-07-23 10:55:46 UTC
(In reply to comment #1)
> That was fun. Lots of sneaky edge cases parsing that little lot. Ended up
> implementing a new HTTP header parser.
> 
> Fixed in trunk and 7.0.x and will be included in 7.0.27 onwards.
> 
> I have proposed the fix for 6.0.x.

I have a similar issue and tested with the latest Tomcat 7 code. What I observed is:

1. When the content type ends with 'start-info="text/xml;charset=UTF-8"', the browser receives 'start-info="text/xml;charset=UTF-8";charset=ISO-8859-1'. That is, charset=ISO-8859-1 is appended to the content type.

2. When the content type ends with 'start-info="text/xml"; charset=UTF-8', the browser receives the same value unchanged. There is no problem with this case.

My question is:

In case 1, even though a charset (UTF-8) is present in the content type string, an extra charset (ISO-8859-1) is added. Is this the expected behavior? Is the given content type considered invalid/unknown?
Comment 3 Mark Thomas 2012-07-23 12:06:46 UTC
(In reply to comment #2)
> My question is:

Those are all questions for the users list. There is no Tomcat bug in the behavior you describe.
Comment 4 Mark Thomas 2012-10-01 08:41:59 UTC
The back-port proposal has been withdrawn due to a lack of support. A simpler solution is required.
Comment 5 Mark Thomas 2013-07-31 16:42:24 UTC
The original proposal used a parser generated by JavaCC. That has since been replaced in trunk and 7.0.x with a simpler implementation. I have proposed a back-port of that simpler implementation.
Comment 6 Mark Thomas 2013-12-07 21:19:37 UTC
The backport of the simpler implementation has been committed and will be included in 6.0.38 onwards.