Bug 49960 - HttpServletRequest.getCharacterEncoding does not break up the Content-Type header well.
Status: RESOLVED DUPLICATE of bug 42119
Alias: None
Product: Tomcat 5
Classification: Unclassified
Component: Catalina
Version: 5.5.23
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: Tomcat Developers Mailing List
Depends on:
Reported: 2010-09-20 06:07 UTC by Nige
Modified: 2010-09-22 05:00 UTC
CC List: 0 users

Description Nige 2010-09-20 06:07:58 UTC
This probably affects all versions, not just 5.5.23.

It comes down to the utility class org.apache.tomcat.util.http.ContentType

It has the code:

    // Basically return everything after ";charset="
    // If no charset specified, use the HTTP default (ASCII) character set.
    public static String getCharsetFromContentType(String type) {

That is too simplistic. It can't use everything after ";charset=". It *must* parse the value and stop at the next ";" if there is one.

Because if someone sets the Content-Type header in an XHR, it might BY NO FAULT OF THAT DEVELOPER end up as

Content-Type: multipart/form-data; charset=UTF-8; boundary=--------------------ext-ux-upload-boundary

That's taken directly from the Fiddler debugging tool. The browser (Firefox) INSERTED its encoding name (UTF-8 is mandated for XHRs) within the header which I specified. It did not append it, it INSERTED it.

Now, the character set ends up as "UTF-8; boundary=--------------------ext-ux-upload-boundary"
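
To make the difference concrete, here is a minimal sketch in plain Java. It is not the Tomcat source and not the actual fix; the class and method names (CharsetParam, naiveCharset, charsetParam) are made up for illustration. The first method mimics the "everything after charset=" behaviour described above; the second stops at the next ";" and also drops optional surrounding quotes.

```java
class CharsetParam {

    // Mimics the behaviour described in this report: return everything
    // after "charset=", including any parameters that follow it.
    public static String naiveCharset(String type) {
        if (type == null) {
            return null;
        }
        int idx = type.indexOf("charset=");
        if (idx == -1) {
            return null;
        }
        return type.substring(idx + "charset=".length()).trim();
    }

    // Hypothetical alternative: the charset value ends at the next ';'
    // (or at the end of the header if there is none).
    public static String charsetParam(String type) {
        if (type == null) {
            return null;
        }
        int idx = type.indexOf("charset=");
        if (idx == -1) {
            return null;
        }
        int start = idx + "charset=".length();
        int end = type.indexOf(';', start);
        String value = (end == -1)
                ? type.substring(start)
                : type.substring(start, end);
        value = value.trim();
        // Strip surrounding double quotes, e.g. charset="UTF-8"
        if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
            value = value.substring(1, value.length() - 1);
        }
        return value;
    }

    public static void main(String[] args) {
        String header = "multipart/form-data; charset=UTF-8; "
                + "boundary=--------------------ext-ux-upload-boundary";
        // prints "UTF-8; boundary=--------------------ext-ux-upload-boundary"
        System.out.println(naiveCharset(header));
        // prints "UTF-8"
        System.out.println(charsetParam(header));
    }
}
```

This is only a sketch: the lookup is case-sensitive and would also match "charset=" inside another parameter's value, so a real parser should split the header into parameters first.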
Comment 1 Tim Whittington 2010-09-22 05:00:53 UTC
This was fixed in bug 42119 (in 2007) - updating to a recent 5.5.x (or preferably to a 6.0.x) will resolve the issue.

*** This bug has been marked as a duplicate of bug 42119 ***