Bug 54248 - Retrieving content that contains a BOM via request.getReader() issue
Summary: Retrieving content that contains a BOM via request.getReader() issue
Alias: None
Product: Tomcat 6
Classification: Unclassified
Component: Catalina
Version: 6.0.32
Hardware: PC Linux
Importance: P2 normal
Target Milestone: default
Assignee: Tomcat Developers Mailing List
Depends on:
Reported: 2012-12-05 11:56 UTC by Dave Stewart
Modified: 2013-02-09 00:05 UTC
CC List: 0 users


Description Dave Stewart 2012-12-05 11:56:08 UTC
If a request body includes a BOM (in my case Content-Type: application/xml; charset=utf-16, with the body starting with the BOM FF FE) and it is read using the BufferedReader from getReader(), the first request serviced on an AJP thread works correctly: the content is decoded properly and only the characters are returned. However, for any subsequent request on the same thread whose body includes a BOM, the BOM is delivered to the application. From reviewing the Tomcat code, it appears that the recycle() method in B2CConverter only ensures that the socket's data has been completely flushed; the underlying InputStream is not reset (I don't know whether there is a way to do this without re-instantiating it), so the BOM of subsequent requests is not consumed. As a test, I added a call to reset() within recycle(), which re-instantiates the underlying InputStreams, and the problem went away.

I've temporarily worked around the issue in my application code by reading the body with request.getInputStream() and decoding it myself using the encoding returned by request.getCharacterEncoding().
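
The workaround described above might look roughly like the following. This is a minimal sketch, not the reporter's actual code: the class name RequestBodyDecoder and its two helper methods are hypothetical, and the fallback to ISO-8859-1 when no encoding is declared is an assumption based on the servlet default for request bodies. It relies on Java's "UTF-16" charset, which consumes a leading BOM when decoding and uses it to choose the byte order.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

// Hypothetical helper, for illustration only.
final class RequestBodyDecoder {

    // Reads the whole stream into memory (kept Java 6-8 friendly, no readAllBytes()).
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    // Decodes the body with a fresh decoder on every call, so no state
    // from an earlier request can leak into this one.
    static String decode(byte[] body, String encoding) {
        // Assumption: fall back to ISO-8859-1 when the request declares no charset.
        Charset charset = Charset.forName(encoding != null ? encoding : "ISO-8859-1");
        return new String(body, charset);
    }
}
```

Inside a servlet this would be called as RequestBodyDecoder.decode(RequestBodyDecoder.readAll(request.getInputStream()), request.getCharacterEncoding()). Note that if the declared encoding were "UTF-16LE" or "UTF-16BE" instead of "UTF-16", Java would not strip the BOM and it would appear as U+FEFF at the start of the string.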
Comment 1 Mark Thomas 2013-01-07 23:09:22 UTC
Thanks for the report.

The fix is to ensure that the decoder is reset between requests when the converter is recycled.

The fix has been applied to trunk and 7.0.x and will be included in 7.0.35 onwards.

The fix has been proposed for 6.0.x.
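
The effect of resetting the decoder on recycle can be illustrated with the standard java.nio API. This is a sketch under stated assumptions and not Tomcat's B2CConverter code: the class RecycledDecoderDemo and its methods are hypothetical, and it uses a java.nio CharsetDecoder directly to stand in for the per-thread converter. A UTF-16 decoder remembers the byte order it detected from the first BOM; if it is reused without reset(), a later BOM is treated as ordinary content.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a converter that is reused across requests.
final class RecycledDecoderDemo {

    // One decoder shared by all requests handled by this instance.
    private final CharsetDecoder decoder = StandardCharsets.UTF_16.newDecoder();

    // Decodes one request body, keeping whatever state the decoder already holds.
    String decodeBody(byte[] body) {
        // UTF-16 never produces more chars than input bytes, so this is large enough.
        CharBuffer out = CharBuffer.allocate(body.length);
        decoder.decode(ByteBuffer.wrap(body), out, true);
        out.flip();
        return out.toString();
    }

    // Called between requests: clears the remembered byte order so that
    // the next body's BOM is detected and consumed again.
    void recycle() {
        decoder.reset();
    }
}
```

With a body of FF FE followed by little-endian UTF-16 text, calling decodeBody twice without recycle() in between yields the text on the first call and the text prefixed with U+FEFF on the second, which matches the reported symptom. Calling recycle() between the two calls yields the plain text both times.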
Comment 2 Konstantin Kolinko 2013-02-09 00:05:49 UTC
Fixed in 6.0.x with r1444292; the fix will be included in 6.0.37 onwards.