Bug 54248

Summary: Retrieving content that contains a BOM via request.getReader() issue
Product: Tomcat 6
Reporter: Dave Stewart <davids>
Component: Catalina
Assignee: Tomcat Developers Mailing List <dev>
Severity: normal    
Priority: P2    
Version: 6.0.32   
Target Milestone: default   
Hardware: PC   
OS: Linux   

Description Dave Stewart 2012-12-05 11:56:08 UTC
If a request body includes a BOM (in my case Content-Type: application/xml; charset=utf-16, with the content starting with the BOM FF FE) and is read using the BufferedReader from getReader(), the first request serviced on the AJP thread works correctly: the content is decoded correctly and only the characters are returned. However, for any subsequent request on the same thread that includes a BOM, the BOM is delivered to the application. From reviewing the Tomcat code, it appears that the recycle() method in B2CConverter simply ensures the socket's data has been completely flushed; the underlying InputStream is not reset (I don't really know if there is a way to do this without re-instantiating it), so the BOM of subsequent requests is not consumed. I confirmed this by adding a call to reset() within the recycle() method, which re-instantiates the underlying InputStreams, and the problem resolved itself.
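
The behaviour described above can be reproduced outside Tomcat. The following is a minimal sketch using plain java.nio, not Tomcat's code; the class name StaleDecoderDemo and its helper are hypothetical. It assumes a single UTF-16 CharsetDecoder shared by two "requests" and never reset in between, which mimics a per-thread converter whose decoding state survives recycling.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; illustrates the mechanism only.
class StaleDecoderDemo {
    // BOM (FF FE) followed by "hi" encoded as UTF-16LE.
    static final byte[] BODY = {(byte) 0xFF, (byte) 0xFE, 'h', 0, 'i', 0};

    // Decodes one body with the given decoder, deliberately without
    // calling reset() first, so state from earlier calls is kept.
    static String decode(CharsetDecoder decoder, byte[] body) {
        CharBuffer out = CharBuffer.allocate(body.length);
        decoder.decode(ByteBuffer.wrap(body), out, false);
        out.flip();
        return out.toString();
    }

    public static void main(String[] args) {
        CharsetDecoder shared = StandardCharsets.UTF_16.newDecoder();
        String first = decode(shared, BODY);   // BOM consumed, byte order detected
        String second = decode(shared, BODY);  // byte order already known, BOM decoded as a character
        System.out.println(first.length());                 // 2
        System.out.println(second.length());                // 3
        System.out.println(second.charAt(0) == '\uFEFF');   // true
    }
}
```

The second call returns U+FEFF as its first character because the decoder only looks for a BOM while its byte order is still undetermined.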

I've temporarily worked around the issue in my application code by reading from request.getInputStream() and decoding the content myself using request.getCharacterEncoding().
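
A workaround along these lines might look like the sketch below. It is not the reporter's actual code: the class BodyReader and the method readBody are hypothetical, and the helper takes a plain InputStream and charset name so that it runs without a servlet container. The fallback to ISO-8859-1 when no encoding is given is an assumption based on the servlet default.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

// Hypothetical helper; decodes each body with a Reader created for that call.
class BodyReader {
    static String readBody(InputStream in, String encoding) throws IOException {
        // Assumption: fall back to ISO-8859-1 if the request declares no charset.
        String charset = (encoding != null) ? encoding : "ISO-8859-1";
        // A new InputStreamReader means a new decoder with no leftover state.
        Reader reader = new InputStreamReader(in, charset);
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] body = {(byte) 0xFF, (byte) 0xFE, 'h', 0, 'i', 0};
        // In a servlet this would be:
        // readBody(request.getInputStream(), request.getCharacterEncoding())
        System.out.println(readBody(new ByteArrayInputStream(body), "UTF-16")); // hi
        System.out.println(readBody(new ByteArrayInputStream(body), "UTF-16")); // hi
    }
}
```

Because the decoder is created per request, the UTF-16 BOM is consumed every time, regardless of which thread services the request.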
Comment 1 Mark Thomas 2013-01-07 23:09:22 UTC
Thanks for the report.

The fix is to ensure that the decoder is reset between requests when the converter is recycled.
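
The idea of resetting the decoder on recycle can be sketched as follows. This is an illustration under stated assumptions, not the committed Tomcat change: RecyclableConverter is a hypothetical stand-in for a per-thread converter, and only the standard CharsetDecoder.reset() call is relied upon.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;

// Hypothetical stand-in for a converter reused across requests;
// not Tomcat's B2CConverter.
class RecyclableConverter {
    private final CharsetDecoder decoder;

    RecyclableConverter(String charsetName) {
        this.decoder = Charset.forName(charsetName).newDecoder();
    }

    // Decodes one chunk of the current request body.
    // Sketch only: an incomplete trailing byte sequence is ignored.
    String convert(byte[] chunk) {
        CharBuffer out = CharBuffer.allocate(chunk.length);
        decoder.decode(ByteBuffer.wrap(chunk), out, false);
        out.flip();
        return out.toString();
    }

    // Called between requests: discards decoding state (including the
    // detected byte order) left over from the previous body.
    void recycle() {
        decoder.reset();
    }

    public static void main(String[] args) {
        byte[] body = {(byte) 0xFF, (byte) 0xFE, 'h', 0, 'i', 0};
        RecyclableConverter conv = new RecyclableConverter("UTF-16");
        System.out.println(conv.convert(body).length()); // 2
        conv.recycle();
        System.out.println(conv.convert(body).length()); // 2
    }
}
```

With the reset in place, the BOM of the second body is again treated as a byte order mark and is not passed on as a character.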

The fix has been applied to trunk and 7.0.x and will be included in 7.0.35 onwards.

The fix has been proposed for 6.0.x.
Comment 2 Konstantin Kolinko 2013-02-09 00:05:49 UTC
Fixed in 6.0.x with r1444292; the fix will be included in 6.0.37 onwards.