62405 – Add Rereadable Request Filter

Bug 62405 - Add Rereadable Request Filter

Summary: Add Rereadable Request Filter

Status:	NEW

Alias:	None

Product:	Tomcat 9
Classification:	Unclassified
Component:	Catalina (show other bugs)
Version:	unspecified
Hardware:	PC All

Importance:	P2 enhancement (vote)
Target Milestone:	-----
Assignee:	Tomcat Developers Mailing List

URL:
Keywords:

Depends on:
Blocks:

Reported:	2018-05-23 22:41 UTC by Igal Sapir
Modified:	2018-05-24 18:28 UTC (History)
CC List:	0 users

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Igal Sapir 2018-05-23 22:41:46 UTC

Many times Filters need to read the body of the Request in order to inspect it, e.g. a security filter that might inspect incoming request for XSS or SQL Injection values.

But if that filter is not written properly, inspecting the request by calling 
getInputStream() or getReader(), will put the Request in an illigal state for subsequent reads, and if the Servlet or any other filter in the chain will try to call getReader() again an IllegalStateException will be thrown:

From
https://docs.oracle.com/javaee/7/api/javax/servlet/ServletRequest.html#getInputStream--
> IllegalStateException - if the getReader() method has already been called for this request

https://docs.oracle.com/javaee/7/api/javax/servlet/ServletRequest.html#getReader--
> IllegalStateException - if getInputStream() method has been called on this request

I propose to add a general purpose, RereadableRequestFilter (working title), that will allow to re-read a request's body by caching it on the first read, and returning the value from cache on subsequent reads.

That way a Filter that need to inspect the Request can simply wrap it with the RereadableRequestFilter and not worry about those details.

I already have the code for such a filter which I've written a while back, so I can tweak it as needed and add it rather easily if there is no objection for this enhancement.

Comment 1 George Stanchev 2018-05-24 03:20:54 UTC

I am just curious, what happens if a bad actor decides to send a 10 gig request and the filter is engaged. Obviously you have to read the whole thing to memory in order to rewind it or you have a cap on how much you read from the socket?

Comment 2 Igal Sapir 2018-05-24 03:31:37 UTC

(In reply to George Stanchev from comment #1)
> I am just curious, what happens if a bad actor decides to send a 10 gig
> request and the filter is engaged. Obviously you have to read the whole
> thing to memory in order to rewind it or you have a cap on how much you read
> from the socket?

I don't have that part implemented, but it's possible to add configuration settings with a size limit that will throw an error, or even a size threshold that will use disk instead of memory.

Comment 3 Mark Thomas 2018-05-24 09:04:24 UTC

This has come up before. A few of the past threads:
http://tomcat.markmail.org/thread/fumpfuspt7a3nesz
http://tomcat.markmail.org/thread/tqig7tldxjrra3bh

It would be worth a more thorough search of the archives for other previous discussions.

The short version is doing this generically is non-trivial.

Comment 4 Igal Sapir 2018-05-24 16:42:11 UTC

(In reply to Mark Thomas from comment #3)
> This has come up before. A few of the past threads:
> http://tomcat.markmail.org/thread/fumpfuspt7a3nesz
> http://tomcat.markmail.org/thread/tqig7tldxjrra3bh

Going over these threads I see that the main issue is if the Filter wraps getReader() and getInputStream() but not getParameter() and getParameterMap().  From the 2nd link above:

>> Consider the following:
>> 
>> Tomcat provides request R.
>> Filter reads request body using R.getInputStream().
>> Filter caches request body.
>> Filter wraps request R to provide R', over-riding getInputStream() to
>> provide the cached body.
>> Filter passes R' to the application.
>> Application calls R'.getParameter()
>> R'.getParameter() calls R.getParameter()
>> Keep in mind at this point R has zero knowledge of R'.
>>
>> R calls getInputStream() to read request body but that InputStream has
>> already been read.
>> 
>> The problem is the wrapper, R'. Over-riding getInputStream() is not
>> enough. It needs to over-ride every method that may access that
>> InputStream. Which is non-trivial because it means re-implementing a lot
>> of functionality the container would normally provide for you out of the
>> box.

My implementation [1] does provide a wrapper for getParameter() #L95 and getParameterMap() #L111 in the Filter, so I believe that that issue is taken care of.

> It would be worth a more thorough search of the archives for other previous
> discussions.

Unfortunately the search capability of the archives is not that great.

[1] https://github.com/isapir/servlet-filter-utils/blob/master/src/main/java/net/twentyonesolutions/servlet/util/RereadableServletRequest.java#L95

Comment 5 Mark Thomas 2018-05-24 18:17:53 UTC

I usually head towards tomcat.markmail.org when I need to search the archives.

From memory, getPart() and getParts() also caused complications.

I don't think getTrailerFields() etc. was part of any of those discussions either. I haven't looked to see if it plays nicely with the buffering solution.

Comment 6 Igal Sapir 2018-05-24 18:28:11 UTC

(In reply to Mark Thomas from comment #5)
> I usually head towards tomcat.markmail.org when I need to search the
> archives.
Good to know.  I was looking at http://tomcat.apache.org/lists.html and under
archives I saw

>> Archives: at Apache, at MARC (searchable), at MarkMail, at Nabble 
>> and at Mail Archive
so I thought that only MARC is searchable and didn't bother to check the others.
MarkMail seems to have a better search than MARC.

I will update the lists.html page on the site.

> From memory, getPart() and getParts() also caused complications.
> 
> I don't think getTrailerFields() etc. was part of any of those discussions
> either. I haven't looked to see if it plays nicely with the buffering
> solution.
I will look into those and update this thread when I have more information.