Bug 65677 - InvalidMarkException in Http11InputBuffer.fill when socket read throws AsynchronousCloseException
Status: RESOLVED FIXED
Alias: None
Product: Tomcat 9
Classification: Unclassified
Component: Connectors
Version: 9.0.54
Hardware: PC
OS: Linux
Importance: P2 normal
Target Milestone: -----
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-11-10 10:39 UTC by Andreas Grub
Modified: 2021-11-25 12:11 UTC

Description Andreas Grub 2021-11-10 10:39:16 UTC
While reading a CoyoteInputStream during a chunked HTTP/1.1
POST request, I've encountered a java.nio.InvalidMarkException
under certain conditions. Here's the stack trace I observe:

java.nio.InvalidMarkException: null
	at java.nio.Buffer.reset(Unknown Source)
	at java.nio.ByteBuffer.reset(Unknown Source)
	at java.nio.ByteBuffer.reset(Unknown Source)
	at o.a.coyote.http11.Http11InputBuffer.fill(Http11InputBuffer.java:813)
	at o.a.coyote.http11.Http11InputBuffer.access$400(Http11InputBuffer.java:42)
	at o.a.c.h.Http11InputBuffer$SocketInputBuffer.doRead(Http11InputBuffer.java:1172)
	at o.a.c.h.filters.ChunkedInputFilter.readBytes(ChunkedInputFilter.java:310)
	at o.a.c.h.filters.ChunkedInputFilter.parseChunkHeader(ChunkedInputFilter.java:338)
	at o.a.c.h.filters.ChunkedInputFilter.doRead(ChunkedInputFilter.java:164)
	at o.a.coyote.http11.Http11InputBuffer.doRead(Http11InputBuffer.java:249)
	at org.apache.coyote.Request.doRead(Request.java:640)
	at o.a.catalina.connector.InputBuffer.realReadBytes(InputBuffer.java:317)
	at o.a.catalina.connector.InputBuffer.checkByteBufferEof(InputBuffer.java:600)
	at o.a.catalina.connector.InputBuffer.read(InputBuffer.java:340)
	at o.a.c.connector.CoyoteInputStream.read(CoyoteInputStream.java:132)

I currently assume that the thread reading the CoyoteInputStream is
interrupted because it runs inside a Resilience4j TimeLimiter and
the POST request takes too long. That in itself should be fine: I'd
expect the read call to fail with an InterruptedException or an
IOException in that case, but instead I get an InvalidMarkException.
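
For illustration, this is roughly the kind of setup I mean (a
minimal sketch, not our actual code; the names and wiring are
hypothetical):

import io.github.resilience4j.timelimiter.TimeLimiter;
import io.github.resilience4j.timelimiter.TimeLimiterConfig;

import java.io.InputStream;
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TimeLimitedRead {

    private final ExecutorService executor = Executors.newCachedThreadPool();

    private final TimeLimiter timeLimiter = TimeLimiter.of(TimeLimiterConfig.custom()
            .timeoutDuration(Duration.ofSeconds(30))
            // on timeout, future.cancel(true) interrupts the reading thread
            .cancelRunningFuture(true)
            .build());

    byte[] readWithTimeout(InputStream requestBody) throws Exception {
        Future<byte[]> future = executor.submit(requestBody::readAllBytes);
        // Throws TimeoutException after 30s and interrupts the thread
        // that is blocked inside CoyoteInputStream.read(...)
        return timeLimiter.executeFutureSupplier(() -> future);
    }
}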

The InvalidMarkException is thrown in the "finally" block of this
part of Http11InputBuffer.fill:

byteBuffer.mark();
try {
    if (byteBuffer.position() < byteBuffer.limit()) {
        byteBuffer.position(byteBuffer.limit());
    }
    byteBuffer.limit(byteBuffer.capacity());
    SocketWrapperBase<?> socketWrapper = this.wrapper;
    if (socketWrapper != null) {
        nRead = socketWrapper.read(block, byteBuffer);
    } else {
        throw new CloseNowException(sm.getString("iib.eof.error"));
    }
} finally {
    // Ensure that the buffer limit and position are returned to a
    // consistent "ready for read" state if an error occurs in the
    // above code block.
    byteBuffer.limit(byteBuffer.position()).reset();
}

I've instrumented that part with additional debug logging as
follows (in particular, it might be worthwhile for future Tomcat
versions to add at least the catch block, to make issues like this
easier to debug):

byteBuffer.mark();
if (log.isDebugEnabled()) {
    log.debug("Set mark at position " + byteBuffer.position());
}
try {
    if (byteBuffer.position() < byteBuffer.limit()) {
        if (log.isDebugEnabled()) {
            log.debug("Setting position to limit " + byteBuffer.limit());
        }
        byteBuffer.position(byteBuffer.limit());
    }
    byteBuffer.limit(byteBuffer.capacity());
    if (log.isDebugEnabled()) {
        log.debug("Position before read " + byteBuffer.position());
    }
    SocketWrapperBase<?> socketWrapper = this.wrapper;
    if (socketWrapper != null) {
        nRead = socketWrapper.read(block, byteBuffer);
    } else {
        throw new CloseNowException(sm.getString("iib.eof.error"));
    }
} catch (Throwable e) {
    if (log.isDebugEnabled()) {
        log.debug("Encountered exception in read during fill, position=" + byteBuffer.position(), e);
    }
    throw e;
} finally {
    // Ensure that the buffer limit and position are returned to a
    // consistent "ready for read" state if an error occurs in the
    // above code block.
    if (log.isDebugEnabled()) {
        log.debug("Calling limit on byteBuffer with position " + byteBuffer.position());
    }
    byteBuffer.limit(byteBuffer.position()).reset();
}

When the exception occurs, I see the following logs (by this point
the POST request had already been running for about 30 seconds,
until the TimeLimiter kicked in and interrupted the thread):

10:03:18.899	Before fill(): parsingHeader: [false], parsingRequestLine: [false], parsingRequestLinePhase: [0], parsingRequestLineStart: [0], byteBuffer.position(): [3313], byteBuffer.limit(): [3313], end: [2019]
10:03:18.899	Set mark at position 2019
10:03:18.899	Position before read 2019
10:03:18.942	Calling limit on byteBuffer with position 3306

10:03:18.942	Before fill(): parsingHeader: [false], parsingRequestLine: [false], parsingRequestLinePhase: [0], parsingRequestLineStart: [0], byteBuffer.position(): [3306], byteBuffer.limit(): [3306], end: [2019]
10:03:18.942	Set mark at position 2019
10:03:18.942	Position before read 2019
10:03:19.007	Calling limit on byteBuffer with position 3306
10:03:19.007	Received [500 <binary output>]

10:03:19.007	Before fill(): parsingHeader: [false], parsingRequestLine: [false], parsingRequestLinePhase: [0], parsingRequestLineStart: [0], byteBuffer.position(): [3306], byteBuffer.limit(): [3306], end: [2019]
10:03:19.007	Set mark at position 2019
10:03:19.007	Position before read 2019
10:03:19.062	Calling limit on byteBuffer with position 3306
10:03:19.062	Received [500 <binary output>]

10:03:19.063	Before fill(): parsingHeader: [false], parsingRequestLine: [false], parsingRequestLinePhase: [0], parsingRequestLineStart: [0], byteBuffer.position(): [3306], byteBuffer.limit(): [3306], end: [2019]
10:03:19.063	Set mark at position 2019
10:03:19.063	Position before read 2019
10:03:19.131	Calling limit on byteBuffer with position 3306

10:03:19.132	Before fill(): parsingHeader: [false], parsingRequestLine: [false], parsingRequestLinePhase: [0], parsingRequestLineStart: [0], byteBuffer.position(): [3306], byteBuffer.limit(): [3306], end: [2019]
10:03:19.132	Set mark at position 2019
10:03:19.132	Position before read 2019
10:03:19.244	Calling limit on byteBuffer with position 3306

10:03:19.245	Before fill(): parsingHeader: [false], parsingRequestLine: [false], parsingRequestLinePhase: [0], parsingRequestLineStart: [0], byteBuffer.position(): [3306], byteBuffer.limit(): [3306], end: [2019]
10:03:19.245	Set mark at position 2019
10:03:19.245	Position before read 2019
10:03:19.854	Encountered exception in read during fill, position=0
                j.n.c.AsynchronousCloseException: null
                    at s.n.c.UnixAsynchronousSocketChannelImpl.finishRead(Unknown Source)
                    at s.n.c.UnixAsynchronousSocketChannelImpl.finish(Unknown Source)
                    ... 2 frames excluded
                    at o.apache.tomcat.util.net.Nio2Channel.close(Nio2Channel.java:81)
                    at o.apache.tomcat.util.net.Nio2Channel.close(Nio2Channel.java:94)
                    at o.a.t.u.n.Nio2Endpoint$Nio2SocketWrapper.doClose(Nio2Endpoint.java:935)
                    at o.a.t.util.net.SocketWrapperBase.close(SocketWrapperBase.java:422)
                    at o.a.t.u.n.Nio2Endpoint$SocketProcessor.doRun(Nio2Endpoint.java:1685)
                    at o.a.t.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
                    at o.a.tomcat.util.net.AbstractEndpoint.processSocket(AbstractEndpoint.java:1171)
                    at o.a.t.u.n.Nio2Endpoint$Nio2SocketWrapper$2.completed(Nio2Endpoint.java:613)
                    at o.a.t.u.n.Nio2Endpoint$Nio2SocketWrapper$2.completed(Nio2Endpoint.java:591)
                    ... 3 frames excluded
                    at o.a.t.u.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191)
                    at o.a.t.u.t.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
                    at o.a.t.u.t.TaskThread$WrappingRunnable.run(TaskThread.java:61)
                    ... 1 frames excluded
10:03:19.854	Calling limit on byteBuffer with position 0

So the java.nio.channels.AsynchronousCloseException, presumably
thrown from socketWrapper.read, leaves the byteBuffer with
position=0. The finally block therefore calls limit(0), which
discards the mark (setting it to -1) because the mark (2019
according to the logs) is larger than the new limit. Here's the
relevant snippet of Buffer.limit:

public Buffer limit(int newLimit) {
    if (newLimit > capacity | newLimit < 0)
        throw createLimitException(newLimit);
    limit = newLimit;
    if (position > newLimit) position = newLimit;
    if (mark > newLimit) mark = -1;
    return this;
}

This in turn makes the subsequent reset() call on the byte buffer
fail, as the mark is no longer defined.
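
This failure mode is easy to reproduce in isolation with a plain
ByteBuffer (a minimal sketch; the positions mirror the logged
values):

import java.nio.ByteBuffer;

public class InvalidMarkDemo {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(8192);
        buf.position(2019);
        buf.mark();    // mark = 2019, as in "Set mark at position 2019"
        buf.limit(0);  // mark (2019) > new limit (0), so the mark is discarded
        buf.reset();   // throws java.nio.InvalidMarkException
    }
}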

I've also observed that the logs above are sometimes missing the
"Received" entry. I assume it is swallowed by my logging consumer,
either because the message is too large or because the raw bytes
make it choke. That shouldn't concern us here, though.

Now the question is: Is this really a bug in Tomcat? Do you agree
that the limit(0) call above should not happen? And is it plausible
that a thread interruption leads to the AsynchronousCloseException
observed above?
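
At least the second half of that chain is easy to demonstrate with
plain NIO2, independent of Tomcat: closing an
AsynchronousSocketChannel while a read is outstanding completes
that read with an AsynchronousCloseException (a minimal,
self-contained sketch):

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public class AsyncCloseDemo {
    public static void main(String[] args) throws Exception {
        // Loopback server that accepts but never sends, so the
        // client read below stays pending
        AsynchronousServerSocketChannel server = AsynchronousServerSocketChannel.open()
                .bind(new InetSocketAddress("127.0.0.1", 0));
        server.accept();

        AsynchronousSocketChannel client = AsynchronousSocketChannel.open();
        client.connect(server.getLocalAddress()).get();

        Future<Integer> pendingRead = client.read(ByteBuffer.allocate(16));
        client.close();  // close the channel while the read is still outstanding

        try {
            pendingRead.get();
        } catch (ExecutionException e) {
            // prints java.nio.channels.AsynchronousCloseException
            System.out.println("read completed with: " + e.getCause());
        }
        server.close();
    }
}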

I've tried to reproduce this locally, but so far I can only observe
it on a production system. I'd appreciate any hints on what could
be going on, or on how to reproduce this in a more isolated
environment.

Let me know if you need further information.
Comment 1 Andreas Grub 2021-11-10 11:44:38 UTC
This was observed on the following runtime (a dockerized Spring
Boot 2.5.6 application):

java -version
Picked up JAVA_TOOL_OPTIONS: -XX:MinRAMPercentage=75 -XX:MaxRAMPercentage=75 -XX:-OmitStackTraceInFastThrow -XX:+ExitOnOutOfMemoryError
openjdk version "11.0.13" 2021-10-19 LTS
OpenJDK Runtime Environment 21.10-(Zulu-11.52+13-linux_x64)-Microsoft-Azure-restricted (build 11.0.13+8-LTS)
OpenJDK 64-Bit Server VM 21.10-(Zulu-11.52+13-linux_x64)-Microsoft-Azure-restricted (build 11.0.13+8-LTS, mixed mode)
Comment 2 Mark Thomas 2021-11-25 12:11:49 UTC
Thanks for the report and the analysis. It is rather odd that the
position gets set to zero, but fixing it is relatively simple.

Fixed in:
- 10.1.x for 10.1.0-M8 onwards
- 10.0.x for 10.0.14 onwards
- 9.0.x for 9.0.56 onwards
- 8.5.x for 8.5.74 onwards
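
One defensive shape such a fix can take, based on the analysis
above (a sketch of the idea only, not necessarily the committed
change), is to track the start of the preserved data explicitly
instead of relying on the buffer mark surviving the read:

int mark = byteBuffer.position();
try {
    if (byteBuffer.position() < byteBuffer.limit()) {
        byteBuffer.position(byteBuffer.limit());
    }
    byteBuffer.limit(byteBuffer.capacity());
    SocketWrapperBase<?> socketWrapper = this.wrapper;
    if (socketWrapper != null) {
        nRead = socketWrapper.read(block, byteBuffer);
    } else {
        throw new CloseNowException(sm.getString("iib.eof.error"));
    }
} finally {
    if (byteBuffer.position() >= mark) {
        // A read (possibly of zero bytes) has occurred: expose the
        // bytes between the saved start and the current position
        byteBuffer.limit(byteBuffer.position());
        byteBuffer.position(mark);
    } else {
        // The buffer was left in an unexpected state (e.g. position
        // reset to zero by a failed read): report no data rather
        // than throw InvalidMarkException
        byteBuffer.position(0).limit(0);
    }
}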