Bug 61524 - NullPointerException in Http11OutputBuffer
Summary: NullPointerException in Http11OutputBuffer
Status: RESOLVED WORKSFORME
Alias: None
Product: Tomcat 8
Classification: Unclassified
Component: Catalina (show other bugs)
Version: 8.5.20
Hardware: PC Linux
: P2 major (vote)
Target Milestone: ----
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-09-15 12:06 UTC by Boris Petrov
Modified: 2018-05-30 07:58 UTC (History)
0 users



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Boris Petrov 2017-09-15 12:06:22 UTC
Hello, this has been happening on ALL 8.5.* versions (I've been upgrading to the newest version for a while now) and possible for a long time before that. Jersey's version is 2.25.1, Tomcat is 8.5.20. This is the stacktrace:

java.lang.NullPointerException: null
        at org.apache.coyote.http11.Http11OutputBuffer.commit(Http11OutputBuffer.java:368)
        at org.apache.coyote.http11.Http11Processor.prepareResponse(Http11Processor.java:1304)
        at org.apache.coyote.AbstractProcessor.action(AbstractProcessor.java:258)
        at org.apache.coyote.Response.action(Response.java:175)
        at org.apache.coyote.Response.sendHeaders(Response.java:357)
        at org.apache.catalina.connector.OutputBuffer.doFlush(OutputBuffer.java:303)
        at org.apache.catalina.connector.OutputBuffer.flush(OutputBuffer.java:284)
        at org.apache.catalina.connector.CoyoteOutputStream.flush(CoyoteOutputStream.java:118)
        at org.glassfish.jersey.servlet.internal.ResponseWriter$NonCloseableOutputStreamWrapper.flush(ResponseWriter.java:330)
        at org.glassfish.jersey.message.internal.CommittingOutputStream.flush(CommittingOutputStream.java:292)
        at org.glassfish.jersey.message.internal.OutboundMessageContext.close(OutboundMessageContext.java:876)
        at org.glassfish.jersey.server.ContainerResponse.close(ContainerResponse.java:412)
        at org.glassfish.jersey.server.ServerRuntime$Responder.writeResponse(ServerRuntime.java:784)
        at org.glassfish.jersey.server.ServerRuntime$Responder.processResponse(ServerRuntime.java:444)
        at org.glassfish.jersey.server.ServerRuntime$Responder.process(ServerRuntime.java:434)
        at org.glassfish.jersey.server.ServerRuntime$AsyncResponder$3.run(ServerRuntime.java:934)
        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
        at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
        at org.glassfish.jersey.server.ServerRuntime$AsyncResponder.resume(ServerRuntime.java:966)
        at org.glassfish.jersey.server.ServerRuntime$AsyncResponder.resume(ServerRuntime.java:922)
        at com.company.rest.Contents.lambda$getContent$3(Contents.java:146)
        ...

What we're doing in "Contents.java:146" is something like:

	EntityTag entityTag = getEntityTag(content);
	ResponseBuilder responseBuilder = request.evaluatePreconditions(entityTag);
	if (responseBuilder == null) {
		// cached resource did change -> serve updated content
		responseBuilder = createResponseBuilder(range, content).tag(entityTag);
	}
	asyncResponse.resume(responseBuilder.build());

This "content" thing is practically an InputStream - i.e. with this request a blob is returned. The funny thing is that this happens ALWAYS when returning "tiff" images. Not sure if this matters of course or if the problem isn't actually in something from our side. But anyways, this should not blow up with a NPE. :)

Please tell me if you need any more information.
Comment 1 Mark Thomas 2017-09-15 20:00:36 UTC
We'll need a simple test case to reproduce the problem for this one.

From the stack trace it looks like the async task is retaining a reference to the OutputStream after the associated AsyncContext has completed. There could be various root causes of that in both Tomcat and the app - hence the request for a test case.
Comment 2 Boris Petrov 2017-09-18 15:19:50 UTC
Hello, I'm trying to reproduce it but that proves to be difficult.

A few things that I noticed while trying stuff: as I said, this happens when we return "tiff" images as blobs. It happens when I open a simple page which has only a single <img> tag in it with the link to the tiff image. It blows up when I open this page in Chrome and Firefox. Doesn't blow up in Safari. Doesn't blow up when I hit the link to the image with curl and wget. The interesting thing is that Safari CAN display tiff images while Chrome and Firefox can't. Also, in the network tabs, it seems that the responses are smaller than the file is itself, which I guess means they didn't receive the full file. Is it possible that Chrome and Firefox are closing the connection before they receive the whole image, knowing that they can't actually display it, and that causes the NPE?

I can actually debug in Eclipse the NPE (it stops there when it happens) so if I can help you somehow debug it via me... :) I could tell you values of variables, etc.

I'll continue to try and create a reproduction but let's try to figure it out without that.

Thanks.
Comment 3 Boris Petrov 2017-09-21 15:02:42 UTC
A little bit more info - I've updated Tomcat to 8.5.21 and Jersey to 2.26. The exception happens again. The following occurs:

1) I hard-refresh the page where a tiff image is shown in an <img> tag. Nothing happens.

2) I soft-refresh the page - the exception happens then. What is interesting is the request that Chrome has made - it contains a "Range:bytes=519677-519677" header. Our guess is that Chrome has cached the URL and "knows" this is an image and needs more bytes or something. We respond with "content-length:1, content-range:bytes 519677-519677/27672280" which causes the exception I guess.

This is the MessageBodyWriter we use:

@Override
public void writeTo(ContentRangeProxy t, Class<?> type, Type genericType, Annotation[] annotations, MediaType mediaType,
		MultivaluedMap<String, Object> httpHeaders, OutputStream entityStream) throws IOException {

	try (InputStream inputStream = t.content.getStream()) {
		if (t.rangeHeader != null) {
			try (InputStream cutInputStream = t.rangeHeader.cutStream(inputStream, t.content.getSize().getValue())) {
				ByteStreams.copy(cutInputStream, entityStream);
			}
		} else {
			ByteStreams.copy(inputStream, entityStream);
		}
	} catch (ClientAbortException exception) {
		// this is to handle the case when the HTTP connection is closed and the entityStream becomes unwritable
	}
}

And the cutStream method is implemented like so:

public InputStream cutStream(InputStream stream, Long contentLength) throws IOException {
	if (from != -1) {
		ByteStreams.skipFully(stream, from);
	}
	return to == -1 ?
		stream :
		ByteStreams.limit(stream, contentLength == null ? ((to - from) + 1) : getContentLength(contentLength));
}

I hope this helps a bit more.
Comment 4 Boris Petrov 2017-10-02 13:24:44 UTC
The issue appears also in Tomcat 9.0.1. The stacktrace:

java.lang.NullPointerException: null
    at org.apache.coyote.http11.Http11OutputBuffer.commit(Http11OutputBuffer.java:327)
    at org.apache.coyote.http11.Http11Processor.prepareResponse(Http11Processor.java:1054)
    at org.apache.coyote.AbstractProcessor.action(AbstractProcessor.java:253)
    at org.apache.coyote.Response.action(Response.java:178)
    at org.apache.coyote.Response.sendHeaders(Response.java:376)
    at org.apache.catalina.connector.OutputBuffer.doFlush(OutputBuffer.java:280)
    at org.apache.catalina.connector.OutputBuffer.flush(OutputBuffer.java:261)
    at org.apache.catalina.connector.CoyoteOutputStream.flush(CoyoteOutputStream.java:118)
    at org.glassfish.jersey.servlet.internal.ResponseWriter$NonCloseableOutputStreamWrapper.flush(ResponseWriter.java:330)
    at org.glassfish.jersey.message.internal.CommittingOutputStream.flush(CommittingOutputStream.java:287)
    at org.glassfish.jersey.message.internal.OutboundMessageContext.close(OutboundMessageContext.java:847)
    at org.glassfish.jersey.server.ContainerResponse.close(ContainerResponse.java:413)
    at org.glassfish.jersey.server.ServerRuntime$Responder.writeResponse(ServerRuntime.java:734)
    at org.glassfish.jersey.server.ServerRuntime$Responder.processResponse(ServerRuntime.java:395)
    at org.glassfish.jersey.server.ServerRuntime$Responder.process(ServerRuntime.java:385)
    at org.glassfish.jersey.server.ServerRuntime$AsyncResponder$3.run(ServerRuntime.java:884)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:272)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:268)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:316)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:298)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:268)
    at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:289)
    at org.glassfish.jersey.server.ServerRuntime$AsyncResponder.resume(ServerRuntime.java:916)
    at org.glassfish.jersey.server.ServerRuntime$AsyncResponder.resume(ServerRuntime.java:872)
    at com.company.rest.Contents.lambda$getContent$3(Contents.java:146)
Comment 5 Mark Thomas 2017-11-08 09:51:02 UTC
We still need a (simple) test case for this.

My analysis from comment #1 still applies.
Comment 6 Boris Petrov 2017-11-09 10:01:48 UTC
I spent quite some time trying to reproduce it but couldn't manage to do it in a simple test case. :( I really cannot help any more for now. If you prefer, you can close the issue and at some point, if I manage to create some reproduction, I can create a new one. Or you just leave this one open as a reminder.

Thanks for the time!
Comment 7 Konstantin Kolinko 2017-11-10 14:47:19 UTC
Set the following system property (if you have not set it yet):

org.apache.catalina.connector.RECYCLE_FACADES=true

See the last section in https://wiki.apache.org/tomcat/FAQ/Troubleshooting_and_Diagnostics
Comment 8 Mark Thomas 2018-05-30 07:58:11 UTC
No reproduction after 6 months so closing as works for me. If you are able to reproduce this then feel free to re-open and attach the steps to reproduce from a clean install.