Bug 57265 - Tomcat 8 hidden behind NGINX fails to send file when using NIO connector
Status: RESOLVED FIXED
Product: Tomcat 8
Classification: Unclassified
Component: Connectors
Version: 8.0.15
Hardware: PC Linux
Importance: P2 major
Target Milestone: ----
Assigned To: Tomcat Developers Mailing List
Duplicates: 58011
Depends on:
Blocks:

Reported: 2014-11-26 11:24 UTC by Jaroslav Kamenik
Modified: 2015-09-23 12:37 UTC



Description Jaroslav Kamenik 2014-11-26 11:24:30 UTC
We have moved our Tomcat 8 server behind an nginx load-balancing server and have started experiencing this problem:

org.apache.tomcat.util.net.NioEndpoint$NioBufferHandler@2001a157
26-Nov-2014 11:37:04.476 SEVERE [http-nio-8443-ClientPoller-0] org.apache.tomcat.util.net.NioEndpoint$Poller.processSendfile 
 java.lang.IllegalArgumentException: You can only read using the application read buffer provided by the handler.
	at org.apache.tomcat.util.net.SecureNioChannel.write(SecureNioChannel.java:489)
	at sun.nio.ch.FileChannelImpl.transferToArbitraryChannel(FileChannelImpl.java:534)
	at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:583)
	at org.apache.tomcat.util.net.NioEndpoint$Poller.processSendfile(NioEndpoint.java:1200)
	at org.apache.tomcat.util.net.NioEndpoint$Poller.processKey(NioEndpoint.java:1122)
	at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1087)
	at java.lang.Thread.run(Thread.java:745)


The problem occurs irregularly when loading lots of scripts referenced by the homepage.
It seems to be OK with useSendfile=false. I tried adding some slow logging (with flushed output) to the code and it lowers the occurrence rate, so it looks like a race condition.
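
For readers following the stack trace, here is a minimal Java sketch of the kind of guard that can produce this exception. It is not Tomcat's code: the class GuardedChannelSketch, its fields and the helper transferTempFile are hypothetical names, and the real SecureNioChannel is more involved. The sketch assumes only what the trace suggests: FileChannel.transferTo writes to an arbitrary target channel through a temporary buffer of its own, and the channel rejects any buffer other than its own unless a sendfile flag is set.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for a channel that guards its write buffer.
class GuardedChannelSketch implements WritableByteChannel {
    private final ByteBuffer appBuffer = ByteBuffer.allocate(8192);
    private boolean sendFile = false; // assumed flag, named after the one discussed in this bug
    private boolean open = true;

    void setSendFile(boolean sendFile) { this.sendFile = sendFile; }

    @Override
    public int write(ByteBuffer src) {
        // Reject any buffer other than our own unless a sendfile is in progress.
        if (src != appBuffer && !sendFile) {
            throw new IllegalArgumentException(
                "You can only read using the application read buffer provided by the handler.");
        }
        int n = src.remaining();
        src.position(src.limit()); // pretend the bytes were encrypted and sent
        return n;
    }

    @Override public boolean isOpen() { return open; }
    @Override public void close() { open = false; }

    // Writes `size` zero bytes to a temp file and sends it through the channel.
    static long transferTempFile(GuardedChannelSketch target, int size) {
        Path tmp = null;
        try {
            tmp = Files.createTempFile("sendfile-sketch", ".bin");
            Files.write(tmp, new byte[size]);
            try (FileChannel in = FileChannel.open(tmp, StandardOpenOption.READ)) {
                long sent = 0;
                while (sent < size) {
                    // The target is not a file or socket channel, so the JDK
                    // copies through a temporary buffer and calls target.write().
                    long n = in.transferTo(sent, size - sent, target);
                    if (n <= 0) break;
                    sent += n;
                }
                return sent;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (tmp != null) {
                try { Files.deleteIfExists(tmp); } catch (IOException ignored) { }
            }
        }
    }
}
```

With the flag left at false, transferTempFile fails with the IllegalArgumentException shown above; after setSendFile(true) the same transfer completes. The reported failure therefore corresponds to the flag reading false at the moment transferTo calls write.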
Comment 1 Mark Thomas 2014-12-03 18:45:36 UTC
Just to confirm my reading of the stacktrace, you are using Sendfile with SSL, right?

If that is the case, are you sure using Sendfile is providing any benefit?
Comment 2 Mark Thomas 2014-12-04 13:34:10 UTC
Hard to see how this could happen. A possible fix is:
https://github.com/markt-asf/tomcat/commit/9b18a9f97b88572b34f88c456b04d519b8480827
Comment 3 Mark Thomas 2014-12-05 11:57:10 UTC
Looking at this again.

Making the sendFile flag volatile isn't going to help since it is always correctly set to true in the same thread where it is read and found to be false (which triggers the reported stack trace).
Comment 4 Mark Thomas 2014-12-05 13:36:42 UTC
I've applied a possible fix to trunk, 8.0.x (for 8.0.16 onwards) and 7.0.x (for 7.0.58 onwards).

If this doesn't fix the issue, please feel free to re-open this and provide an updated stack trace.
Comment 5 Josh Wojcik 2015-02-20 19:29:04 UTC
This bug still appears using the NIO connector and SSL.  Using Non-SSL, the bug is fixed.

We are using tomcat version 7.0.59.

Can you apply your fix to SSL?
Comment 6 Mark Thomas 2015-06-09 09:04:31 UTC
*** Bug 58011 has been marked as a duplicate of this bug. ***
Comment 7 Mark Thomas 2015-06-09 09:05:20 UTC
The previous fix applied equally to HTTP and HTTPS. I'll do another code review.
Comment 8 Mark Thomas 2015-06-09 09:57:48 UTC
I've looked at this several times now. The only explanation I can come up with is that an update of the sendFile flag made by a previous processing thread is delayed and applied at just the wrong moment, which breaks the current processing thread. Making the sendFile flag volatile should therefore fix this.

I have applied this fix to trunk (9.0.x), 8.0.x (for 8.0.24 onwards) and 7.0.x (for 7.0.63 onwards). If you see this issue after updating to one of the above releases (or later) please re-open and provide an updated stack trace.
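
As a rough illustration of what volatile does and does not provide here, the following Java sketch shows a flag written by one thread and polled by another. The class VolatileFlagSketch and its methods are hypothetical and unrelated to Tomcat's sources. The volatile keyword guarantees that the reader eventually sees the write; it does not order the write relative to other work, so a write that lands at the wrong moment remains possible.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical holder for a flag shared between two threads.
class VolatileFlagSketch {
    // volatile: a write by one thread becomes visible to later reads
    // by other threads (Java Memory Model happens-before rule).
    private volatile boolean sendFile = false;

    void setSendFile(boolean value) { sendFile = value; }
    boolean isSendFile() { return sendFile; }

    // Starts a writer thread and polls the flag from the calling thread.
    // Returns true once the write is observed, false after a 5 s timeout.
    static boolean readerObservesWrite() {
        VolatileFlagSketch channel = new VolatileFlagSketch();
        Thread writer = new Thread(() -> channel.setSendFile(true));
        writer.start();
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        while (!channel.isSendFile()) {
            if (System.nanoTime() - deadline > 0) {
                return false; // should not happen with a volatile field
            }
        }
        return true;
    }
}
```

Without volatile, the polling loop is allowed to keep reading a stale value. With it, visibility is guaranteed, but two threads that both write the flag can still interleave in an unwanted order unless they synchronize.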
Comment 9 Steve Clark 2015-09-11 13:57:20 UTC
I am experiencing this problem on Linux with Tomcat 8.0.26 using Sendfile with SSL.

Here is the stacktrace:

11-Sep-2015 13:16:46.377 SEVERE [http-nio-443-ClientPoller-0] org.apache.tomcat.util.net.NioEndpoint$Poller.processSendfile 
 java.lang.IllegalArgumentException: You can only read using the application read buffer provided by the handler.
	at org.apache.tomcat.util.net.SecureNioChannel.write(SecureNioChannel.java:498)
	at sun.nio.ch.FileChannelImpl.transferToArbitraryChannel(FileChannelImpl.java:524)
	at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:573)
	at org.apache.tomcat.util.net.NioEndpoint$Poller.processSendfile(NioEndpoint.java:1193)
	at org.apache.tomcat.util.net.NioEndpoint$Poller.processKey(NioEndpoint.java:1122)
	at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1087)
	at java.lang.Thread.run(Thread.java:744)

I am using the following version of Java:

java version "1.7.0_51"
Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)

Please let me know if I can provide you with any more information.
Comment 10 Mark Thomas 2015-09-11 20:05:45 UTC
A few questions:
- How easily can you reproduce this?
- Do you have a test case you can share?
- Are you able to apply a patch to Tomcat to test a possible fix?
- Are you able to test with a custom Tomcat build we provide to test a possible fix?
Comment 11 Steve Clark 2015-09-16 15:57:17 UTC
(In reply to Mark Thomas from comment #10)
> A few questions:
> - How easily can you reproduce this?
> - Do you have a test case you can share?
> - Are you able to apply a patch to Tomcat to test a possible fix?
> - Are you able to test with a custom Tomcat build we provide to test a
> possible fix?

- It is an intermittent problem, so we can't reproduce it at will.
- Unfortunately not. It is happening as part of our commercial web application, which uses ExtJS. This means that there are some large JavaScript files to download when loading the web page.
- Yes, I can apply a patch to Tomcat to test.
- Yes, we can test with a custom Tomcat build.

Do you think this problem could be caused by a lack of memory? Our software is running in a VM and there can sometimes be issues hitting a memory ceiling. I was wondering if this could cause the problem.
Comment 12 Steve Clark 2015-09-16 16:12:20 UTC
(In reply to Steve Clark from comment #11)
> (In reply to Mark Thomas from comment #10)
> > A few questions:
> > - How easily can you reproduce this?
> > - Do you have a test case you can share?
> > - Are you able to apply a patch to Tomcat to test a possible fix?
> > - Are you able to test with a custom Tomcat build we provide to test a
> > possible fix?
> 
> - It is an intermittent problem, so we can't reproduce it at will.
> - Unfortunately not.  It is happening as part of our commercial web
> application that is using ExtJS.  This means that there are some large
> Javascript files to download when loading the web page
> - Yes, I can apply a patch to Tomcat to test
> - Yes, We can test with a custom Tomcat build
> 
> Do you think this problem could be caused by a lack of memory?  Our software
> is running in a VM and there can sometimes be issues hitting a memory
> ceiling.  I was wondering if this could cause the problem

I've just talked to some members of the team and when the issue occurs on their machines they have plenty of free memory so I don't think that this issue is related to a lack of free memory.
Comment 13 Remy Maucherat 2015-09-16 16:47:10 UTC
I think JF ran into this issue with his SSL testing, so he can test. IMO the most likely cause is a concurrent channel close (it would flip the sendFile flag). Could a simple connection error cause that?

I don't understand the setting of the sendFile flag to false in a processSendfile finally block (the flag could remain set until the end of the sendfile instead). This is not going to fix the root cause, but can I make that change in trunk since it would simplify the code?

Also, couldn't the code preventing writing arbitrary buffers be removed? IMO it's not particularly useful and the sendFile flag could be removed.
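
The interleaving suggested above can be replayed deterministically in a small Java sketch. Everything in it is hypothetical (InterleavingSketch, Channel, replay, the checkBuffer switch); it assumes, as the comment does, that closing the channel resets the flag. It models both the current guard and the proposal to drop the guard and the flag.

```java
import java.nio.ByteBuffer;

// Hypothetical single-threaded replay of the suspected race.
class InterleavingSketch {

    static final class Channel {
        final ByteBuffer appBuffer = ByteBuffer.allocate(16);
        final boolean checkBuffer; // false models removing the guard entirely
        boolean sendFile;

        Channel(boolean checkBuffer) { this.checkBuffer = checkBuffer; }

        int write(ByteBuffer src) {
            if (checkBuffer && src != appBuffer && !sendFile) {
                throw new IllegalArgumentException(
                    "You can only read using the application read buffer provided by the handler.");
            }
            int n = src.remaining();
            src.position(src.limit()); // pretend the bytes were sent
            return n;
        }

        // Assumption taken from the comment above: close flips the flag.
        void close() { sendFile = false; }
    }

    // Returns true if the sendfile write survives the interleaving.
    static boolean replay(boolean checkBuffer) {
        Channel ch = new Channel(checkBuffer);
        ch.sendFile = true;                   // poller: sendfile is about to start
        ch.close();                           // other thread: close resets the flag
        try {
            ch.write(ByteBuffer.allocate(4)); // poller: transferTo writes its own buffer
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

Here replay(true) returns false because the guard fires after the flag is reset, while replay(false) returns true. Dropping the check removes the exception but not the underlying interleaving, which matches the remark that it would not fix the root cause.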
Comment 14 Mark Thomas 2015-09-21 14:40:38 UTC
I've back-ported one of Remy's fixes to 8.0.x. Please could you try the 8.0.27-dev build available from http://people.apache.org/~markt/dev/v8.0.27-dev/ and let us know whether that fixes the issue or not.

Note: This is a dev / snapshot build for testing only. It is NOT an official release.
Comment 15 Mark Thomas 2015-09-21 15:08:59 UTC
We really need a test case that reproduces this, otherwise we aren't doing much more than guessing what the problem might be.
Comment 16 Remy Maucherat 2015-09-22 07:56:30 UTC
According to Jean-Frédéric's testing, the exception disappeared for him (no ab errors, and performance is rather good with the poller count setting, so all OK apparently). However, it would need a sync to be fully eliminated, which is clearly not worth it! I would prefer removing the sendfile flag from the channel; actively checking the buffer used is not particularly useful IMO.
Comment 17 Mark Thomas 2015-09-22 08:13:11 UTC
+1
Comment 18 Remy Maucherat 2015-09-22 08:51:03 UTC
Cleanup done and ported in 7.0.65 and 8.0.27, so it is supposed to be fixed.
Comment 19 Steve Clark 2015-09-23 12:06:49 UTC
(In reply to Remy Maucherat from comment #18)
> Cleanup done and ported in 7.0.65 and 8.0.27, so it is supposed to be fixed.

Thanks.  I have tried with version 8.0.27 that I built from the trunk today and have not been able to replicate the issue so far (I guess this is because the code no longer throws the exception I was seeing!).

Is there an ETA for the 8.0.27 release build?
Comment 20 Mark Thomas 2015-09-23 12:37:07 UTC
I'm running through the unit tests now, making sure they pass on all platforms. Once I get a clean run on all three (Windows, OSX, Linux) I'll be tagging 8.0.27 (assuming no new bugs show up between now and when I get a clean run).