If an HTTP request is split into two packets and, due to network congestion (or anything else), the second packet arrives significantly later than the first (5 or more seconds), we have a situation where Apache tries to send data down a closed connection. This happens because Apache first parses the HTTP header, then opens a connection to the backend server (without sending any data), then parses the HTTP body, and then tries to send everything at once to the backend server. When the delay between HTTP header and HTTP body exceeds the timeout of the backend server (5 seconds in my case), the backend server closes the connection. When Apache sends the data to the backend server, it gets a reset from the app server, which results in a 502 error being sent back to the web browser. Per the truss output, below, you can see that the header is processed, the connection is opened, and then the body is parsed. After that, the data is sent to the backend server (unsuccessfully). Either mod_proxy should wait to open the connection to the backend server until after it's processed the body, or Apache should at least check the state of the connection before sending data. Or maybe if it gets a reset, it tries again. ** NOTE INCOMING POST FROM CLIEN 2030/24: 0.0006 read(61, " P O S T /<OMITTED>".., 8000) = 128 2030/24: 0.0006 pollsys(0xFFFFFFFF6B7FB6A8, 1, 0xFFFFFFFF6B7FB5E0, 0x00000000) = 1 2030/24: fd=61 ev=POLLIN rev=POLLIN 2030/24: timeout: 300.000000000 sec 2030/24: 0.0005 read(61, " H o s t : <OMITTED>".., 8000) = 177 2030/24: 0.6335 pollsys(0xFFFFFFFF6B7FB6A8, 1, 0xFFFFFFFF6B7FB5E0, 0x00000000) = 1 2030/24: fd=61 ev=POLLIN rev=POLLIN 2030/24: timeout: 300.000000000 sec 2030/24: 0.0005 read(61, "\r\n", 8000) = 2 2030/24: 0.0117 open("/dev/udp", O_RDONLY) = 62 2030/24: 0.0005 ioctl(62, SIOCGLIFNUM, 0xFFFFFFFF6B7FB174) = 0 2030/24: 0.0003 close(62) = 0 2030/24: 0.0007 open("/etc/default/nss", O_RDONLY|O_LARGEFILE) = 62 2030/24: 0.0004 fcntl(62, F_DUPFD, 0x00000100) = 256 2030/24: 0.0002 close(62) = 0 2030/24: 0.0002 read(256, " # i d e n t\t " @ ( #".., 1024) = 1024 2030/24: 0.0002 read(256, " y t h o s e\n # f u n".., 1024) = 211 2030/24: 0.0004 read(256, 0x10071B5C4, 1024) = 0 2030/24: 0.0002 close(256) = 0 2030/24: 0.0004 so_socket(PF_INET, SOCK_STREAM, IPPROTO_TCP, "", SOV_DEFAULT) = 62 2030/24: 0.0003 setsockopt(62, tcp, TCP_NODELAY, 0xFFFFFFFF6B7FB66C, 4, SOV_DEFAULT) = 0 ** NOW APACHE OPENS A CONNECTION TO THE BACKEND SERVER 2030/24: 0.0003 fcntl(62, F_GETFL) = 2 2030/24: 0.0001 fcntl(62, F_SETFL, FWRITE|FNONBLOCK) = 0 2030/24: 0.0005 connect(62, 0x100310CA8, 16, SOV_DEFAULT) Err#150 EINPROGRESS 2030/24: AF_INET name = 10.10.5.10 port = 7929 2030/24: 0.0067 pollsys(0xFFFFFFFF6B7FB5A8, 1, 0xFFFFFFFF6B7FB4E0, 0x00000000) = 1 2030/24: fd=62 ev=POLLOUT rev=POLLOUT 2030/24: timeout: 300.000000000 sec 2030/24: 0.0003 getsockopt(62, SOL_SOCKET, SO_ERROR, 0xFFFFFFFF6B7FB668, 0xFFFFFFFF6B7FB66C, SOV_DEFAULT) = 0 2030/24: 0.0004 getsockname(62, 0x100310E48, 0x100310E28, SOV_DEFAULT) = 0 2030/24: AF_INET name = 10.10.4.10 port = 44706 2030/24: pollsys(0xFFFFFFFF6B7FB338, 1, 0xFFFFFFFF6B7FB270, 0x00000000) (sleeping...) 2030/24: fd=61 ev=POLLIN rev=0xFFFFB49C 2030/24: timeout: 300.000000000 sec ** NOTE THE DELAY IN RECEIVING THE BODY: 14.49 SECONDS 2030/24: 14.4911 pollsys(0xFFFFFFFF6B7FB338, 1, 0xFFFFFFFF6B7FB270, 0x00000000) = 1 2030/24: fd=61 ev=POLLIN rev=POLLIN 2030/24: timeout: 300.000000000 sec ** HERE'S THE BODY 2030/24: 0.0005 read(61, " j _ u s e r n a m e = ".., 8000) = 52 2030/24: 0.0009 writev(62, 0xFFFFFFFF6B7FB650, 14) = 480 ** HERE'S WHERE APACHE TRIES TO WRITE TO THE BACKEND CONNECTION WHICH ** HAS ALREADY EXPIRED AND CLOSED 2030/24: iov_base = 0x1006FA5C0 iov_len = 38 2030/24: P O S T /<SOME PATH. DELETED FOR THIS EXAMPLE> ** THE BACKEND SERVER SENDS US A RESET 2030/24: 0.0009 read(62, 0x10071B0C8, 8000) Err#131 ECONNRESET ** A PROXY ERROR IS THROWN BACK TO THE WEB BROWSER/CLIENT 2030/24: 0.0007 write(28, 0xFFFFFFFF6B7F7680, 203) = 203 2030/24: [ T h u O c t 1 1 1 4 : 3 3 : 0 3 2 0 0 7 ] [ e r r o 2030/24: r ] [ c l i e n t 1 0 . 1 0 . 4 . 1 0 ] ( 1 3 1 ) C o n n 2030/24: e c t i o n r e s e t b y p e e r : p r o x y : e r r 2030/24: o r r e a d i n g s t a t u s l i n e f r o m r e m o 2030/24: t e s e r v e r XX , r e f e r e r : SOMEURL\n 2030/24: 0.0006 write(28, 0xFFFFFFFF6B7F75A0, 184) = 184 2030/24: [ T h u O c t 1 1 1 4 : 3 3 : 0 3 2 0 0 7 ] [ e r r o 2030/24: r ] [ c l i e n t 1 0 . 1 0 . 4 . 1 0 ] p r o x y : E r 2030/24: r o r r e a d i n g f r o m r e m o t e s e r v e r r 2030/24: e t u r n e d b y A_URL, r e f e r e r : SOME_URL\n 2030/24: 0.0004 close(62) = 0 2030/24: 0.0006 brk(0x10071FD60) = 0 2030/24: 0.0002 brk(0x100723D60) = 0 2030/24: 0.0006 writev(61, 0xFFFFFFFF6B7FB930, 2) = 576 2030/24: iov_base = 0x10071D0D8 iov_len = 154 2030/24: H T T P / 1 . 1 5 0 2 P r o x y E r r o r\r\n D a t e : 2030/24: T h u , 1 1 O c t 2 0 0 7 2 1 : 3 2 : 4 7 G M T\r\n V 2030/24: a r y : A c c e p t - E n c o d i n g\r\n C o n t e n t - L e 2030/24: n g t h : 4 2 2\r\n C o n t e n t - T y p e : t e x t / h t 2030/24: m l ; c h a r s e t = i s o - 8 8 5 9 - 1\r\n\r\n 2030/24: iov_base = 0x10071B0C8 iov_len = 422 2030/24: < ! D O C T Y P E H T M L P U B L I C " - / / I E T F / / 2030/24: D T D H T M L 2 . 0 / / E N " >\n < h t m l > < h e a d >\n 2030/24: < t i t l e > 5 0 2 P r o x y E r r o r < / t i t l e >\n < 2030/24: / h e a d > < b o d y >\n < h 1 > P r o x y E r r o r < / h 1 2030/24: >\n < p > T h e p r o x y s e r v e r r e c e i v e d a 2030/24: n i n v a l i d\r\n r e s p o n s e f r o m a n u p s t 2030/24: r e a m s e r v e r . < b r / >\r\n T h e p r o x y s e 2030/24: r v e r c o u l d n o t h a n d l e t h e r e q u e s 2030/24: t < e m > < a h r e f = "A_URL" > P O S T & n b s p ; URL< / a > < / e m > . < p >\n R e a s o n : < s t 2030/24: r o n g > E r r o r r e a d i n g f r o m r e m o t e s 2030/24: e r v e r < / s t r o n g > < / p > < / p >\n < / b o d y > < / 2030/24: h t m l >\n
Created attachment 21565 [details] quick&dirty hack to try patching it Can you test-drive the attached patch (against 2.2.8) with SetEnv fix43607 1 in your proxy's scope? (Don't try this on a server that matters. This patch will want extensive cleaning if it works, and scrapping if it doesn't)
marking NEEDINFO pending feedback
This was occurring in a production environment, so testing the patch will be a problem. I will have to try to create a test environment and then duplicate the packet loss/retransmission scenario that first brought this to our attention. That may not happen for a while.
OK, reverting to NEW, to maximise the chances of it getting attention from anyone (including you or me) who might take an interest.
There are several recent proxy fixes that could affect this, including a race condition that got fixed in 2.2.10 (pr#37770). Does this still exist in 2.2.11?
I can confirm that this bug still exists in 2.2.11. We proxy requests to Jetty and under certain conditions, Jetty reduces the socket timeout for incoming connections to 2 seconds. This means if a client (such as Apache) connects and does not send it any data within 2 seconds, it will close the socket. By splitting an HTTP request into multiple packets (first packet contains HTTP headers and part of the POST body, second packet contains the rest of POST body), and making sure there is more than 2 seconds delay between the two packets, Apache/mod_prozy will always return a 502 response (and the request is never sent to Jetty).
Please help us to refine our list of open and current defects; this is a mass update of old and inactive Bugzilla reports which reflect user error, already resolved defects, and still-existing defects in httpd. As repeatedly announced, the Apache HTTP Server Project has discontinued all development and patch review of the 2.2.x series of releases. The final release 2.2.34 was published in July 2017, and no further evaluation of bug reports or security risks will be considered or published for 2.2.x releases. All reports older than 2.4.x have been updated to status RESOLVED/LATER; no further action is expected unless the report still applies to a current version of httpd. If your report represented a question or confusion about how to use an httpd feature, an unexpected server behavior, problems building or installing httpd, or working with an external component (a third party module, browser etc.) we ask you to start by bringing your question to the User Support and Discussion mailing list, see [https://httpd.apache.org/lists.html#http-users] for details. Include a link to this Bugzilla report for completeness with your question. If your report was clearly a defect in httpd or a feature request, we ask that you retest using a modern httpd release (2.4.33 or later) released in the past year. If it can be reproduced, please reopen this bug and change the Version field above to the httpd version you have reconfirmed with. Your help in identifying defects or enhancements still applicable to the current httpd server software release is greatly appreciated.