We have a reverse proxy from Apache to a Tomcat instance. Apache is configured like this:

# Proxy tomcat
SetEnv proxy-initial-not-pooled
ProxyRequests Off
ProxyPreserveHost On

# Set timeout higher than Vaadin's "heartbeatInterval"
ProxyTimeout 300

# Proxy "/pcom" through to tomcat (Vaadin application)
ProxyPass /pcom/ http://127.0.0.1:8080/pcom/ retry=0
ProxyPassReverse /pcom/ http://127.0.0.1:8080/pcom/
#ProxyPassReverseCookieDomain 127.0.0.1:8080 SERVERNAME
#ProxyPassReverseCookiePath /pcom /pcom

# Proxy "/VAADIN" through to tomcat (Vaadin application)
ProxyPass /VAADIN/ http://127.0.0.1:8080/VAADIN/ retry=0
ProxyPassReverse /VAADIN/ http://127.0.0.1:8080/VAADIN/

# Proxy "/static" through to tomcat
ProxyPass /static/ http://127.0.0.1:8080/static/ retry=0
ProxyPassReverse /static/ http://127.0.0.1:8080/static/

# Proxy "/api" through to tomcat (REST API)
ProxyPass /api/ http://127.0.0.1:8080/ws/api/ retry=0
ProxyPassReverse /api/ http://127.0.0.1:8080/ws/api/

# Make websocket requests work properly for the above mappings as well
RewriteEngine On
RewriteCond %{HTTP:Upgrade} websocket [NC]
RewriteCond %{HTTP:Connection} upgrade [NC]
RewriteRule ^/pcom/ "ws://127.0.0.1:8080%{REQUEST_URI}" [P]
RewriteCond %{HTTP:Upgrade} websocket [NC]
RewriteCond %{HTTP:Connection} upgrade [NC]
RewriteRule ^/VAADIN/ "ws://127.0.0.1:8080%{REQUEST_URI}" [P]
RewriteCond %{HTTP:Upgrade} websocket [NC]
RewriteCond %{HTTP:Connection} upgrade [NC]
RewriteRule ^/static/ "ws://127.0.0.1:8080%{REQUEST_URI}" [P]
RewriteCond %{HTTP:Upgrade} websocket [NC]
RewriteCond %{HTTP:Connection} upgrade [NC]
RewriteRule ^/api/ "ws://127.0.0.1:8080%{REQUEST_URI}" [P]

Note that ProxyTimeout is set to 5 minutes because the client web application only polls for new data every ~3 minutes. This has worked for a long time, most recently with Apache 2.4.43. After upgrading to 2.4.51, we started getting disconnects in the web application.
So something was broken in a version in the range (2.4.43, 2.4.51]. After turning up the logging level, it became apparent that the proxy timeout being used in 2.4.51 was actually 60 seconds instead of 5 minutes, even though our config has not changed. You can see it happen here. Note the line "proxy: websocket: tunnel running (timeout 60.000000)" at 16:28:11.114445, then the timeout occurring exactly 60 seconds later with "AH10213: proxy: websocket: polling timed out (client=31, origin=31)":

[Tue Feb 15 16:28:11.111553 2022] [proxy_wstunnel:trace5] [pid 8825] mod_proxy_wstunnel.c(340): [client 127.0.0.1:47232] handler fallback
[Tue Feb 15 16:28:11.111555 2022] [proxy_http:trace1] [pid 8825] mod_proxy_http.c(1875): [client 127.0.0.1:47232] HTTP: serving URL ws://127.0.0.1:8080/pcom/PUSH?v-uiId=4&v-pushId=7f29d183-3c6a-4c8c-b672-001261fc162f&X-Atmosphere-tracking-id=0&X-Atmosphere-Framework=2.2.13.vaadin5-javascript&X-Atmosphere-Transport=websocket&X-Atmosphere-TrackMessageSize=true&Content-Type=application/json;%20charset=UTF-8&X-atmo-protocol=true
[Tue Feb 15 16:28:11.111561 2022] [proxy:debug] [pid 8825] proxy_util.c(2588): [client 127.0.0.1:47232] AH00944: connecting ws://127.0.0.1:8080/pcom/PUSH?v-uiId=4&v-pushId=7f29d183-3c6a-4c8c-b672-001261fc162f&X-Atmosphere-tracking-id=0&X-Atmosphere-Framework=2.2.13.vaadin5-javascript&X-Atmosphere-Transport=websocket&X-Atmosphere-TrackMessageSize=true&Content-Type=application/json;%20charset=UTF-8&X-atmo-protocol=true to 127.0.0.1:8080
[Tue Feb 15 16:28:11.111580 2022] [proxy:debug] [pid 8825] proxy_util.c(2811): [client 127.0.0.1:47232] AH00947: connected /pcom/PUSH?v-uiId=4&v-pushId=7f29d183-3c6a-4c8c-b672-001261fc162f&X-Atmosphere-tracking-id=0&X-Atmosphere-Framework=2.2.13.vaadin5-javascript&X-Atmosphere-Transport=websocket&X-Atmosphere-TrackMessageSize=true&Content-Type=application/json;%20charset=UTF-8&X-atmo-protocol=true to 127.0.0.1:8080
[Tue Feb 15 16:28:11.114299 2022] [proxy_http:trace3] [pid 8825] mod_proxy_http.c(1198): [client 127.0.0.1:47232] Status from backend: 101
[Tue Feb 15 16:28:11.114321 2022] [proxy_http:trace4] [pid 8825] mod_proxy_http.c(820): [client 127.0.0.1:47232] Headers received from backend:
[Tue Feb 15 16:28:11.114325 2022] [proxy_http:trace4] [pid 8825] mod_proxy_http.c(842): [client 127.0.0.1:47232] Upgrade: websocket
[Tue Feb 15 16:28:11.114331 2022] [proxy_http:trace4] [pid 8825] mod_proxy_http.c(842): [client 127.0.0.1:47232] Connection: upgrade
[Tue Feb 15 16:28:11.114335 2022] [proxy_http:trace4] [pid 8825] mod_proxy_http.c(842): [client 127.0.0.1:47232] Sec-WebSocket-Accept: BBZpglr2TI0qkORtYXDeNDu3PN4=
[Tue Feb 15 16:28:11.114338 2022] [proxy_http:trace4] [pid 8825] mod_proxy_http.c(842): [client 127.0.0.1:47232] Sec-WebSocket-Extensions: permessage-deflate
[Tue Feb 15 16:28:11.114341 2022] [proxy_http:trace4] [pid 8825] mod_proxy_http.c(842): [client 127.0.0.1:47232] Date: Tue, 15 Feb 2022 22:28:10 GMT
[Tue Feb 15 16:28:11.114352 2022] [proxy_http:trace2] [pid 8825] mod_proxy_http.c(1376): [client 127.0.0.1:47232] HTTP: received interim 101 response (policy: n/a)
[Tue Feb 15 16:28:11.114367 2022] [ssl:trace6] [pid 8825] ssl_engine_io.c(856): [client 127.0.0.1:47232] ssl_filter_write: 193 bytes
[Tue Feb 15 16:28:11.114392 2022] [ssl:trace6] [pid 8825] ssl_engine_io.c(215): [client 127.0.0.1:47232] bio_filter_out_write: 215 bytes
[Tue Feb 15 16:28:11.114403 2022] [ssl:trace4] [pid 8825] ssl_engine_io.c(2370): [client 127.0.0.1:47232] OpenSSL: write 215/215 bytes to BIO#56310e553f60 [mem: 56310e56aaa3]
[Tue Feb 15 16:28:11.114424 2022] [core:trace6] [pid 8825] core_filters.c(812): [client 127.0.0.1:47232] writev_nonblocking: 215/215
[Tue Feb 15 16:28:11.114433 2022] [proxy_http:debug] [pid 8825] mod_proxy_http.c(1480): [client 127.0.0.1:47232] AH10239: HTTP: tunneling protocol websocket
[Tue Feb 15 16:28:11.114445 2022] [proxy:trace1] [pid 8825] proxy_util.c(4874): [client 127.0.0.1:47232] AH10212: proxy: websocket: tunnel running (timeout 60.000000)
[Tue Feb 15 16:28:11.114451 2022] [proxy:trace5] [pid 8825] proxy_util.c(4980): [client 127.0.0.1:47232] proxy: websocket: client resume writable
[Tue Feb 15 16:28:11.114458 2022] [proxy:trace2] [pid 8825] proxy_util.c(4653): [client 127.0.0.1:47232] ap_proxy_transfer_between_connections complete (from 127.0.0.1:8080)
[Tue Feb 15 16:28:11.114465 2022] [proxy:trace5] [pid 8825] proxy_util.c(4980): [client 127.0.0.1:47232] proxy: websocket: origin resume writable
[Tue Feb 15 16:28:11.114474 2022] [ssl:trace4] [pid 8825] ssl_engine_io.c(2379): [client 127.0.0.1:47232] OpenSSL: I/O error, 5 bytes expected to read on BIO#56310e557030 [mem: 56310e570b13]
[Tue Feb 15 16:28:11.114478 2022] [proxy:trace2] [pid 8825] proxy_util.c(4653): (11)Resource temporarily unavailable: [client 127.0.0.1:47232] ap_proxy_transfer_between_connections complete (to 127.0.0.1:8080)
[Tue Feb 15 16:28:11.115073 2022] [ssl:trace6] [pid 8825] ssl_engine_io.c(856): [client 127.0.0.1:47232] ssl_filter_write: 44 bytes
[Tue Feb 15 16:28:11.115090 2022] [ssl:trace6] [pid 8825] ssl_engine_io.c(215): [client 127.0.0.1:47232] bio_filter_out_write: 66 bytes
[Tue Feb 15 16:28:11.115093 2022] [ssl:trace4] [pid 8825] ssl_engine_io.c(2370): [client 127.0.0.1:47232] OpenSSL: write 66/66 bytes to BIO#56310e553f60 [mem: 56310e56aaa3]
[Tue Feb 15 16:28:11.115110 2022] [core:trace6] [pid 8825] core_filters.c(812): [client 127.0.0.1:47232] writev_nonblocking: 66/66
[Tue Feb 15 16:28:11.115117 2022] [proxy:trace2] [pid 8825] proxy_util.c(4653): [client 127.0.0.1:47232] ap_proxy_transfer_between_connections complete (from 127.0.0.1:8080)
[Tue Feb 15 16:29:11.124429 2022] [proxy:trace2] [pid 8825] proxy_util.c(4896): [client 127.0.0.1:47232] AH10213: proxy: websocket: polling timed out (client=31, origin=31)
[Tue Feb 15 16:29:11.124641 2022] [ssl:trace6] [pid 8825] ssl_engine_io.c(215): [client 127.0.0.1:47232] bio_filter_out_write: 24 bytes
[Tue Feb 15 16:29:11.124647 2022] [ssl:trace4] [pid 8825] ssl_engine_io.c(2370): [client 127.0.0.1:47232] OpenSSL: write 24/24 bytes to BIO#56310e553f60 [mem: 56310e56aaa3]
[Tue Feb 15 16:29:11.124650 2022] [ssl:trace6] [pid 8825] ssl_engine_io.c(156): [client 127.0.0.1:47232] bio_filter_out_write: flush
[Tue Feb 15 16:29:11.124671 2022] [core:trace6] [pid 8825] core_filters.c(812): [client 127.0.0.1:47232] writev_nonblocking: 24/24
[Tue Feb 15 16:29:11.124675 2022] [ssl:trace3] [pid 8825] ssl_engine_kernel.c(2223): [client 127.0.0.1:47232] OpenSSL: Write: SSL negotiation finished successfully
[Tue Feb 15 16:29:11.124677 2022] [ssl:trace6] [pid 8825] ssl_engine_io.c(156): [client 127.0.0.1:47232] bio_filter_out_write: flush

The web client displays the error "Disconnected from server" exactly 60 seconds after connecting. Where is this 60-second value coming from?? We tried adding "timeout=300" to the ProxyPass lines (even though this timeout is supposed to already default to ProxyTimeout), but that didn't help. So we don't even have a workaround for this bug yet, other than downgrading.
A possible workaround is "ProxyWebsocketFallbackToProxyHttp off" (https://httpd.apache.org/docs/2.4/mod/mod_proxy_wstunnel.html#proxywebsocketfallbacktoproxyhttp). But since 2.4.52 websocket tunneling uses the smallest of Timeout and ProxyTimeout, you may want to try 2.4.52 (and then add "Timeout 300" to your configuration as you already tried in 2.4.51).
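Concretely, either of these would look like the following (a sketch based on the configuration posted above; the directive names are as documented, the values are the reporter's):

```apache
# Option 1: let mod_proxy_wstunnel handle the Upgrade instead of the
# mod_proxy_http fallback (pre-2.4.52 behavior, no tunnel timeout):
ProxyWebsocketFallbackToProxyHttp off

# Option 2 (2.4.52+): raise the client-side Timeout so that
# min(Timeout, ProxyTimeout) stays above the ~3 minute heartbeat:
Timeout 300
ProxyTimeout 300
```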
> But since 2.4.52 websocket tunneling uses the smallest of Timeout and
> ProxyTimeout, you may want to try 2.4.52 (and then add "Timeout 300"
> to your configuration as you already tried in 2.4.51).

Wha? Isn't it wrong to make a completely undocumented and backward-incompatible change like that in a patch release?
FWIW, I can confirm that adding "Timeout 300" does work around the problem:

AH10212: proxy: websocket: tunnel running (timeout 300.000000)

But this still seems like a bug to me. Or at least a poor design decision. Why should ProxyTimeout's DEFAULT value (Timeout) ever override its CONFIGURED value (ProxyTimeout)? I definitely don't get it.
I agree that the need to increase Timeout is not a good solution. Maybe we should change the logic to choose the maximum of both timeouts (front-end socket, back-end socket) instead of the minimum, as backend timeouts can be configured more selectively (per worker if needed) than front-end timeouts, and the backend timeouts typically reflect the application requirements.
Something like the following patch:

Index: modules/proxy/mod_proxy_http.c
===================================================================
--- modules/proxy/mod_proxy_http.c	(revision 1898054)
+++ modules/proxy/mod_proxy_http.c	(working copy)
@@ -1485,10 +1485,10 @@
         return HTTP_INTERNAL_SERVER_ERROR;
     }
 
-    /* Set timeout to the lowest configured for client or backend */
+    /* Set timeout to the highest configured for client or backend */
     apr_socket_timeout_get(backend->sock, &backend_timeout);
     apr_socket_timeout_get(ap_get_conn_socket(c), &client_timeout);
-    if (backend_timeout >= 0 && backend_timeout < client_timeout) {
+    if (backend_timeout >= 0 && backend_timeout > client_timeout) {
         tunnel->timeout = backend_timeout;
     }
     else {
Index: modules/proxy/proxy_util.c
===================================================================
--- modules/proxy/proxy_util.c	(revision 1898054)
+++ modules/proxy/proxy_util.c	(working copy)
@@ -4718,10 +4718,10 @@
     tunnel->origin->pfd->desc.s = ap_get_conn_socket(c_o);
     tunnel->origin->pfd->client_data = tunnel->origin;
 
-    /* Defaults to the smallest timeout of both connections */
+    /* Defaults to the biggest timeout of both connections */
     apr_socket_timeout_get(tunnel->client->pfd->desc.s, &timeout);
     apr_socket_timeout_get(tunnel->origin->pfd->desc.s, &tunnel->timeout);
-    if (timeout >= 0 && (tunnel->timeout < 0 || tunnel->timeout > timeout)) {
+    if (timeout >= 0 && (tunnel->timeout < 0 || tunnel->timeout < timeout)) {
         tunnel->timeout = timeout;
     }
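In isolation, the before/after selection logic boils down to this (a standalone sketch, not the httpd source; -1 stands for APR's "no timeout", and as in the patch a -1 side is replaced by the other side's finite timeout in both variants):

```c
/* Sketch of the 2.4.52 behavior: the tunnel inherits the SMALLER of the
 * client (Timeout) and backend (ProxyTimeout) socket timeouts. */
static long tunnel_timeout_before(long client_timeout, long backend_timeout)
{
    long t = backend_timeout;
    if (client_timeout >= 0 && (t < 0 || t > client_timeout))
        t = client_timeout;
    return t;
}

/* Sketch of the patched behavior: the tunnel inherits the LARGER of the
 * two, so a selectively raised ProxyTimeout wins over the global Timeout. */
static long tunnel_timeout_after(long client_timeout, long backend_timeout)
{
    long t = backend_timeout;
    if (client_timeout >= 0 && (t < 0 || t < client_timeout))
        t = client_timeout;
    return t;
}
```

With the default Timeout of 60 and the reporter's ProxyTimeout of 300, the old logic yields 60 (the observed disconnect) and the new one 300.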
Another option instead of increasing the Timeout is "ProxyWebsocketFallbackToProxyHttp off", but in this case no timeout is applied to the websocket connection at all and the proxy waits until one side closes the connection.
(In reply to Ruediger Pluem from comment #5)
> Something like the following patch:

+1
(In reply to Archie Cobbs from comment #3)
> Why should ProxyTimeout's DEFAULT value (Timeout) ever override its
> CONFIGURED value (ProxyTimeout)? I definitely don't get it.

The rationale is/was that since Timeout is the one applied on the client connection side, and ProxyTimeout (or the ProxyPass timeout= parameter) the one(s) on the backend side, we'd use the smallest when tunneling between the client and the backend, because if either side times out the tunnel should be terminated. But this was probably a poor choice, because in most cases it effectively forces existing configurations to set both timeouts to the same/highest value. Poor choices happen..

As Rüdiger said, note that in 2.4.43 no timeout was enforced in mod_proxy for websocket tunneling (neither ProxyTimeout nor Timeout, so the timeout was actually infinite); your configuration worked regardless of ProxyTimeout, probably thanks to the heartbeat. The above change was made precisely to honor a timeout while tunneling, without an ad hoc mechanism. Let's change this again for 2.4.53.
(In reply to Yann Ylavic from comment #7)
> (In reply to Ruediger Pluem from comment #5)
> > Something like the following patch:
>
> +1

r1898127
(In reply to Ruediger Pluem from comment #9)
> (In reply to Yann Ylavic from comment #7)
> > (In reply to Ruediger Pluem from comment #5)
> > > Something like the following patch:
> >
> > +1
>
> r1898127

Thanks! Since the mod_proxy_http hunk of your patch in comment 5 is 2.4 only, how about we simply remove the tunnel->timeout setting there in 2.4 (on backport), given that it's already what the ap_proxy_tunnel_create() called just above will do?
Thanks guys, appreciate the fast response. I have two updates and a new question related to this.

The updates are:

1 - Adding "ProxyWebsocketFallbackToProxyHttp off" does in fact work around the problem.

2 - Previously I said that setting "Timeout 300" also works around the problem, but that is not entirely accurate. What that does is simply change how long it takes for the particular problem I'm seeing to occur, from 60 seconds to 5 minutes. This is as you would probably expect.

Here's my question:

> As Rüdiger said, note that in 2.4.43 no timeout was enforced in
> mod_proxy for websocket tunneling (neither ProxyTimeout nor Timeout,
> so the timeout was actually infinite), your configuration worked
> regardless of ProxyTimeout probably thanks to the heartbeat. The
> above change was made precisely to honor a timeout while tunneling
> without an ad hoc mechanism.

OK, this makes sense. Basically my web application is using websockets and sending a heartbeat every ~3 minutes. This websocket connection was previously being proxied by mod_proxy to Tomcat, and mod_proxy was not applying any particular timeout to that connection, so it would stay up indefinitely.

But here's my question: isn't a websocket connection SUPPOSED to stay up indefinitely? Or at least, shouldn't any "timeout" value that is applied to that connection be an IDLE timeout rather than an ABSOLUTE timeout (measured from when the connection was started)?

In other words, it seems to me like whatever mod_proxy was doing is more correct, at least in the scenario of a proxied websocket connection, than what mod_wstunnel is doing. Can you explain to me what I'm missing? Thanks.
(In reply to Archie Cobbs from comment #11)
> Or at least, shouldn't any "timeout" value that is applied to that
> connection be an IDLE timeout rather than an ABSOLUTE timeout (measured
> from when the connection was started)?

It is an idle timeout. Connections with activity stay up.
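To illustrate the distinction (a toy model, not httpd code): with an idle timeout, each piece of traffic restarts the clock, so only the gap since the last activity matters, not the total age of the connection.

```c
/* Toy model of an idle timeout: given timestamps (in seconds, with the
 * connection established at t=0) of the activity seen so far, the
 * connection has timed out only if more than `timeout` seconds have
 * passed since the most recent activity. */
static int idle_timed_out(const long *activity, int n, long timeout, long now)
{
    long last = 0; /* connection establishment counts as activity */
    for (int i = 0; i < n; i++) {
        if (activity[i] > last)
            last = activity[i];
    }
    return (now - last) > timeout;
}
```

Under this model, traffic every ~180 seconds keeps a 300-second idle timeout at bay indefinitely, but not a 60-second one.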
> > Or at least, shouldn't any "timeout" value that is applied to that
> > connection be an IDLE timeout rather than an ABSOLUTE timeout (measured
> > from when the connection was started)?
>
> It is an idle timeout. Connections with activity stay up.

Yes, you're right. What Vaadin appears to be doing is creating more than one websocket connection, and sending its heartbeat messages on only one of them. Actually that does make sense: the purpose of the Vaadin heartbeat is to let the server know that the client is still alive, not to keep the websocket connection(s) alive.

Philosophically, timeouts are appropriate for a request/response world, but the websockets world is a completely different beast. When doing websockets, it really should be left up to the two endpoints to decide if/when the connection has "timed out" and should be torn down; any intervening proxies should not apply any timeout (in general) because (in general) they have no special knowledge of what the two endpoint applications are doing with their TCP connection.

So I'd say that there is another subtle problem here, which is that there really should be two separate configuration variables for configuring (a) the proxy timeout for regular connections and (b) the proxy timeout for websocket connections.
> So I'd say that there is another subtle problem here, which is that there
> really should be two separate configuration variables for configuring (a)
> the proxy timeout for regular connections and (b) the proxy timeout for
> websocket connections.

Any thoughts on this question?

Right now the only way to have different timeout values for normal vs. websocket connections is to say "ProxyWebsocketFallbackToProxyHttp off", which seems like the wrong tool.
(In reply to Archie Cobbs from comment #14)
> > So I'd say that there is another subtle problem here, which is that there
> > really should be two separate configuration variables for configuring (a)
> > the proxy timeout for regular connections and (b) the proxy timeout for
> > websocket connections.
>
> Any thoughts on this question?
>
> Right now the only way to have different timeout values for normal vs.
> websockets is to say "ProxyWebsocketFallbackToProxyHttp off" which seems
> like the wrong tool.

You can configure an individual timeout per wss:// backend worker. Setting it to an arbitrarily high number should give you what you need.
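For example (a sketch only; the /pcom/PUSH path is taken from the logs above and would need adjusting to the real websocket endpoint — the point is that the timeout= worker parameter applies per ProxyPass/worker, as documented for mod_proxy):

```apache
# Dedicated worker for the websocket endpoint with its own high idle
# timeout, independent of the ProxyTimeout used for plain HTTP traffic:
ProxyPass /pcom/PUSH ws://127.0.0.1:8080/pcom/PUSH timeout=86400
ProxyPassReverse /pcom/PUSH ws://127.0.0.1:8080/pcom/PUSH

# Everything else keeps the shorter request/response timeout:
ProxyTimeout 300
```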
Proposed for backport as https://github.com/apache/httpd/pull/295
> You can configure an individual timeout per wss:// backend worker.
> Setting it to an arbitrary high number should give you what you need.

Thanks - I didn't realize that.
(In reply to Ruediger Pluem from comment #16)
> Proposed for backport as https://github.com/apache/httpd/pull/295

Backported as r1898559. Will be part of the next release.