|Summary:||Option to strip extra data after websocket request headers|
|Product:||Apache httpd-2||Reporter:||Micha Lenk <micha>|
|Component:||mod_proxy_wstunnel||Assignee:||Apache HTTPD Bugs Mailing List <bugs>|
|Attachments:||Add option to strip extra data after websocket request headers|
Description Micha Lenk 2015-01-12 14:03:36 UTC
Created attachment 32365
Add option to strip extra data after websocket request headers

I am using mod_proxy_wstunnel to run Apache httpd as a reverse proxy that tunnels websocket connections from the client to a given websocket server.

Problem description (What am I trying to solve?)
************************************************

I observed a client that sends some extra data (i.e. an additional CRLF) after the request headers in the request used to establish the websocket connection. Without Apache in between, the communication between client and backend looks like this:

CLIENT                                BACKEND
  | ------ [ HTTP request ] ------>      |
  |        incl. extra* CRLF             |
  |                                      |
  | <----- [ HTTP response ] -------     |
  |  "HTTP/1.1 101 Switching Protocols"  |
  |                                      |
  | ------ [ 1st websocket frame ] ----> |
  |                                      |

* Please note that the client sends an extra CRLF after the mandatory empty line used to terminate the request headers.

Using Apache httpd as reverse proxy, the communication between client and backend looks like this:

CLIENT                              Apache                                BACKEND
  | ---- [ HTTP request ] ----->      |                                      |
  |      incl. extra* CRLF            |                                      |
  |                                  [A]                                     |
  |                                   | ------ [ HTTP request ] --------->   |
  |                                   |        without extra CRLF            |
  |                                   |                                      |
  |                                   | <------- [ HTTP response ] --------- |
  |                                   |  "HTTP/1.1 101 Switching Protocols"  |
  |                                   |                                      |
  | <--- [ HTTP response ] ------     |                                      |
  |    101 Switching Protocols        |                                      |
  |                                  [B]                                     |
  |                                   | -------- [ extra CRLF ] -----------> |
  |                                   |   received from client before [A]    |
  | ---- [ 1st websocket frame ] ---> |                                      |
  |                                  [C]                                     |
  |                                   | ----- [ 1st websocket frame ] -----> |
  |                                   |                                      |

The extra CRLF sent by the client is not needed as per the HTTP specs, so Apache httpd is entirely right to parse the HTTP request without that extra CRLF in [A]. But because the client has already sent the extra CRLF, it lingers in the input filter chain. Eventually mod_proxy_wstunnel enters tunneling mode, i.e. it forwards all data seen on the client connection to the backend connection and vice versa.
This results in the backend response "HTTP/1.1 101 Switching Protocols" being forwarded to the client and, at [B], the extra CRLF being forwarded to the backend. Next, the client sends a websocket frame, which also gets forwarded to the backend.

From the backend's point of view, the key difference is that without Apache the extra CRLF is received before switching to the websocket protocol. The backend server happens to parse request headers leniently enough to swallow the extra CRLF, so everything works just fine. If Apache is configured as a reverse proxy, however, the extra CRLF is received after switching to the websocket protocol. The outcome is a confused websocket parser on the backend server and a broken websocket connection.

Fix approach
************

The proposed patch introduces a new subprocess environment variable "proxy-wstunnel-strip-extra-data" that causes mod_proxy_wstunnel to strip any extra data received on the client socket after receiving the request headers and before connecting to the backend server.
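
To make the ordering problem concrete, here is a small Python toy model of the three cases described above: direct connection, proxied, and proxied with stripping. It is only a sketch and not the attached patch. The function names, the `strip_extra_data` flag (standing in for the "proxy-wstunnel-strip-extra-data" variable), the sample request and the sample frame are all assumptions made up for illustration, and the backend is assumed to swallow stray CRLFs only while it is still in HTTP mode.

```python
CRLF = b"\r\n"

# Hypothetical handshake request; host and path are made up for illustration.
REQUEST = (
    b"GET /chat HTTP/1.1\r\n"
    b"Host: backend.example\r\n"
    b"Upgrade: websocket\r\n"
    b"Connection: Upgrade\r\n"
    b"\r\n"
)
EXTRA = CRLF  # the superfluous CRLF the client sends after the headers
# Masked text frame carrying "hi"; the all-zero mask key leaves the payload as is.
FRAME = b"\x81\x82\x00\x00\x00\x00hi"


def split_headers(stream: bytes) -> tuple[bytes, bytes]:
    """Split a byte stream into the HTTP request and whatever follows it."""
    end = stream.index(CRLF + CRLF) + len(CRLF + CRLF)
    return stream[:end], stream[end:]


def backend_ws_input(before_101: bytes, after_101: bytes) -> bytes:
    """Return the bytes that reach the backend's websocket parser.

    before_101 is everything the backend received before answering
    "101 Switching Protocols"; after_101 is everything received later.
    """
    _request, leftover = split_headers(before_101)
    # Assumed lenient HTTP parsing: stray CRLFs after the headers are dropped.
    while leftover.startswith(CRLF):
        leftover = leftover[len(CRLF):]
    # After the protocol switch nothing is skipped any more.
    return leftover + after_101


def direct(handshake: bytes, frames: bytes) -> bytes:
    """Client talks to the backend without a proxy in between."""
    return backend_ws_input(handshake, frames)


def proxied(handshake: bytes, frames: bytes, strip_extra_data: bool = False) -> bytes:
    """Client talks to the backend through a tunneling reverse proxy."""
    request, pending = split_headers(handshake)  # [A]: proxy parses the request only
    if strip_extra_data:
        pending = b""  # discard data lingering after the request headers
    # [B]/[C]: once tunneling starts, pending data is sent ahead of the frames.
    return backend_ws_input(request, pending + frames)


if __name__ == "__main__":
    print(direct(REQUEST + EXTRA, FRAME))
    print(proxied(REQUEST + EXTRA, FRAME))
    print(proxied(REQUEST + EXTRA, FRAME, strip_extra_data=True))
```

In this model only the plain proxied case delivers the CRLF to the websocket parser ahead of the first frame. With the flag set, the backend sees the same bytes as in the direct case.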