I have a sample Python app running as a Docker container and configured with Apache 2.4 and OpenSSL (python-server.py).

This is a good response:
```
curl -ik --cert '<your cert>' https://<app-host-name>/good --resolve <app-host-name>:443:<ip>
HTTP/1.1 404 Not Found
Date: Fri, 14 Apr 2023 08:06:14 GMT
Server: Apache/2.4.53 (Unix) OpenSSL/3.0.7+
Transfer-Encoding: chunked

This is the good page.%
```

This is a bad response:
```
curl -ik --cert '<your cert>' https://<app-host-name>/bad --resolve <app-host-name>:443:<ip>
HTTP/1.1 200 OK
Date: Fri, 14 Apr 2023 08:18:02 GMT
Server: Apache/2.4.53 (Unix) OpenSSL/3.0.7+
Transfer-Encoding: chunked

HTTP/1.0 b'404 Not Found'
This is the bad page.%
```

This is a malformed response, since it contains b' in the response body.
What's wrong with b' in a response body?
Apache should return whatever response it gets from the app side. Because of the presence of the b' in the response body, Apache is not able to understand or parse it, and by default it returns 200. It should have returned 404 Not Found, since the actual response from the app is 404.

Note the "b" in front of '404 Not Found'. The "b" is Python's syntax for binary data. However, the binary data representation should not bleed through into the HTTP response.
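For illustration, the stray b' is what Python 3 produces when a bytes object is formatted into a str without being decoded first. The snippet below is a minimal sketch of that effect, not the code from python-server.py; the function names are hypothetical.

```python
def build_status_line_buggy(status: bytes) -> str:
    # Formatting bytes with %s inserts the repr of the object,
    # so the b'...' wrapper ends up in the output string.
    return "HTTP/1.0 %s\r\n" % status


def build_status_line_fixed(status: bytes) -> str:
    # Decoding first yields the plain text of the status.
    return "HTTP/1.0 %s\r\n" % status.decode("ascii")


if __name__ == "__main__":
    status = b"404 Not Found"
    print(repr(build_status_line_buggy(status)))
    print(repr(build_status_line_fixed(status)))
```

The first call yields the same `HTTP/1.0 b'404 Not Found'` text that shows up in the bad response; the second yields a well-formed status line.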
(In reply to shashank from comment #2)
> Apache should return whatever response it gets from the app side. Because of
> the presence of the b' in the response body, Apache is not able to understand
> or parse it, and by default it returns 200. It should have returned
> 404 Not Found, since the actual response from the app is 404.
> Note the "b" in front of '404 Not Found'. The "b" is Python's syntax for
> binary data. However, the binary data representation should not bleed
> through into the HTTP response.

Apache doesn't parse the body. Please capture the raw bytes of the complete backend response and attach it here.
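One possible way to capture the raw backend bytes is to talk to the app directly over a plain socket, bypassing Apache. This is only a sketch: it assumes the backend speaks plain HTTP (no TLS) and closes the connection after responding, the function name is hypothetical, and the host and port in the usage comment are placeholders that must be replaced with the real backend address.

```python
import socket


def capture_raw_response(host: str, port: int, path: str,
                         timeout: float = 5.0) -> bytes:
    """Send a bare HTTP/1.0 GET and return every byte the server sends back."""
    request = (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    ).encode("ascii")
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        # Read until the server closes the connection.
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


# Example usage (placeholder address, adjust to the actual backend):
# print(repr(capture_raw_response("127.0.0.1", 8080, "/bad")))
```

Printing the result with `repr()` keeps the CR/LF bytes visible, which makes it easier to see exactly where the status line and headers end.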
This is the info related to the app:
https://github.com/hmshashank/malformed-http

This is the related discussion we opened on GitHub:
https://github.com/apache/airflow/issues/29167

This is the only info I could capture from the logs for both the /good and /bad requests:

cat malformed-http_apache2_ssl_request.log
192.168.8.1 - - [13/Jun/2023:17:51:14 +0000] "GET /good HTTP/1.1" 404 22 "-" "curl/7.71.1-DEV" "-" 745
192.168.8.1 - - [13/Jun/2023:17:51:23 +0000] "GET /bad HTTP/1.1" 200 48 "-" "curl/7.71.1-DEV" "-" 807
When the status line is invalid, Apache interprets the reply as an HTTP/0.9 response, which means the entire response is treated as the body. It should probably not accept these by default.
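The fallback described above can be sketched roughly as follows. This is a simplified illustration in Python, not Apache's actual parsing code; the regular expression and the function name are assumptions made for the example.

```python
import re

# Simplified HTTP/1.x status line: version, three-digit code, optional reason.
STATUS_LINE = re.compile(rb"HTTP/\d\.\d (\d{3})(?: .*)?")


def classify_response(raw: bytes):
    """Return ("HTTP/1.x", code) for a valid status line, else ("HTTP/0.9", None)."""
    first_line = raw.split(b"\r\n", 1)[0]
    match = STATUS_LINE.fullmatch(first_line)
    if match is None:
        # No recognizable status line: the whole payload is treated as body.
        return ("HTTP/0.9", None)
    return ("HTTP/1.x", int(match.group(1)))


if __name__ == "__main__":
    good = b"HTTP/1.0 404 Not Found\r\n\r\nThis is the good page."
    bad = b"HTTP/1.0 b'404 Not Found'\r\n\r\nThis is the bad page."
    print(classify_response(good))
    print(classify_response(bad))
```

Under this sketch the bad response falls into the HTTP/0.9 branch because `b'4` appears where a three-digit status code is expected, which is consistent with the 200 status and the status line showing up in the body.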
I guess we need to have an option to deny HTTP/0.9 responses. I am not sure if we could disable them by default for compatibility reasons.