Bug 60681 - Increase severity of "request failed: error reading the headers"
Summary: Increase severity of "request failed: error reading the headers"
Status: NEW
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: Core (show other bugs)
Version: 2.4-HEAD
Hardware: PC All
: P2 major (vote)
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-02-02 16:52 UTC by Ben RUBSON
Modified: 2023-05-17 01:59 UTC (History)
0 users



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Ben RUBSON 2017-02-02 16:52:19 UTC
Hello,

in server/protocol.c :
ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO(00567)
              "request failed: error reading the headers");

Would be nice if this message could be APLOG_ERR (as in Debian package), or at least APLOG_NOTICE / APLOG_WARNING.

We would then easily have these messages in the error log, without having to set LogLevel to debug.
Goal is then to ban (with fail2ban) hosts generating too frequently these messages. For example, a slowloris attacker will generate such messages.

Thank you very much !

Ben
Comment 1 Ben RUBSON 2017-02-09 15:56:31 UTC
Patch submitted :
https://github.com/apache/httpd/pull/25
Thank you !
Comment 2 Eric Covener 2017-02-09 16:11:53 UTC
I don't think we should just flip the severity back, need a way to opt-in or if the default changes again, an easy way to suppress messages by prefix/ID.
Comment 3 Ben RUBSON 2017-02-09 18:11:56 UTC
Thank you for your feedback Eric.

> I don't think we should just flip the severity back,
Do you mean that this error was already set to an error level different than DEBUG in the past ?
Comment 4 Eric Covener 2017-02-09 18:34:28 UTC
(In reply to Ben RUBSON from comment #3)
> Thank you for your feedback Eric.
> 
> > I don't think we should just flip the severity back,
> Do you mean that this error was already set to an error level different than
> DEBUG in the past ?

That was my recollection -- just confirmed it was error level in 2.0.x.
Comment 5 Ben RUBSON 2017-02-09 18:47:05 UTC
OK, thank you for your confirmation :)

From my point of view, re-increasing at least this one would be justified, as it allows to detect slowloris attacks.
Apache 2.2 was released in 2005, slowloris attack invented in 2009.
So in 2005 I think it surely sounded good for this error to be set to DEBUG level.
But today I think we might think about raising it again.

I open this bug "only" today, as for now I was using the Debian package, in which this message is set to ERROR level.
Using the (non-modified) compiled package on another system made me raise this case.

Thank you !
Comment 6 Jacob Champion 2017-02-09 18:53:15 UTC
(In reply to Eric Covener from comment #2)
> I don't think we should just flip the severity back, need a way to opt-in or
> if the default changes again, an easy way to suppress messages by prefix/ID.

Hmm, something like a `#pragma diagnostic` would be nice...
Comment 7 William A. Rowe Jr. 2017-02-09 21:38:23 UTC
Flipping this back is the wrong course of action.

The purpose of the error log is to alert and advise the operator or content
maintainers of the server of those conditions under their control which can
be fixed to improve delivery of content. This can include not-found requests
(which might reflect now-missing content or bad redirects between content),
failing misdirected back-end proxy requests, server load fault conditions
and other problem cases that demand the operator's attention.

No malformed request can be addressed by the server operator, these represent
an error by the client/user agent which need to be addressed by the user.
In some cases the user is the operator, which is why all of the malformed
requests can be more carefully examined at core loglevel debug.

The specific feature request noted by reporter is that some administrators 
would like to scrape log files for invalid requests, in order to add those 
remote ip addresses to lower-level ban lists. (There is no efficiency to
adding those ip addresses to any application-level httpd ban module because 
these errors are dispatched almost immediately before the request processing
phases begin.)

This request is already possible today with no change to httpd by scraping
the access log for any 400 responses. All 400 responses indicate a malformed
request, whether this is a malformed request line or header line, or some
invalid header values.

Other client-originated errors, such as not-found, auth required, etc will
all fall into an error code >400 <500.

For all httpd operators who were not using the error log to effect some
ban list, changing the severity will simply generate unactionable, and
therefore wasteful log file bytes.

I'd vote WONTFIX.
Comment 8 William A. Rowe Jr. 2017-02-09 23:01:11 UTC
Actually, let me reframe that. Leaving a breadcrumb of a faulty request
which does *not* hit the access.log file is sensible.

We do need a patch which alters the severity (to APLOG_NOTICE, imo) only of
these no-access-record error cases. It should not exceed the severity of
our other historical timeout errors. Based on other failures during the
header line processing, I believe the patch we are looking for is an actual
400-error response in the case of timeout reading header lines, just as we
have an immediate 400-error response in the case of too many header lines
and header line length limit exceeded. This would be captured in the access
log.

Historically, there is no breadcrumb for connecting and waiting for timeout
force disconnect, and there was no such error record in 2.2.31.

Based on the historical pattern of ignoring any HTTP:
  connect - no request - timeout - loop
vs [error] level report of slowloris 
  connect - trickle request - timeout (or rather, max lines etc) - loop

HTTP Socket Connect:
no record

HTTP Timeout before a header line is read:
no record

HTTP Timeout after header line is read:
[Thu Feb 09 16:16:31 2017] [error] [client 127.0.0.1] request failed: error reading the headers

HTTP Header limits exceeded (slowloris pattern):
Redundant recording, both error and access log entries;
[Thu Feb 09 16:55:27 2017] [error] [client 127.0.0.1] request failed: error reading the headers
127.0.0.1 - - [09/Feb/2017:16:52:20 -0600] "GET / HTTP/1.1" 400 290

HTTP Timeout after header lines complete, before C-L request body:
no error log record, access log records:
127.0.0.1 - - [09/Feb/2017:16:32:20 -0600] "GET / HTTP/1.1" 400 226

HTTP Timeout after header lines complete, before T-E:chunked header:
no error log record, access log records:
127.0.0.1 - - [09/Feb/2017:16:41:22 -0600] "GET / HTTP/1.1" 400 226

HTTP Timeout after header lines complete, before T-E:chunked content:
no error log record, access log records:
127.0.0.1 - - [09/Feb/2017:16:41:22 -0600] "GET / HTTP/1.1" 400 226

HTTP Timeout on keepalive:
no record


HTTPS Socket Connect:
[Thu Feb 09 16:21:44 2017] [info] [client 127.0.0.1] Connection to child 192 established (server hub.wrowe.net:443)
[Thu Feb 09 16:21:44 2017] [info] Seeding PRNG with 656 bytes of entropy

HTTPS Handshake Timeout:
[Thu Feb 09 16:21:59 2017] [info] [client 127.0.0.1] (70007)The timeout specified has expired: SSL handshake interrupted by system [Hint: Stop button pressed in browser?!]
[Thu Feb 09 16:21:59 2017] [info] [client 127.0.0.1] Connection closed to child 192 with abortive shutdown (server hub.wrowe.net:443)

HTTPS Handshake Completed:
[Thu Feb 09 16:24:53 2017] [info] Connection: Client IP: 127.0.0.1, Protocol: TLSv1.2, Cipher: ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)

HTTPS Timeout after handshake, before header line is read:
[Thu Feb 09 16:25:08 2017] [info] [client 127.0.0.1] (70007)The timeout specified has expired: SSL input filter read failed.
[Thu Feb 09 16:25:08 2017] [info] [client 127.0.0.1] Connection closed to child 2 with standard shutdown (server hub.wrowe.net:443)

HTTPS Timeout after handshake, after header line is read:
[Thu Feb 09 16:27:02 2017] [info] [client 127.0.0.1] (70007)The timeout specified has expired: SSL input filter read failed.
[Thu Feb 09 16:27:02 2017] [error] [client 127.0.0.1] request failed: error reading the headers
[Thu Feb 09 16:27:02 2017] [info] [client 127.0.0.1] Connection closed to child 128 with standard shutdown (server hub.wrowe.net:443)

HTTPS Keepalive Timeout after initial request:
[Thu Feb 09 16:28:21 2017] [info] Initial (No.1) HTTPS request received for child 68 (server hub.wrowe.net:443)
[Thu Feb 09 16:28:26 2017] [info] [client 127.0.0.1] (70007)The timeout specified has expired: SSL input filter read failed.
[Thu Feb 09 16:28:26 2017] [info] [client 127.0.0.1] Connection closed to child 68 with standard shutdown (server hub.wrowe.net:443)
Comment 9 William A. Rowe Jr. 2017-02-09 23:03:46 UTC
Sorry for confusion, I should have stated 'request line', not 'header line':

HTTP Timeout before a request line is read:
no record

HTTP Timeout after request line is read, before first header line:
[Thu Feb 09 16:16:31 2017] [error] [client 127.0.0.1] request failed: error reading the headers
Comment 10 William A. Rowe Jr. 2017-02-10 00:30:52 UTC
Sorry for confusion, I should have stated 'request line', not 'header line':

HTTP Connect:
no record

HTTP Timeout before the request line is read:
no record

HTTP Timeout after request line is read, before all header lines are read:
[Thu Feb 09 16:16:31 2017] [error] [client 127.0.0.1] request failed: error reading the headers


With mod_reqtimeout loaded, these events are changed;

HTTP reqtimeout Connect:
no record

HTTP reqtimeout before the request line is read:
Redundant recording, both error (level [info]) and access log entries;
[Thu Feb 09 17:11:26 2017] [info] [client 127.0.0.1] Request header read timeout
127.0.0.1 - - [09/Feb/2017:17:11:26 -0600] "-" 408 -
[No 408 error message is transmitted to the client]

HTTP reqtimeout after request line is read, before all header lines are read:
Redundant recording, both error (level [error]!) and access log entries;
[Thu Feb 09 17:13:54 2017] [info] [client 127.0.0.1] Request header read timeout
[Thu Feb 09 17:13:54 2017] [error] [client 127.0.0.1] request failed: error reading the headers
127.0.0.1 - - [09/Feb/2017:17:13:38 -0600] "GET / HTTP/1.1" 408 223

HTTP reqtimeout after header lines complete, before C-L request body:
Redundant recording, both error and access log entries;
[Thu Feb 09 18:22:55 2017] [info] [client 127.0.0.1] Request body read timeout
127.0.0.1 - - [09/Feb/2017:18:22:34 -0600] "GET / HTTP/1.1" 400 226

HTTP reqtimeout after header lines complete, before T-E:chunked header:
Redundant recording, both error and access log entries;
[Thu Feb 09 18:19:41 2017] [info] [client 127.0.0.1] Request body read timeout
127.0.0.1 - - [09/Feb/2017:18:19:20 -0600] "HEAD / HTTP/1.1" 400 -

HTTP reqtimeout after header lines complete, before T-E:chunked content:
Redundant recording, both error and access log entries;
127.0.0.1 - - [09/Feb/2017:18:26:37 -0600] "GET / HTTP/1.1" 400 226
[Thu Feb 09 18:26:47 2017] [info] [client 127.0.0.1] Request body read timeout

HTTP reqtimeout on keepalive:
no record


Here again, an [error] level was too loud and redundant. The [info] level
messages are in-line with the historical mod_ssl [info] reporting.

This suggests to me that the [info] alerts should be emitted from the core
non-reqtimeout read failure paths, and that the explicit 408, transmitted
or not, should be the core reaction to timeout before the request has been
read and during the body read timeout. The [error] alert for the one and
only one failure path once a 408 is recorded is incorrect. Timeouts reading
the body should also be 408.

The fail2ban filter should be applied for all 408 responses, be that the
first request line timeout, all timeouts of header line timeouts and all 
header body timeouts, along with all 400 responses, which records all
"illegible" and "too slow" cases which may likely indicate an abuse pattern.
Some reasonable threshold needs to be applied, so that the remote cellular
or other user with intermittent connectivity does not trip this filter too
easily. Given that any web request may consist of as many as 2-4 parallel
http connections to gather the primary request and tangential resources
(gif/jpg/css/js etc), some limit such as 10 400+408 requests combined would 
appear to indicate abusive traffic.
Comment 11 Ben RUBSON 2017-02-10 08:10:38 UTC
William, thank you very much for your deep technical analysis !

> We do need a patch which alters the severity (to APLOG_NOTICE, imo)
I would also be glad to have [notice] level messages in the ErrorLog for these failure cases.

Fail2ban already has an "apache-overflows" jail, which reads the ErrorLog.
It would then really be convenient (and logical) to add to this existing jail a new rule to catch "request failed: error reading the headers" (and other faulty timeout) messages.
This would avoid adding a new jail, reading a new file (the AccessLog), which is also much bigger to compute than the ErrorLog.

IMO, [notice] level would be the right one to use, as it is still acceptable on a production environment. [info] & [debug] levels are really too verbose.

Thank you very much !
Comment 12 William A. Rowe Jr. 2017-02-15 14:52:33 UTC
"Fail2ban already has an "apache-overflows" jail, which reads the ErrorLog.
It would then really be convenient (and logical) to add to this existing jail a
new rule to catch "request failed: error reading the headers" (and other faulty
timeout) messages.
This would avoid adding a new jail, reading a new file (the AccessLog), which
is also much bigger to compute than the ErrorLog."

Be aware that there are already a number of 400/408 exceptions that will only
appear in the access log today by default at LogLevel Info, and that appears
to be true of the earlier revisions. So the suggestion to ignore the access 
log isn't really viable. I am still in the process of cross-tabulating all of
the previous and current behaviors.

I also made a substantial mistake about mod_reqtimeout; using telnet - the
terminal type handshake occurred - using nc or direct socket access and writing
no bytes, mod_reqtimeout does not log an error in the access log. Only after
some bytes are read or written will it log this faux-error access entry with
no error response to the client.
Comment 13 William A. Rowe Jr. 2017-02-15 14:52:33 UTC
"Fail2ban already has an "apache-overflows" jail, which reads the ErrorLog.
It would then really be convenient (and logical) to add to this existing jail a
new rule to catch "request failed: error reading the headers" (and other faulty
timeout) messages.
This would avoid adding a new jail, reading a new file (the AccessLog), which
is also much bigger to compute than the ErrorLog."

Be aware that there are already a number of 400/408 exceptions that will only
appear in the access log today by default at LogLevel Info, and that appears
to be true of the earlier revisions. So the suggestion to ignore the access 
log isn't really viable. I am still in the process of cross-tabulating all of
the previous and current behaviors.

I also made a substantial mistake about mod_reqtimeout; using telnet - the
terminal type handshake occurred - using nc or direct socket access and writing
no bytes, mod_reqtimeout does not log an error in the access log. Only after
some bytes are read or written will it log this faux-error access entry with
no error response to the client.
Comment 14 Ben RUBSON 2018-01-16 10:12:55 UTC
Hi, Eric, Jacob, William, and others :)
Any progress on this ?
Thank you again !
Ben