Bug 57955 - Be able to control the TCP keepalive idle time
Summary: Be able to control the TCP keepalive idle time
Status: NEW
Alias: None
Product: APR
Classification: Unclassified
Component: APR
Version: 1.4.8
Hardware: All All
Importance: P2 enhancement
Target Milestone: ---
Assignee: Apache Portable Runtime bugs mailinglist
 
Reported: 2015-05-27 13:59 UTC by Diego Santa Cruz
Modified: 2015-05-28 07:59 UTC

Attachments
Implementation of APR_TCP_KEEPALIVE_IDLE (6.68 KB, text/plain)
2015-05-27 13:59 UTC, Diego Santa Cruz

Description Diego Santa Cruz 2015-05-27 13:59:49 UTC
Created attachment 32759
Implementation of APR_TCP_KEEPALIVE_IDLE

APR currently allows enabling keepalive on sockets via APR_SO_KEEPALIVE.

However, there is no way to control the idle time for TCP.

We have implemented a patch that adds an extra option (APR_TCP_KEEPALIVE_IDLE) to control the keepalive idle time (in seconds). Setting it to a non-zero value implicitly enables APR_SO_KEEPALIVE and setting it to zero implicitly disables APR_SO_KEEPALIVE.

For simplicity's sake the interval time is always set to 1 second.
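
As a rough illustration of the semantics described above, here is a minimal sketch on a plain Linux socket descriptor. It is not the attached patch and does not use the APR API; the helper name `set_keepalive_idle` is hypothetical, and treating negative values as an error is an assumption of this sketch.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper (not the attached patch): mimics the proposed
 * APR_TCP_KEEPALIVE_IDLE semantics on a raw Linux socket descriptor.
 * idle_secs > 0 enables keepalive with that idle time,
 * idle_secs == 0 disables keepalive. Returns 0 on success, -1 on error. */
static int set_keepalive_idle(int fd, int idle_secs)
{
    int on;
    int interval = 1; /* probe interval fixed to 1 second, as described */

    if (idle_secs < 0)
        return -1; /* assumption: negative idle times are rejected */

    /* A non-zero idle time implicitly enables SO_KEEPALIVE, zero disables it. */
    on = (idle_secs > 0);
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) != 0)
        return -1;
    if (!on)
        return 0;

    /* Idle time before the first probe, in seconds. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof(idle_secs)) != 0)
        return -1;

    /* Interval between successive probes, in seconds. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &interval, sizeof(interval)) != 0)
        return -1;

    return 0;
}
```

Calling `set_keepalive_idle(fd, 30)` on a TCP socket would start probing after 30 idle seconds; calling it with 0 turns keepalive off again.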

The patch is against APR 1.4.8, but I hope it applies as is to APR 1.5.2 and trunk. This implementation has been tested on Linux and Win32.

Is this something that could be accepted?
Comment 1 Jeff Trawick 2015-05-27 14:57:21 UTC
Generally speaking, this sounds like an acceptable feature, though I haven't checked for portability of the concept beyond what you mentioned.  We need to give people a chance to make sure it is working on their platform, and it can't be in an APR release until 1.6 or 2.0 anyway.  (No new features within a particular stable branch.)
Comment 2 Diego Santa Cruz 2015-05-28 07:59:04 UTC
Unfortunately, the TCP keepalive settings are quite platform-specific, even among Unix variants.

On Linux the necessary socket options have existed since 2.4 and there are 3 tunables (see http://linux.die.net/man/7/tcp):

 - the idle time (TCP_KEEPIDLE): time the connection needs to be idle to start sending keep-alive probes, 2 hours by default

 - the time interval (TCP_KEEPINTVL): the interval between successive probes after the idle time, default is 75 seconds

 - the count (TCP_KEEPCNT): the number of probes to send before concluding the connection is down, default is 9
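
To check which values are in effect for a given socket on Linux, the three tunables listed above can be read back with getsockopt(). The following is only a sketch: the struct and function names are hypothetical, and the values returned on a fresh socket depend on the system-wide sysctl settings, so they may differ from the defaults quoted above.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical container for the three Linux keepalive tunables. */
struct keepalive_tunables {
    int idle_secs;     /* TCP_KEEPIDLE: idle time before the first probe */
    int interval_secs; /* TCP_KEEPINTVL: time between successive probes */
    int probe_count;   /* TCP_KEEPCNT: probes sent before giving up */
};

/* Reads the keepalive tunables currently in effect for a TCP socket.
 * Returns 0 on success, -1 if any getsockopt() call fails. */
static int read_keepalive_tunables(int fd, struct keepalive_tunables *out)
{
    socklen_t len;

    len = sizeof(out->idle_secs);
    if (getsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &out->idle_secs, &len) != 0)
        return -1;

    len = sizeof(out->interval_secs);
    if (getsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &out->interval_secs, &len) != 0)
        return -1;

    len = sizeof(out->probe_count);
    if (getsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
                   &out->probe_count, &len) != 0)
        return -1;

    return 0;
}
```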


On Windows the SIO_KEEPALIVE_VALS IOCTL is supported on Windows 2000 and later. There are only 2 tunables (see https://msdn.microsoft.com/en-us/library/windows/desktop/dd877220%28v=vs.85%29.aspx).

 - the idle time (default is 2 hours) and

 - the interval (default is 1 second).

The count is fixed at 10 in Vista and later and is settable system-wide on previous Windows versions via the registry.

I did a bit of research for other OSes and these are my findings (the information is not that easy to gather).

For Solaris I've found the info at http://docs.oracle.com/cd/E23824_01/html/821-1475/tcp-7p.html (see also https://docs.oracle.com/cd/E19120-01/open.solaris/819-2724/fsvdg/index.html and https://docs.oracle.com/cd/E19120-01/open.solaris/819-2724/fsvdh/index.html). There are two tunables:

 - TCP_KEEPALIVE_THRESHOLD: this is the same as Linux's idle time, but in milliseconds. It defaults to 2 hours.

 - TCP_KEEPALIVE_ABORT_THRESHOLD: this is equivalent to Linux's TCP_KEEPINTVL multiplied by TCP_KEEPCNT, but in milliseconds. It defaults to 8 minutes.
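
The relation between the Linux-style and Solaris-style tunables is plain arithmetic, sketched below. The helper names are hypothetical and nothing here calls into Solaris; it only shows the unit conversion an implementation would need.

```c
/* Hypothetical helper: Linux-style idle time in seconds to a
 * Solaris-style TCP_KEEPALIVE_THRESHOLD value in milliseconds. */
static long keepalive_threshold_ms(int idle_secs)
{
    return (long)idle_secs * 1000L;
}

/* Hypothetical helper: Linux-style probe interval (seconds) and probe
 * count to a Solaris-style TCP_KEEPALIVE_ABORT_THRESHOLD value in
 * milliseconds, i.e. interval multiplied by count. */
static long abort_threshold_ms(int interval_secs, int probe_count)
{
    return (long)interval_secs * (long)probe_count * 1000L;
}
```

With the Linux defaults quoted above (75 seconds, 9 probes) this gives 675000 ms, or 11.25 minutes, compared with the 8 minute Solaris default.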

FreeBSD uses the same tunables as Linux, with the same names (see https://www.freebsd.org/cgi/man.cgi?query=tcp&sektion=4&apropos=0&manpath=FreeBSD+10.1-RELEASE), so the patch should work for FreeBSD as well.

For Mac OS X there is only one tunable, TCP_KEEPALIVE, which sets the idle time (see https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man4/tcp.4.html).

Note that in my patch the value used for the interval is set to 1 second on Linux and Windows, following the Windows default, which at least in our application is a better match than the longer default timeouts of Linux and Solaris.

It seems most OSes that implement TCP keepalive have at least the notion of the idle time (when they implement per-socket settings), which is in line with RFC 1122. Other options are "implementation details".
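
Since the idle time is the common denominator, a portable setter could pick the platform's option at compile time. The sketch below is an assumption-laden illustration, not the attached patch: the function name is hypothetical, only the TCP_KEEPIDLE branch corresponds to what was reported as tested, the Mac OS X and Solaris branches are untested, and the Solaris option value type is assumed to be an unsigned int. It does not enable SO_KEEPALIVE itself.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: sets only the keepalive idle time, using whichever
 * per-socket option the platform headers provide.
 * Returns 0 on success, -1 on error (errno is set). */
static int set_tcp_idle_time(int fd, int idle_secs)
{
#if defined(TCP_KEEPIDLE)
    /* Linux and FreeBSD: value in seconds. */
    return setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                      &idle_secs, sizeof(idle_secs));
#elif defined(__APPLE__) && defined(TCP_KEEPALIVE)
    /* Mac OS X (untested here): value in seconds. */
    return setsockopt(fd, IPPROTO_TCP, TCP_KEEPALIVE,
                      &idle_secs, sizeof(idle_secs));
#elif defined(TCP_KEEPALIVE_THRESHOLD)
    /* Solaris (untested here): value in milliseconds. */
    unsigned int ms = (unsigned int)idle_secs * 1000u;
    return setsockopt(fd, IPPROTO_TCP, TCP_KEEPALIVE_THRESHOLD,
                      &ms, sizeof(ms));
#else
    /* No known per-socket idle time option on this platform. */
    (void)fd;
    (void)idle_secs;
    errno = ENOPROTOOPT;
    return -1;
#endif
}
```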

The patch has been extensively tested on Linux 2.6 and Windows (XP and later) and has been in production in our products for quite some time. The patch should work for FreeBSD as it uses the same socket options as Linux (but I do not have a FreeBSD machine to test it).

For Mac OS X and Solaris the patch would need to be extended. I do not have any Mac OS X or Solaris machines on which to test that, or even to compile it.