Bug 43327 - Socket bind fails on tomcat startup when using apr
Socket bind fails on tomcat startup when using apr
Status: RESOLVED FIXED
Product: Tomcat Native
Classification: Unclassified
Component: Library
unspecified
HP Linux
: P2 normal (vote)
: ---
Assigned To: Tomcat Developers Mailing List
:
Depends on:
Blocks:
  Show dependency tree
 
Reported: 2007-09-07 07:22 UTC by Erik Moller
Modified: 2014-04-15 20:12 UTC (History)
3 users (show)



Attachments
script to demonstrate problem (667 bytes, application/x-sh)
2008-05-29 22:58 UTC, David Pashley
Details
logfile of build using the script (85.79 KB, application/octet-stream)
2008-05-29 22:59 UTC, David Pashley
Details
Native patch (1.19 KB, patch)
2008-07-05 15:46 UTC, Mark Thomas
Details | Diff
APR Java connector patch (785 bytes, patch)
2008-07-05 15:47 UTC, Mark Thomas
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Erik Moller 2007-09-07 07:22:31 UTC
I'm running debian sid, kernel 2.6.22 and use tomcat 6.0.14 together with apr
1.2.11. no special tomcat configuration (out of the box).
On start up of tomcat I get the following error log:

Debugging apr revealed a problem with IPV6:
apr_socket_create() assigns a AF_INET6 socket,
apr_socket_bind tries to bind a AF_INET address which fails.

INFO: Loaded Apache Tomcat Native library 1.1.10.
Sep 7, 2007 6:05:16 PM org.apache.catalina.core.AprLifecycleListener init
INFO: APR capabilities: IPv6 [true], sendfile [true], accept filters [false],
random [true].
Sep 7, 2007 6:05:16 PM org.apache.coyote.http11.Http11AprProtocol init
SEVERE: Error initializing endpoint
java.lang.Exception: Socket bind failed: [22] Invalid argument
    at org.apache.tomcat.util.net.AprEndpoint.init(AprEndpoint.java:612)
    at org.apache.coyote.http11.Http11AprProtocol.init(Http11AprProtocol.java:121)
    at org.apache.catalina.connector.Connector.initialize(Connector.java:1059)
  
regards
erik
Comment 1 Mark Thomas 2008-04-30 15:39:53 UTC
I am having trouble reproducing this error. It may have already been fixed or it may be an issue with your build environment.

Please can you re-test with the latest APR, openssl, tomcat-native and 6.0.16 and, if you still see this error, provide the exact version numbers and commands you used to build so I can try and repeat this.
Comment 2 David Pashley 2008-05-29 08:43:02 UTC
I also get this problem.

Using: 

 * Tomcat 6.0.16
 * APR 1.2.11-1 from Ubuntu Hardy
 * OpenSSL 0.9.8g-4ubuntu3.1 from Hardy
 * tcnative 1.1.12

I'm attempting to package this for debian and ubuntu. You can fetch the build from my git repo at svn://femme.catnip.org.uk/tomcat6.git
Comment 3 David Pashley 2008-05-29 08:44:02 UTC
That should be git://femme.catnip.org.uk/tomcat6.git
Comment 4 David Pashley 2008-05-29 08:45:42 UTC
Suitable snippet from the strace log:

9749  socket(PF_INET6, SOCK_STREAM, IPPROTO_TCP) = 30
9749  setsockopt(30, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
9749  setsockopt(30, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
9749  bind(30, {sa_family=AF_INET, sin_port=htons(8180), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EINVAL (Invalid argument)
Comment 5 Mark Thomas 2008-05-29 16:29:24 UTC
I still can't reproduce this. I wonder if the some local changes in the Ubuntu packages are involved.

Can you provide step by step instructions to reproduce this using the distros from an Apache mirror rather than the Ubuntu versions. Also, what is the IP config of the machine you are using?
Comment 6 David Pashley 2008-05-29 22:12:40 UTC
I can attempt to build using the tarballs from upstream. Do you have IPv6 enabled on your box, as that appears to be the problem? If you disable IPv6 there isn't a problem. If you enable it, it attempts to create an IPv6 socket and then attempts to bind to an IPv4 address of '0.0.0.0'. 

I've checked the packaging of APR in hardy and there are no patches that would affect sockets. There is nothing in the tomcat packaging. 
Comment 7 David Pashley 2008-05-29 22:57:01 UTC
beebo david% ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:09:6b:42:b2:61 brd ff:ff:ff:ff:ff:ff
3: wifi0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ieee802.11 00:04:e2:63:83:ee brd ff:ff:ff:ff:ff:ff
4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue 
    link/ether 00:04:e2:63:83:ee brd ff:ff:ff:ff:ff:ff
    inet 10.1.0.6/24 brd 10.1.0.255 scope global eth1
    inet6 fe80::204:e2ff:fe63:83ee/64 scope link 
       valid_lft forever preferred_lft forever

I've attached a script I've used to replicate this bug and the build log. I'm not entirely sure why configure spits out errors. From a cursory glance at the code, it appears to iterate through the environment attempting to eval them all. It doesn't appear to result in any functional problems:

30-May-2008 05:52:43 org.apache.catalina.core.AprLifecycleListener init
INFO: The APR based Apache Tomcat Native library which allows optimal performance in production environments was not found on the java.library.path: /usr/lib/jvm/java-6-sun-1.6.0.06/jre/lib/i386/client:/usr/lib/jvm/java-6-sun-1.6.0.06/jre/lib/i386:/usr/lib/jvm/java-6-sun-1.6.0.06/jre/../lib/i386:/usr/java/packages/lib/i386:/lib:/usr/lib
30-May-2008 05:52:43 org.apache.coyote.http11.Http11Protocol init
INFO: Initializing Coyote HTTP/1.1 on http-8080
30-May-2008 05:52:43 org.apache.catalina.startup.Catalina load
INFO: Initialization processed in 1963 ms
30-May-2008 05:52:44 org.apache.catalina.core.StandardService start
INFO: Starting service Catalina
30-May-2008 05:52:44 org.apache.catalina.core.StandardEngine start
INFO: Starting Servlet Engine: Apache Tomcat/6.0.16
30-May-2008 05:52:45 org.apache.catalina.core.StandardContext addApplicationListener
INFO: The listener "listeners.ContextListener" is already configured for this context. The duplicate definition has been ignored.
30-May-2008 05:52:45 org.apache.catalina.core.StandardContext addApplicationListener
INFO: The listener "listeners.SessionListener" is already configured for this context. The duplicate definition has been ignored.
30-May-2008 05:52:46 org.apache.coyote.http11.Http11Protocol start
INFO: Starting Coyote HTTP/1.1 on http-8080
30-May-2008 05:52:46 org.apache.jk.common.ChannelSocket init
INFO: JK: ajp13 listening on /0.0.0.0:8009
30-May-2008 05:52:46 org.apache.jk.server.JkMain start
INFO: Jk running ID=0 time=0/204  config=null
30-May-2008 05:52:46 org.apache.catalina.startup.Catalina start
INFO: Server startup in 2795 ms
30-May-2008 05:52:52 org.apache.coyote.http11.Http11Protocol pause
INFO: Pausing Coyote HTTP/1.1 on http-8080
30-May-2008 05:52:53 org.apache.catalina.core.StandardService stop
INFO: Stopping service Catalina
30-May-2008 05:52:53 org.apache.coyote.http11.Http11Protocol destroy
INFO: Stopping Coyote HTTP/1.1 on http-8080
30-May-2008 05:52:56 org.apache.catalina.core.AprLifecycleListener init
INFO: Loaded APR based Apache Tomcat Native library 1.1.12.
30-May-2008 05:52:56 org.apache.catalina.core.AprLifecycleListener init
INFO: APR capabilities: IPv6 [true], sendfile [true], accept filters [false], random [true].
30-May-2008 05:53:04 org.apache.catalina.startup.Catalina stopServer
SEVERE: Catalina.stop: 
java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
	at java.net.Socket.connect(Socket.java:519)
	at java.net.Socket.connect(Socket.java:469)
	at java.net.Socket.<init>(Socket.java:366)
	at java.net.Socket.<init>(Socket.java:180)
	at org.apache.catalina.startup.Catalina.stopServer(Catalina.java:421)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.catalina.startup.Bootstrap.stopServer(Bootstrap.java:337)
	at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:415)
Comment 8 David Pashley 2008-05-29 22:58:04 UTC
Created attachment 22036 [details]
script to demonstrate problem
Comment 9 David Pashley 2008-05-29 22:59:53 UTC
Created attachment 22037 [details]
logfile of build using the script
Comment 10 Siegmund Gorr 2008-06-02 23:02:54 UTC
Hi, I could add another target environment with same problems, except that no SSL is being used.

Linux openSUSE-103-64-minimal 2.6.22.9-0.4-default #1 SMP 2007/10/05 21:32:04 UTC x86_64 x86_64 x86_64 GNU/Linux
Tomcat 6.0.14

It is pretty frigthening that a server help tool like jsvc prevents this lib from working. It looks that the design and integration of jsvc or APR has some side effects. I could not recommend any furhter production use of this combination.
Comment 11 Mark Thomas 2008-06-03 00:06:13 UTC
David,
IPv6 is enabled. I was using tcnative 1.1.13. It might be worth testing with that. Also, could you provide the contents of /etc/network/interfaces and the connector elements of your server.xml. Finally, are you starting with jsvc? Meanwhile, I'll double check I have IPv6 enabled.

Siegmund,
Where does jsvc fit into this? Are you saying you only see this issue when starting via jsvc but not when starting from the scripts?
Comment 12 David Pashley 2008-06-03 00:42:48 UTC
I did a diff from what is in tomcat6-6.0.16/bin/tcnative.tar.gz and tcnative-1.1.13 and the only differences were svn commit info in the comments, so there is no code difference what so ever.

I am running via jsvc:
root      3966  0.0  0.1   2056   364 ?        Ss   May30   0:00 /usr/bin/jsvc -user tomcat6 -cp /usr/share/java/commons-daemon.jar:/usr/share/tomcat6/bin/bootstrap.jar -outfile /var/lib/tomcat6/logs/catalina.out -errfile /usr/share/tomcat6/logs/catalina.err -pidfile /var/run/tomcat6.pid -Djava.awt.headless=true -Xmx128M -Djava.endorsed.dirs=/usr/share/tomcat6/common/endorsed -Dcatalina.base=/var/lib/tomcat6 -Dcatalina.home=/usr/share/tomcat6 -Djava.io.tmpdir=/var/lib/tomcat6/temp -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djava.util.logging.config.file=/var/lib/tomcat6/conf/logging.properties org.apache.catalina.startup.Bootstrap
tomcat6   3974  0.0 10.4 282728 27484 ?        Sl   May30   2:40 /usr/bin/jsvc -user tomcat6 -cp /usr/share/java/commons-daemon.jar:/usr/share/tomcat6/bin/bootstrap.jar -outfile /var/lib/tomcat6/logs/catalina.out -errfile /usr/share/tomcat6/logs/catalina.err -pidfile /var/run/tomcat6.pid -Djava.awt.headless=true -Xmx128M -Djava.endorsed.dirs=/usr/share/tomcat6/common/endorsed -Dcatalina.base=/var/lib/tomcat6 -Dcatalina.home=/usr/share/tomcat6 -Djava.io.tmpdir=/var/lib/tomcat6/temp -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djava.util.logging.config.file=/var/lib/tomcat6/conf/logging.properties org.apache.catalina.startup.Bootstrap


my interfaces file is:

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet dhcp
# post-up ethtool -K eth0 tx off

#
# The commented out line above will disable TCP checksumming which
# might resolve problems for some users.  It is disabled by default
#

results in:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether aa:00:0a:00:02:3b brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.59/16 brd 10.0.255.255 scope global eth0
    inet6 fe80::a800:aff:fe00:23b/64 scope link 
       valid_lft forever preferred_lft forever



My only enabled connectors are:

    <!-- Define a non-SSL HTTP/1.1 Connector on port 8180 -->
    <Connector port="8180" maxHttpHeaderSize="8192"
               maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
               enableLookups="false" redirectPort="8443" acceptCount="100"
               connectionTimeout="20000" disableUploadTimeout="true" />
    <!-- Define an AJP 1.3 Connector on port 8009 -->
    <Connector port="8009" 
               enableLookups="false" redirectPort="8443" protocol="AJP/1.3" />
Comment 13 Siegmund Gorr 2008-06-03 12:55:11 UTC
Update on JSVC and APR

Involving debugging I found that my problem with JSVC and APR could be drilled down to SSL library access I did not found the reason why it hangs - may be the privileges are not sufficient after switching from root to the www-user. After a time period (on my three servers approx. 6 min. the startup suddenly continues - like a timeout has been reached.

My problems could be resolved by deactivating ssl in the tomcat apr configuration which might be a hint for other server operators as well. 
We are handling the ssl separat and do not pass the data via the Apache mod_jk module to the AJP handler (which is been boosted by APR).
So configuration in httpd.conf is

JkOptions +ForwardkeySize +ForwardURICompat -ForwardDirectories
JkExtractSSL off
JkRequestLogFormat "%w %V %T"

And whithin server.xml
<Listener className="org.apache.catalina.core.AprLifecycleListener"
  SSLEngine="off" />
 
Sorry but I did not find the reason why this problems is not existing when we start tomcat without jsvc.

Comment 14 Mark Thomas 2008-07-04 14:06:16 UTC
Bingo! I can now reproduce this. Hopefully a fix won't be too far away...
Comment 15 Mark Thomas 2008-07-05 15:40:46 UTC
OK. There are two issues here.

The first is that the APR calls eventually called as a result of Address.info() and Socket.create() in o.a.t.util.net.AprEndpoint use slightly different logic to determine whether to create an IPv4 socket or an IPv6 socket. If you have an IPv6 capable interface that does not have an explicit IPv6 address defined these functions will assume different IP families.

The second issue is that whilst the first issue could be fixed by using Address.getInfo().family to specify that the family of the address should be used to use to create the socket, Address.getInfo() does not convert the APR_INET and APR_INET6 constants back into the ones defined in Socket (the constants are different in the Java code and in APR).

So, I have two patches. The first converts the constants back to the Java defined values and the second uses the family from the address to create the socket. Patches to follow.
Comment 16 Mark Thomas 2008-07-05 15:46:22 UTC
Created attachment 22216 [details]
Native patch

*Health Warning*

I am no C programmer. This patch fixes the problem for me but it needs to be checked by someone who actually knows what they are doing.
Comment 17 Mark Thomas 2008-07-05 15:47:22 UTC
Created attachment 22217 [details]
APR Java connector patch

This needs the native patch to work.
Comment 18 Remy Maucherat 2008-08-19 06:56:16 UTC
I (finally) tested it on my development machine which has both a NAT IPv4 and an IPv6 address, and it does not appear to work any better.

No address attribute -> error 22
address="127.0.0.1"-> error 22
address="myIPv6"-> no error (and ot correctly)
Comment 19 Remy Maucherat 2008-08-19 10:26:45 UTC
Ok, so I was testing the old version :(

The new testing result is:
- no address attribute -> slow bind (20s at least), and does not bind the IPv6
- address="127.0.0.1"-> works
- address="myIPv6"-> works

So it's a lot better, but not glitch free yet.
Comment 20 William A. Rowe Jr. 2008-08-19 10:57:47 UTC
Is this true on Linux, Remy?  Or another OS.  Solaris, for example, is known
to not tolerate sharing an ipv4+6 listener very well.

Essentially, any OS in which the ipv4 stack is built of different driver
components than the ipv6 stack gives headaches attempting a single listener.
For all of these cases, two listening sockets, one explicitly IPv4 and the
other IPv6 is preferable.
Comment 21 Remy Maucherat 2008-08-19 11:03:58 UTC
I am testing on Linux.
Comment 22 Remy Maucherat 2008-08-22 01:46:43 UTC
With the latest changes, this now works :) Really tricky to get right ...
Comment 23 jfclere 2008-08-27 00:47:34 UTC
Fixed in tc-native and trunk proposed for 6.0.x
Comment 24 Andrea F M 2010-09-17 00:37:16 UTC
how I can add the patch