Bug 27134 - mod_ldap/util_ldap blindly rebind connection in checkuserid
Summary: mod_ldap/util_ldap blindly rebind connection in checkuserid
Status: CLOSED DUPLICATE of bug 27748
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mod_ldap
Version: 2.0.48
Hardware: All All
Importance: P3 major with 1 vote
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords: PatchAvailable
Depends on:
Blocks:
 
Reported: 2004-02-21 14:17 UTC by Denis Gervalle
Modified: 2004-11-16 19:05 UTC

Attachments
Patch using option one in the solution explained above (1.33 KB, patch)
2004-02-21 14:23 UTC, Denis Gervalle
Cumulative patch that fixes the issue of the increasing connections to the LDAP server described above (3.56 KB, patch)
2004-04-20 22:09 UTC, Denis Gervalle

Description Denis Gervalle 2004-02-21 14:17:13 UTC
*** Overview ***
In mod_ldap (util_ldap.c), during util_ldap_cache_checkuserid(), an ldap
connection may be rebound to the dn of the user being checked while the
connection cache still records it as bound to its original dn or as anonymous.
Reusing such a connection may later lead to unexpected authentication failures.
The problem is particularly annoying with an ldap server that refuses anonymous
connections, or on which users have no right to read other users' entries. It
may also be a security problem, since the connection pool may contain bound
connections that are marked as anonymous.

*** Symptoms ***
The problem is usually reported as valid credentials no longer being accepted
after a previous authentication, whether that one failed or succeeded. Usually
the following or a similar error is logged:

auth_ldap authenticate: user <username> authentication failed; URI <URI> [User
not found][No such object]

*** Test case ***
To trigger the problem easily, configure an ldap server that gives access to
user entries only over a non-anonymous bound connection. Configure users to
have no access to other users' entries (you are now in the worst case). Use
mod_auth_ldap with the following configuration on a given URI:

    AuthType Basic
    AuthName "LDAP authentication"
    AuthLDAPEnabled on
    AuthLDAPBindDN <dn of the only ldap account that has access to all user entries>
    AuthLDAPBindPassword <password for this account>
    AuthLDAPURL <appropriate ldap url>
    AuthLDAPAuthoritative on

Then try to access this URI using both good and wrong credentials. Access to
this URI will quickly become erratic, and the message above will be reported
for valid users with good as well as bad passwords.

*** Explanation ***
Here are the util_ldap-related steps involved in a mod_auth_ldap authentication:

1) mod_auth_ldap_check_user_id() gets called to check user authentication.
2) mod_auth_ldap_check_user_id() retrieves a cached connection from
util_ldap_connection_find().
3) util_ldap_connection_find() searches for a connection matching the host,
port and binddn/bindpw requested by its caller, which has taken these from your
httpd configuration.
4) If a matching connection is found, it is returned as is; otherwise a
non-matching connection is marked as unbound and returned.
5) mod_auth_ldap_check_user_id() passes the retrieved connection to
util_ldap_cache_checkuserid() along with the username/password provided by the
user and the filter provided in your configuration.
6) util_ldap_cache_checkuserid() checks the user cache for a previous successful
authentication of this user. If one is found, the ldap connection is not used.
7) If none is found, the ldap connection is opened using
util_ldap_connection_open(). This binds the connection with the binddn/bindpw
previously stored by util_ldap_connection_find() and marks it as bound, but
only if it is currently known to be unbound; otherwise it does nothing!
8) The ldap server is searched for the dn of the user, based on the provided
filter.
9) If one and only one record is returned, its dn is retrieved.
10) The connection is rebound to the user dn with the user-supplied password to
find out whether the password is correct. This is done with a direct call to
the ldap api function ldap_simple_bind_s(). The "known to be bound" flag and
the binddn of the util_ldap connection structure used for connection caching
are not updated by this call, which leads to the problem.
11) Later, on return from util_ldap_cache_checkuserid(),
mod_auth_ldap_check_user_id() releases the connection to the cache as is, using
a misnamed function called util_ldap_connection_close().
12) Only if an error has been reported and that error is LDAP_SERVER_DOWN is
the connection unbound (and some retries done), which avoids the problem when
no server has answered (funny, no?).

As you will have understood, in step 10 the connection is rebound to another
user without the cache information being updated, and in step 11 this rebound
connection is released to the cache.
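
The sequence in steps 7 to 11 can be modelled in a few lines of C. This is only
a sketch under stated assumptions: mock_conn and the mock_* functions are
hypothetical stand-ins invented for illustration, not the real util_ldap
structures or API, and the "server" is reduced to a single field that records
which dn the connection is really bound to.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a cached connection. */
typedef struct {
    const char *cached_binddn;  /* dn the cache believes is bound */
    int         cached_bound;   /* the cache's "known to be bound" flag */
    const char *actual_binddn;  /* dn the server really sees (NULL = none) */
} mock_conn;

/* Models step 7: bind only when the cache says the connection is unbound. */
static void mock_connection_open(mock_conn *c)
{
    assert(c->cached_binddn != NULL);  /* the model ignores anonymous binds */
    if (!c->cached_bound) {
        c->actual_binddn = c->cached_binddn;
        c->cached_bound = 1;
    }
}

/* Models step 10: a direct rebind that bypasses the cache bookkeeping. */
static void mock_direct_rebind(mock_conn *c, const char *user_dn)
{
    c->actual_binddn = user_dn;  /* cached_* fields deliberately untouched */
}

/* Returns 1 when the cache's view matches what the server sees. */
static int mock_cache_is_consistent(const mock_conn *c)
{
    if (!c->cached_bound)
        return 1;  /* the next open will rebind, so nothing can go stale */
    return c->actual_binddn != NULL
        && strcmp(c->cached_binddn, c->actual_binddn) == 0;
}
```

In this model, calling mock_connection_open() and then mock_direct_rebind()
leaves mock_cache_is_consistent() returning 0, and a second
mock_connection_open() does nothing because the flag still says "bound". The
next search would therefore run with the previous user's identity, which is
the failure described above.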

*** Solution ***
To solve this issue, there are two options:
1) Synchronize the cache information of the provided connection with its newly
required binding: set it to unbound and use util_ldap_connection_open() to
rebind the connection properly, which ensures correct util_ldap cache usage.
2) Retrieve another connection from the cache for the user authentication using
util_ldap_connection_find().

Choosing between these options means choosing between keeping the 'search user'
connection bound to the AuthLDAPBindDN and using only one connection for
authentication.

I have seen a patch that uses option two, but I am afraid that this patch does
not properly release the first connection to the cache using
util_ldap_connection_close.
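
Option one can be sketched as follows. This is an illustrative model and not
the attached patch: mock_conn, mock_server_bind() and the other mock_*
functions are hypothetical, the account table is made up, and the assumption
that a failed bind leaves the connection unauthenticated is part of the model.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a cached connection and its bookkeeping. */
typedef struct {
    const char *binddn;           /* credentials recorded by the cache */
    const char *bindpw;
    int         bound;            /* the cache's "known to be bound" flag */
    const char *server_identity;  /* what the mock server sees (NULL = none) */
} mock_conn;

/* Mock server-side bind against a made-up account table. */
static int mock_server_bind(mock_conn *c, const char *dn, const char *pw)
{
    static const char *accounts[][2] = {
        { "cn=search,dc=example,dc=com", "searchpw" },
        { "uid=alice,dc=example,dc=com", "alicepw" },
    };
    size_t i;

    for (i = 0; i < sizeof accounts / sizeof accounts[0]; i++) {
        if (strcmp(accounts[i][0], dn) == 0 && strcmp(accounts[i][1], pw) == 0) {
            c->server_identity = accounts[i][0];
            return 0;
        }
    }
    c->server_identity = NULL;  /* model assumption: failed bind = no identity */
    return -1;
}

/* Bind with the recorded credentials only when the cache says "unbound". */
static int mock_connection_open(mock_conn *c)
{
    if (c->bound)
        return 0;
    if (mock_server_bind(c, c->binddn, c->bindpw) != 0)
        return -1;
    c->bound = 1;
    return 0;
}

/* Option one: record the new credentials, mark unbound, rebind via open. */
static int mock_check_password(mock_conn *c, const char *dn, const char *pw)
{
    c->binddn = dn;
    c->bindpw = pw;
    c->bound = 0;
    return mock_connection_open(c);
}

/* Like steps 3 and 4: reuse as is on a match, otherwise mark as unbound. */
static void mock_connection_find(mock_conn *c, const char *dn, const char *pw)
{
    if (strcmp(c->binddn, dn) != 0 || strcmp(c->bindpw, pw) != 0) {
        c->binddn = dn;
        c->bindpw = pw;
        c->bound = 0;
    }
}
```

Because the password check goes through the same bookkeeping as every other
bind, a later mock_connection_find() for the search account sees that the
recorded dn differs, marks the connection unbound, and the next open rebinds
it. A failed check leaves the flag at 0, so a stale identity is never reused.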
Comment 1 Denis Gervalle 2004-02-21 14:23:27 UTC
Created attachment 10470 [details]
Patch using option one in the solution explained above
Comment 2 Denis Gervalle 2004-02-21 14:29:42 UTC
*** Patch ***
For my part, I have chosen option one, which uses only one connection for both
search and authentication and leaves a connection bound to the authenticated
user in the cache.

The attached unified patch was made against the latest public release, which
at the time of this writing is version 2.0.48. I have also manually checked
the head of development (version 2.1.x), and it should apply there too.

I have no more time right now to test this patch thoroughly. It has been in
production on our server since it was written, and I will keep this bug report
informed of any further problems we may encounter.
Comment 3 Denis Gervalle 2004-02-21 15:16:56 UTC
There are other discussions of these issues in bug 17599 (which provides a
probably wrong patch using option two described above) and bug 21787, which
provides a solution similar to this one by always marking the ldap connection
unbound after authentication, even if it was bound to the authenticated user.
Comment 4 Denis Gervalle 2004-02-21 15:31:21 UTC
Bug 24683 may also be related.
Comment 5 Albert Lunde 2004-04-19 19:28:37 UTC
We tried to use the attached patch (id=10470) and found that, while it gave
correct results, the number of open connections to the LDAP server increased
linearly over time. We started hitting the LDAP server's limit on open
connections.

This may be a generic flaw in mod_ldap, in that there are no bounds that I can
see on the number of cached connections or on how long they may be held open. In
this case, it doesn't appear that the open connections served much of a function.

We ended up using a patch suggested in comments to
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=17274

On Apache 2.0.49 this was:

$ diff mod_auth_ldap.c~  mod_auth_ldap.c
329c329
<     util_ldap_connection_close(ldc);
---
>     util_ldap_connection_destroy(ldc);

But this may defeat connection caching entirely. I don't claim to understand the
code in detail.
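
The trade-off in that one-line change can be illustrated with a toy pool. This
is a sketch under assumptions, not the mod_ldap code: mock_pool and the mock_*
functions are hypothetical, "close" is modelled as handing the connection back
with its socket still open, and "destroy" as closing the socket before handing
it back.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_POOL_SIZE 4

/* Hypothetical pooled connection: just two flags. */
typedef struct {
    int socket_open;  /* 1 while a socket to the server is held */
    int in_use;       /* 1 while a request owns this connection */
} mock_conn;

typedef struct {
    mock_conn slots[MOCK_POOL_SIZE];
    int       connects;  /* how many new sockets were opened */
} mock_pool;

static void mock_pool_init(mock_pool *p)
{
    size_t i;
    for (i = 0; i < MOCK_POOL_SIZE; i++) {
        p->slots[i].socket_open = 0;
        p->slots[i].in_use = 0;
    }
    p->connects = 0;
}

/* Prefer an idle connection with an open socket; otherwise open a new one. */
static mock_conn *mock_find(mock_pool *p)
{
    size_t i;
    for (i = 0; i < MOCK_POOL_SIZE; i++) {
        if (!p->slots[i].in_use && p->slots[i].socket_open) {
            p->slots[i].in_use = 1;
            return &p->slots[i];
        }
    }
    for (i = 0; i < MOCK_POOL_SIZE; i++) {
        if (!p->slots[i].in_use) {
            p->slots[i].socket_open = 1;
            p->slots[i].in_use = 1;
            p->connects++;
            return &p->slots[i];
        }
    }
    return NULL;  /* pool exhausted */
}

/* Modelled "close": release to the pool, socket stays open for reuse. */
static void mock_close(mock_conn *c)
{
    assert(c != NULL);
    c->in_use = 0;
}

/* Modelled "destroy": drop the socket, then release to the pool. */
static void mock_destroy(mock_conn *c)
{
    assert(c != NULL);
    c->socket_open = 0;
    c->in_use = 0;
}

static int mock_open_sockets(const mock_pool *p)
{
    size_t i;
    int n = 0;
    for (i = 0; i < MOCK_POOL_SIZE; i++)
        n += p->slots[i].socket_open;
    return n;
}
```

In this model, three sequential requests that end with mock_close() open one
socket and leave it open, while three requests that end with mock_destroy()
open three sockets and leave none open. That matches the concern above: no
idle sockets remain, but every request pays for a new connection.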


Comment 6 Denis Gervalle 2004-04-20 22:09:25 UTC
Created attachment 11296 [details]
Cumulative patch that fixes the issue of the increasing connections to the LDAP server described above
Comment 7 Denis Gervalle 2004-04-20 22:13:55 UTC
The patch attached in the previous comment fixes excessive locking done while
searching the connection cache. This correction is also available in version
1.23 of util_ldap.c in the CVS tree. The patch also contains the earlier patch
for the connection rebind and applies to the latest stable release to date
(2.0.49).
Comment 8 Albert Lunde 2004-04-26 22:40:10 UTC
I did a series of tests using 2.0.49 with each of:

1) util_ldap.c from cvs version 1.24

2) util_ldap.c patched with 11296 

3) util_ldap.c patched with 10470 and the change to util_ldap_connection_destroy

I did tests with one data set that stepped through 11 usernames in nearly serial
order, and another that was more of a random walk across the same usernames.
Both data sets included some pseudo-random failures.

It can be summarized as follows:

1) The CVS code left 10 sockets in use at the end of the test until I HUPed the
server. It reached nearly that level pretty quickly and then stayed there.

The authentication results of the CVS code were entirely unreliable.

For the serial test, 34 good, 69 bad. ("good" means test cases that gave the
expected result.)
For the random test, 410 good, 394 bad.

(Would you suggest combining CVS with another patch?)

Versions 2) and 3) both returned 100% good results.

Version 3) promptly closed all connections (as expected).

Version 2) left 9 sockets in use at the end of the serial test and 4 sockets in
use at the end of the random test. Both test data sets showed some reuse of
sockets.
Comment 9 Albert Lunde 2004-05-12 17:54:43 UTC
I'm adding this note to document some further tests on patch 11296.

E-mail correspondence with Denis Gervalle suggested I should test
the effects of the number of processes. (All my tests are on
Linux running the prefork model.)

I did the tests above with the default settings of

StartServers         5
MinSpareServers      5
MaxSpareServers     10
MaxClients         150
MaxRequestsPerChild  0

For comparison, I set up a low process number test with:

StartServers         1
MinSpareServers      1
MaxSpareServers     1
MaxClients         150
MaxRequestsPerChild  0

and high process number test with:

StartServers         10
MinSpareServers      10
MaxSpareServers     20
MaxClients         150
MaxRequestsPerChild  0

I ran the serial and random test data against these two new
configurations under 2.0.49 with patch 11296

All the test results were correct; they differed in socket usage.

The "low process" config left 1 socket open to the LDAP server at the
end of both data sets.

The "high process" config left 15 sockets open at the end of the serial
data set and 13 sockets open at the end of the random data set.

Combined with the test above, this seems to indicate that 11296 is
holding sockets between requests on the order of one per process. This
rate of usage looks fairly stable over time. It goes up and down in
tests, but there's no long-term upward trend as there had been with
10470.

In all my tests I'm getting a log message 

[debug] util_ldap.c(1139): LDAP cache: Unable to init 
Shared Cache: no file

which I guess indicates there's no shared state among processes. 
I've tried to explicitly specify a cache file writable by the 
web server, but it does not seem to have any effect.
Comment 10 Graham Leggett 2004-05-20 22:53:29 UTC
Please try the patch at http://nagoya.apache.org/bugzilla/show_bug.cgi?id=27748
and tell me if it fixes this problem. This patch has been applied to v2.1.0-dev,
and awaits backporting to v2.0.50-dev.

This is specifically in reference to the auth failures described.
Comment 11 Graham Leggett 2004-05-21 15:30:28 UTC

*** This bug has been marked as a duplicate of 27748 ***
Comment 12 Albert Lunde 2004-05-24 20:20:46 UTC
I repeated my test set up with the roll-up patch 11618 from bug 27748

The results look good.

I'm now getting no unexpected authentication results, and socket usage looks
similar to Denis Gervalle's previous patch.

I still have the warning "LDAP cache: Unable to init 
Shared Cache: no file", but I suppose that's a different issue.