Bug 48130

Summary: DAV operations on large filesets consume all the OS memory
Product: Apache httpd-2 Reporter: Ville Jussila <ville>
Component: mod_dav Assignee: Apache HTTPD Bugs Mailing List <bugs>
Status: NEW ---    
Severity: major CC: Diego.SantaCruz, mariogruber, ville
Priority: P2 Keywords: PatchAvailable
Version: 2.5-HEAD   
Target Milestone: ---   
Hardware: Other   
OS: All   
Bug Depends on:    
Bug Blocks: 52123    
Attachments: mod_dav / mod_dav_fs memory reduction patch 1/5
mod_dav / mod_dav_fs memory reduction patch 2/5
mod_dav / mod_dav_fs memory reduction patch 3/5
mod_dav / mod_dav_fs memory reduction patch 4/5
mod_dav / mod_dav_fs memory reduction patch 5/5

Description Ville Jussila 2009-11-04 13:00:38 UTC
Compiled Apache with --enable-dav using gcc version 4.2.0. AIX OS level is 5300-08-04-0844.

Enabled DAV for a NetApp vFiler filesystem (network mounted) containing up to 57 directories, each holding 3000 to 20000 files. The filesystem holds approximately 70 gigabytes of data.

The Apache2 installation and DavLockDB are located on a SAN volume.

Listing directories via a browser (without DAV) does not cause httpd to allocate more memory while browsing through directories. Everything works nicely.

Listing directories via the Windows XP SP3 DAV client causes httpd to allocate more and more memory as directory after directory is opened. Everything seems to work, but memory keeps being allocated and nothing seems to be freed. In the end, httpd has allocated all of the AIX memory and AIX starts to kill processes.

DAV is enabled with:
<Directory "/netapp/...">
    Dav On
    Options Indexes
    Order Allow,Deny
    Allow from all

    AuthType Basic
    AuthBasicProvider file
    AuthUserFile /home/.../httpd/conf/passwords
    AuthGroupFile /home/.../httpd/conf/groups
Comment 1 vuser1 2009-12-21 13:14:54 UTC
Same problem on Win32. There must be a memory leak. Steps:

Step 1. On a Win2003 client, right-click the WebDAV folder and open Properties. The client calculates the folder size by enumerating subfolders.
Result: 117068 files, 11537 sub-folders; the server process httpd.exe has a 602 MB WorkingSet and 629 MB PrivateBytes.

Step 2. On the server, run the Resource Kit utility empty.exe, which calls SetProcessWorkingSetSize(-1, -1) for httpd.
Result: the httpd.exe WorkingSet decreases to 2 MB.

Step 3. On the client, close Properties and open it again.
Result: 610 MB WorkingSet and 900 MB PrivateBytes.

	<Directory "H:/sklad"> 
		Allow from all	
		Options FollowSymLinks Indexes
		AllowOverride Limit
		ReadmeName /README.html
		HeaderName /HEADER.html

		Dav On
		AuthType Digest
		AuthDigestDomain /files/sklad
		AuthName "sklad"
		AuthDigestProvider file
Comment 2 Stefan Fritsch 2010-03-07 13:11:00 UTC
Memory management in mod_dav is broken. A PROPFIND of a dir with 10000 files needs around 140MB of memory on my system.

Some initiative to overhaul pool usage was described here:

http://mail-archives.apache.org/mod_mbox/httpd-dev/200305.mbox/<86smqzgcuf.fsf%40kepler.ch.collab.net>

Unfortunately this was never completed.

Probably the pool guidelines from subversion should be used:

http://subversion.apache.org/docs/community-guide/conventions.html#apr-pools

This would mean that most mod_dav functions would need to take a scratch pool and a result pool as parameters. dav_push_error would also need some magic to ensure the correct lifetime of the error stack.
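
To illustrate the scratch pool / result pool convention, here is a minimal sketch. It does not use APR: toy_pool and its helpers are hypothetical stand-ins for apr_pool_t, apr_palloc, apr_pool_clear and apr_pool_destroy, and collect_txt is an invented example function, not anything in mod_dav. Temporary data goes into the scratch pool, which is cleared on every iteration, and only data the caller needs afterwards is allocated from the result pool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy bump allocator standing in for apr_pool_t (hypothetical, not APR). */
typedef struct toy_pool {
    char  *base;   /* backing storage */
    size_t cap;    /* total capacity in bytes */
    size_t used;   /* bytes currently allocated */
    size_t peak;   /* high-water mark of 'used' */
} toy_pool;

static int toy_pool_init(toy_pool *p, size_t cap)
{
    p->base = malloc(cap);
    p->cap = cap;
    p->used = 0;
    p->peak = 0;
    return p->base != NULL;
}

static void *toy_palloc(toy_pool *p, size_t n)
{
    void *mem;

    n = (n + 7u) & ~(size_t)7u;        /* keep allocations 8-byte aligned */
    if (n > p->cap - p->used)
        return NULL;
    mem = p->base + p->used;
    p->used += n;
    if (p->used > p->peak)
        p->peak = p->used;
    return mem;
}

static void toy_pool_clear(toy_pool *p)   { p->used = 0; }
static void toy_pool_destroy(toy_pool *p) { free(p->base); p->base = NULL; }

/*
 * Collect the entries whose name ends in ".txt". The temporary full path is
 * built in 'scratch', which is cleared on every iteration; only matching
 * paths are copied into 'result'. Returns the number of matches, or -1 if
 * an allocation fails.
 */
static int collect_txt(toy_pool *result, toy_pool *scratch,
                       const char *dir, const char *const *names, size_t n,
                       const char **out)
{
    int found = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        size_t len = strlen(dir) + 1 + strlen(names[i]) + 1;
        char *tmp;

        toy_pool_clear(scratch);       /* drop the previous iteration's data */
        tmp = toy_palloc(scratch, len);
        if (tmp == NULL)
            return -1;
        snprintf(tmp, len, "%s/%s", dir, names[i]);

        if (len >= 5 && strcmp(tmp + len - 5, ".txt") == 0) {
            char *keep = toy_palloc(result, len);
            if (keep == NULL)
                return -1;
            memcpy(keep, tmp, len);    /* survives the scratch pool clear */
            out[found++] = keep;
        }
    }
    return found;
}
```

With this split, the scratch pool's high-water mark depends only on the longest single path, while the result pool grows only with what is actually returned to the caller.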

In addition there is the problem that the response is streamed to the client while locks are held (PR 36636).

This looks like a large amount of work. Probably a project for Apache 3.0 and not for 2.4.
Comment 3 Diego Santa Cruz 2012-01-30 15:17:28 UTC
We have also run into this problem in our product. In our case httpd runs on an embedded device, so memory is somewhat constrained, yet some users still create directories with more than a thousand files each, which uses up a lot of memory.

To fix the situation we have written a few patches to mod_dav and mod_dav_fs that remove as many O(N) dependencies on the number of files per directory as possible. We have had them in production for a long time with no issues, so I am submitting them to be considered for inclusion in httpd. Note that the patches are probably not perfect, but they do improve the situation significantly. The only remaining O(N) source I see is the lock DB, but that is O(N) in the number of locks, so I think that is much less of an issue.

Patches against httpd 2.2.21 follow.
Comment 4 Diego Santa Cruz 2012-01-30 15:19:58 UTC
Created attachment 28232
mod_dav / mod_dav_fs memory reduction patch 1/5

Use a subpool while iterating over dir entries in dav_fs_walker to reduce
memory usage. However, this does not help on *nix since apr_stat does not use
the pool on those platforms.
Comment 5 Diego Santa Cruz 2012-01-30 15:23:25 UTC
Created attachment 28233
mod_dav / mod_dav_fs memory reduction patch 2/5

Use a subpool when allocating a propdb and introduce function variants that
allow destroying the propdb pool when the propdb of a resource is closed
(dav_close_propdb2() and dav_get_props2()).
Then use these new functions to destroy the propdb pool after use in dav_propfind_walker(). This reduces memory usage from O(N) in the number of resources in a collection to O(1), which greatly helps PROPFIND.

So far these new functions are not exported from the mod_dav module; that might be considered when a mod_dav API addition is welcome.
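
The effect of closing each propdb together with its own pool can be sketched as follows. This is a toy model rather than the patch itself: toy_propdb, toy_open_propdb, toy_close_propdb and toy_propfind_walk are hypothetical names, plain malloc/free stands in for a per-propdb APR subpool, and the byte counters exist only to make the peak visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Bookkeeping for the demo: bytes currently held and the high-water mark. */
static size_t live_bytes;
static size_t peak_bytes;

/* Hypothetical stand-in for a propdb that owns its own subpool. */
typedef struct toy_propdb {
    void  *mem;    /* memory the "subpool" holds for this resource */
    size_t size;
} toy_propdb;

static toy_propdb *toy_open_propdb(size_t size)
{
    toy_propdb *db = malloc(sizeof(*db));

    if (db == NULL)
        return NULL;
    db->mem = malloc(size);
    if (db->mem == NULL) {
        free(db);
        return NULL;
    }
    db->size = size;
    live_bytes += size;
    if (live_bytes > peak_bytes)
        peak_bytes = live_bytes;
    return db;
}

/* Closing the propdb also releases everything allocated on its behalf. */
static void toy_close_propdb(toy_propdb *db)
{
    live_bytes -= db->size;
    free(db->mem);
    free(db);
}

/*
 * Walk 'n' resources, each needing 'per_resource' bytes of property data.
 * With close_each != 0 the propdb is closed right after its resource is
 * handled; otherwise every propdb stays open until the end of the "request".
 * Returns the peak number of bytes held, or 0 on allocation failure.
 */
static size_t toy_propfind_walk(size_t n, size_t per_resource, int close_each)
{
    toy_propdb **pending = calloc(n ? n : 1, sizeof(*pending));
    size_t i, held = 0;
    int failed = 0;

    if (pending == NULL)
        return 0;
    live_bytes = peak_bytes = 0;

    for (i = 0; i < n; i++) {
        toy_propdb *db = toy_open_propdb(per_resource);
        if (db == NULL) {
            failed = 1;
            break;
        }
        /* ... the response for this resource would be generated here ... */
        if (close_each)
            toy_close_propdb(db);
        else
            pending[held++] = db;
    }
    for (i = 0; i < held; i++)         /* end of request: release the rest */
        toy_close_propdb(pending[i]);
    free(pending);
    return failed ? 0 : peak_bytes;
}
```

In this model, closing per resource keeps the peak at the cost of a single resource, whereas deferring the cleanup to the end of the request makes the peak grow linearly with the number of resources.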
Comment 6 Diego Santa Cruz 2012-01-30 15:24:20 UTC
Created attachment 28234
mod_dav / mod_dav_fs memory reduction patch 3/5

Use a scratch pool when walking the tree for DAV copy and move requests so
that memory usage is O(1) instead of O(N) with the number of resources
per directory; note however that it is still O(N) with the number of
directories in the walked hierarchy.
Comment 7 Diego Santa Cruz 2012-01-30 15:25:37 UTC
Created attachment 28235
mod_dav / mod_dav_fs memory reduction patch 4/5

When doing a PROPFIND mod_dav creates a subrequest for some live properties
(e.g., getcontenttype and getcontentlanguage) but does not destroy the
subrequest, thus the associated memory is not freed until the main request
is done. This patch destroys the subrequest when a propdb is closed,
significantly reducing the amount of memory used when doing a PROPFIND for all
properties or a set of properties requiring subrequests.
Comment 8 Diego Santa Cruz 2012-01-30 15:26:20 UTC
Created attachment 28236
mod_dav / mod_dav_fs memory reduction patch 5/5

mod_dav_fs' dav_fs_walker() always uses the same pool during the tree walk.
This patch changes that to use subpools for child resources to avoid
O(N) memory consumption in the number of resources per dir.