Bug 32546

Summary: GET performance slow on large files
Product: Slide    Reporter: John Rousseau <JRRousseau>
Component: WebDAV Server    Assignee: Slide Developer List <slide-dev>
Severity: normal    
Priority: P3    
Version: Nightly   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Attachments: Patch to improve buffer size allocation
Patch rev 2 with configurable max buffer size

Description John Rousseau 2004-12-06 20:36:26 UTC
We deal with Very Large Objects in our archive, and the hard-coded 2k buffer
size in org/apache/slide/webdav/method/GetMethod.java was killing read
performance. Getting a single chunk of data from our archive is a bit expensive,
so I implemented a better buffering scheme (with a still hard-coded limit).
Unified diff is below.

On my local machine, I went from ~115KB/sec reads to ~9MB/sec reads for large
files.

--- GetMethod.java.~1.55.~	2004-11-09 10:00:28.000000000 -0500
+++ GetMethod.java	2004-12-03 22:13:09.000000000 -0500
@@ -70,8 +70,11 @@
     // -------------------------------------------------------------- Constants
-    protected final int BUFFER_SIZE = 2048;
+    /** The default buffer size */
+    protected final int DEFAULT_BUFFER_SIZE = 2048;
+    /** The maximum buffer size */
+    protected final int MAX_BUFFER_SIZE = 200 * 1024;
@@ -393,7 +396,7 @@
             (resourceInputStream, input);
         // Copy the input stream to the output stream
-        exception = copyRange(istream, ostream);
+        exception = copyRange(istream, ostream, resourceInfo.length);
         // Clean up the input and output streams
         try {
@@ -544,11 +547,12 @@
      * @return Exception which occurred during processing
     private IOException copyRange(InputStream istream,
-                                  ServletOutputStream ostream) {
+                                  ServletOutputStream ostream,
+                                  long resourceLength) {
         // Copy the input stream to the output stream
         IOException exception = null;
-        byte buffer[] = new byte[input];
+        byte buffer[] = new byte[getBufferSize(resourceLength)];
         int len = buffer.length;
         while (true) {
             try {
@@ -591,7 +595,7 @@
         IOException exception = null;
         long bytesToRead = end - start + 1;
-        byte buffer[] = new byte[input];
+        byte buffer[] = new byte[getBufferSize(bytesToRead)];
         int len = buffer.length;
         while ( (bytesToRead > 0) && (len >= buffer.length)) {
             try {
@@ -615,6 +619,16 @@
+    /**
+     * Get a reasonable buffer size.
+     */
+    private int getBufferSize(long resourceLength)
+    {
+        if (resourceLength <= 0)
+            return DEFAULT_BUFFER_SIZE;
+        return (int)Math.min(resourceLength, MAX_BUFFER_SIZE);
+    }
      * Parse the range header.
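For reference, the selection logic the patch adds can be exercised in isolation. This is a standalone restatement of the diff's getBufferSize() method with the same constants, not part of the patch itself:

```java
// Standalone restatement of the patch's buffer-size selection, for illustration.
public class BufferSizeSketch {
    static final int DEFAULT_BUFFER_SIZE = 2048;    // the old hard-coded size
    static final int MAX_BUFFER_SIZE = 200 * 1024;  // the cap from the patch

    // Size the buffer to the resource, clamped to the maximum; fall back to
    // the default when the resource length is unknown.
    static int getBufferSize(long resourceLength) {
        if (resourceLength <= 0)
            return DEFAULT_BUFFER_SIZE;
        return (int) Math.min(resourceLength, MAX_BUFFER_SIZE);
    }
}
```

Small resources get a buffer exactly their own size; anything over 200k is clamped, so the per-request memory cost is bounded.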
Comment 1 John Rousseau 2004-12-06 20:40:45 UTC
Created attachment 13657 [details]
Patch to improve buffer size allocation
Comment 2 James Mason 2004-12-07 06:12:07 UTC

This looks good, and the performance improvements you've mentioned are
impressive. I'm concerned that the large maximum buffer size could lead to out
of memory errors. Perhaps making the maximum size a configurable parameter so
that administrators could choose the best speed/memory ratio?

Another solution that might work for your case (since it sounds like the problem
is magnified by your custom store) is to wrap the InputStream (or Reader) that
you pass to the NodeRevisionContent so that it maintains an internal buffer
that's larger than the webdav buffer.
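That wrapping approach can be sketched with java.io.BufferedInputStream. The ByteArrayInputStream below is only a stand-in for the real store-side stream, and readFirstChunk is a hypothetical demo helper, not Slide API:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BufferedWrapSketch {
    // Wrap a store-side stream so each underlying read pulls a large chunk,
    // even when the caller (e.g. GetMethod's copy loop) reads 2k at a time.
    static InputStream wrap(InputStream storeStream) {
        return new BufferedInputStream(storeStream, 200 * 1024);
    }

    // Demo helper: read one small chunk through the wrapped stream and
    // return how many bytes the caller received.
    static int readFirstChunk(int storeBytes, int chunkSize) {
        try (InputStream in = wrap(new ByteArrayInputStream(new byte[storeBytes]))) {
            return in.read(new byte[chunkSize]);
        } catch (IOException e) {
            return -1;
        }
    }
}
```

The caller still sees 2k reads, but the expensive store fetch happens in 200k units, which is the behavior the comment above is after.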

Comment 3 John Rousseau 2004-12-08 03:41:30 UTC
Hi James,

The performance improvement is definitely impacted by how expensive a single 
read operation is on our distributed archive. I suppose that many clients 
pulling down many large files at the same time could use a lot of memory, but 
I wouldn't think it would be an issue for modern machines.

A configurable MAX would definitely be preferable.

There is also another fixed-size buffer in GetMethod.java which I fixed today, 
but I don't have the sources handy right now. I'll update the patch tomorrow 
with the fix for that too if you like.

Comment 4 James Mason 2004-12-08 09:01:59 UTC

That would be good :).

As for getting a parameter: from the SlideToken class you can use
getNamespaceConfig().getParameter("parameter-name") to access a parameter
defined in the namespace (in Domain.xml). This would be a good place to put the
buffer size.
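The fallback side of that can be sketched as follows. The "max-get-buffer-size" name matches the later comments in this report; parseMaxBufferSize is a hypothetical helper (the real value would come from getNamespaceConfig().getParameter(...)), shown here only to illustrate falling back to the old 2k default on a missing or malformed value:

```java
public class ConfigParamSketch {
    static final int DEFAULT_MAX = 2048;  // the old hard-coded size, as fallback

    // value would come from something like
    //   slideToken.getNamespaceConfig().getParameter("max-get-buffer-size")
    // A null, non-numeric, or non-positive value falls back to the default.
    static int parseMaxBufferSize(String value) {
        if (value == null)
            return DEFAULT_MAX;
        try {
            int parsed = Integer.parseInt(value.trim());
            return parsed > 0 ? parsed : DEFAULT_MAX;
        } catch (NumberFormatException e) {
            return DEFAULT_MAX;
        }
    }
}
```

Defaulting to 2k keeps existing deployments unchanged unless an administrator opts in to a larger buffer, which is the trade-off discussed above.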

Comment 5 John Rousseau 2004-12-08 23:52:57 UTC
Created attachment 13689 [details]
Patch rev 2 with configurable max buffer size

Now uses a "max-get-buffer-size" config param to set the max buffer size.
Defaults to the old 2k size.
Comment 6 James Mason 2004-12-09 03:31:44 UTC
Patch committed.

Thanks for making the changes, John.