Bug 32546 - GET performance slow on large files
Summary: GET performance slow on large files
Status: RESOLVED FIXED
Alias: None
Product: Slide
Classification: Unclassified
Component: WebDAV Server
Version: Nightly
Hardware: PC Linux
Importance: P3 normal
Target Milestone: ---
Assignee: Slide Developer List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-12-06 20:36 UTC by John Rousseau
Modified: 2004-12-08 18:31 UTC
CC List: 0 users

Attachments
Patch to improve buffer size allocation (2.00 KB, patch)
2004-12-06 20:40 UTC, John Rousseau
Patch rev 2 with configurable max buffer size (4.49 KB, patch)
2004-12-08 23:52 UTC, John Rousseau

Description John Rousseau 2004-12-06 20:36:26 UTC
We deal with Very Large Objects in our archive, and the hard-coded 2k buffer
size in org/apache/slide/webdav/method/GetMethod.java was killing read
performance. Getting a single chunk of data from our archive is a bit expensive,
so I implemented a better buffering scheme (with a still hard-coded limit).
A unified diff is below.

On my local machine, I went from ~115KB/sec reads to ~9MB/sec reads for large
objects.

--- GetMethod.java.~1.55.~	2004-11-09 10:00:28.000000000 -0500
+++ GetMethod.java	2004-12-03 22:13:09.000000000 -0500
@@ -70,8 +70,11 @@
     // -------------------------------------------------------------- Constants
 
 
-    protected final int BUFFER_SIZE = 2048;
+    /** The default buffer size */
+    protected final int DEFAULT_BUFFER_SIZE = 2048;
 
+    /** The maximum buffer size */
+    protected final int MAX_BUFFER_SIZE = 200 * 1024;
 
 
     /**
@@ -393,7 +396,7 @@
             (resourceInputStream, input);
 
         // Copy the input stream to the output stream
-        exception = copyRange(istream, ostream);
+        exception = copyRange(istream, ostream, resourceInfo.length);
 
         // Clean up the input and output streams
         try {
@@ -544,11 +547,12 @@
      * @return Exception which occured during processing
      */
     private IOException copyRange(InputStream istream,
-                                  ServletOutputStream ostream) {
+                                  ServletOutputStream ostream,
+                                  long resourceLength) {
 
         // Copy the input stream to the output stream
         IOException exception = null;
-        byte buffer[] = new byte[input];
+        byte buffer[] = new byte[getBufferSize(resourceLength)];
         int len = buffer.length;
         while (true) {
             try {
@@ -591,7 +595,7 @@
         IOException exception = null;
         long bytesToRead = end - start + 1;
 
-        byte buffer[] = new byte[input];
+        byte buffer[] = new byte[getBufferSize(bytesToRead)];
         int len = buffer.length;
         while ( (bytesToRead > 0) && (len >= buffer.length)) {
             try {
@@ -615,6 +619,16 @@
 
     }
 
+    /**
+     * Get a reasonable buffer size.
+     */
+    private int getBufferSize(long resourceLength)
+    {
+        if (resourceLength <= 0)
+            return DEFAULT_BUFFER_SIZE;
+
+        return (int)Math.min(resourceLength, MAX_BUFFER_SIZE);
+    }
 
     /**
      * Parse the range header.
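
(For scale, assuming per-read overhead dominates: with the old 2 KB buffer, a
200 MB object takes on the order of 100,000 reads from the backing store; the
200 KB cap cuts that to roughly 1,000, which is in line with the ~80x
throughput gain reported above.)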
Comment 1 John Rousseau 2004-12-06 20:40:45 UTC
Created attachment 13657 [details]
Patch to improve buffer size allocation
Comment 2 James Mason 2004-12-07 06:12:07 UTC
John,

This looks good, and the performance improvements you've mentioned are
impressive. I'm concerned, though, that the large maximum buffer size could
lead to out-of-memory errors. Perhaps the maximum size could be made a
configurable parameter, so that administrators can choose the best
speed/memory tradeoff?

Another solution that might work for your case (since it sounds like the
problem is magnified by your custom store) is to wrap the InputStream (or
Reader) that you pass to the NodeRevisionContent so that it maintains an
internal buffer that's larger than the WebDAV buffer.

-James
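
A minimal sketch of the wrapping approach James describes, assuming a
hypothetical custom store class: openArchiveObject() is a placeholder for
however the store actually fetches an object, and the 200 KB size mirrors the
patch's cap. None of these names come from Slide itself.

import java.io.BufferedInputStream;
import java.io.InputStream;

public class BufferedArchiveStore {

    /** Internal buffer for reads against the slow archive (200 KB here). */
    private static final int ARCHIVE_BUFFER_SIZE = 200 * 1024;

    /**
     * Wrap the raw archive stream so that each read against the expensive
     * backing store fetches a large chunk, even when the caller (GetMethod)
     * copies in small 2 KB slices.
     */
    public InputStream openBuffered(String uri) {
        InputStream raw = openArchiveObject(uri); // hypothetical store call
        return new BufferedInputStream(raw, ARCHIVE_BUFFER_SIZE);
    }

    /** Placeholder for the store-specific fetch from the distributed archive. */
    private InputStream openArchiveObject(String uri) {
        throw new UnsupportedOperationException("store-specific");
    }
}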
Comment 3 John Rousseau 2004-12-08 03:41:30 UTC
Hi James,

The performance improvement is definitely impacted by how expensive a single 
read operation is on our distributed archive. I suppose that many clients 
pulling down many large files at the same time could use a lot of memory, but 
I wouldn't think it would be an issue for modern machines.

A configurable MAX would definitely be preferable.

There is also another fixed-size buffer in GetMethod.java which I fixed today, 
but I don't have the sources handy right now. I'll update the patch tomorrow 
with the fix for that too if you like.

Thanks
-John
Comment 4 James Mason 2004-12-08 09:01:59 UTC
John,

That would be good :).

As for reading a parameter: from the SlideToken class you can use
getNamespaceConfig().getParameter("parameter-name") to access a parameter
defined in the namespace (in Domain.xml). That would be a good place to put
the buffer size.

-James
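
A sketch of the lookup James suggests, using the parameter name John settles
on in rev 2. The NamespaceConfig interface below is a minimal stand-in for
Slide's real configuration class so the example compiles on its own; only the
getParameter call itself comes from the comment above.

public class BufferSizeConfig {

    /** The old hard-coded default from GetMethod (2 KB). */
    private static final int DEFAULT_BUFFER_SIZE = 2048;

    /** Minimal stand-in for the namespace configuration lookup. */
    public interface NamespaceConfig {
        String getParameter(String name);
    }

    /**
     * Resolve the maximum GET buffer size, falling back to the old 2 KB
     * default when the parameter is absent or malformed.
     */
    public static int getMaxBufferSize(NamespaceConfig config) {
        String raw = config.getParameter("max-get-buffer-size");
        if (raw == null) {
            return DEFAULT_BUFFER_SIZE; // parameter not set in Domain.xml
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return DEFAULT_BUFFER_SIZE; // ignore a malformed value
        }
    }
}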
Comment 5 John Rousseau 2004-12-08 23:52:57 UTC
Created attachment 13689 [details]
Patch rev 2 with configurable max buffer size

Now uses a "max-get-buffer-size" config param to set the max buffer size.
Defaults to the old 2k size.
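
For reference, the parameter would be declared in the namespace configuration
in Domain.xml. The element form below is an assumption based on Comment 4, not
taken from the committed patch; the value is in bytes (200 KB here):

<!-- hypothetical Domain.xml excerpt -->
<parameter name="max-get-buffer-size">204800</parameter>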
Comment 6 James Mason 2004-12-09 03:31:44 UTC
Patch committed.

Thanks for making the changes, John.