We deal with Very Large Objects in our archive, and the hard-coded 2KB buffer size in org/apache/slide/webdav/method/GetMethod.java was killing read performance. Getting a single chunk of data from our archive is fairly expensive, so I implemented a better buffering scheme (with a still hard-coded upper limit). Unified diff is below. On my local machine, reads of large objects went from ~115KB/sec to ~9MB/sec.

--- GetMethod.java.~1.55.~	2004-11-09 10:00:28.000000000 -0500
+++ GetMethod.java	2004-12-03 22:13:09.000000000 -0500
@@ -70,8 +70,11 @@
 
     // -------------------------------------------------------------- Constants
 
-    protected final int BUFFER_SIZE = 2048;
+    /** The default buffer size */
+    protected final int DEFAULT_BUFFER_SIZE = 2048;
+    /** The maximum buffer size */
+    protected final int MAX_BUFFER_SIZE = 200 * 1024;
 
 
     /**
@@ -393,7 +396,7 @@
                 (resourceInputStream, input);
 
             // Copy the input stream to the output stream
-            exception = copyRange(istream, ostream);
+            exception = copyRange(istream, ostream, resourceInfo.length);
 
             // Clean up the input and output streams
             try {
@@ -544,11 +547,12 @@
      * @return Exception which occured during processing
      */
     private IOException copyRange(InputStream istream,
-                                  ServletOutputStream ostream) {
+                                  ServletOutputStream ostream,
+                                  long resourceLength) {
 
         // Copy the input stream to the output stream
         IOException exception = null;
-        byte buffer[] = new byte[input];
+        byte buffer[] = new byte[getBufferSize(resourceLength)];
         int len = buffer.length;
         while (true) {
             try {
@@ -591,7 +595,7 @@
         IOException exception = null;
 
         long bytesToRead = end - start + 1;
-        byte buffer[] = new byte[input];
+        byte buffer[] = new byte[getBufferSize(bytesToRead)];
         int len = buffer.length;
         while ( (bytesToRead > 0) && (len >= buffer.length)) {
             try {
@@ -615,6 +619,16 @@
         }
 
+    /**
+     * Get a reasonable buffer size.
+     */
+    private int getBufferSize(long resourceLength)
+    {
+        if (resourceLength <= 0)
+            return DEFAULT_BUFFER_SIZE;
+
+        return (int)Math.min(resourceLength, MAX_BUFFER_SIZE);
+    }
 
 
     /**
      * Parse the range header.
Created attachment 13657: Patch to improve buffer size allocation
John,

This looks good, and the performance improvements you've mentioned are impressive. My concern is that such a large maximum buffer size could lead to out-of-memory errors. Perhaps we could make the maximum size a configurable parameter, so that administrators can choose the best speed/memory trade-off?

Another solution that might work for your case (since it sounds like the problem is magnified by your custom store) is to wrap the InputStream (or Reader) that you pass to the NodeRevisionContent so that it maintains an internal buffer that's larger than the webdav buffer.

-James
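A minimal sketch of that stream-wrapping idea, assuming the custom store hands an InputStream to NodeRevisionContent as the comment implies. The class name, the toContent helper, and the 200KB constant are illustrative assumptions, not part of any patch here; only BufferedInputStream (JDK) and NodeRevisionContent (Slide) are real types:

    import java.io.BufferedInputStream;
    import java.io.InputStream;

    import org.apache.slide.content.NodeRevisionContent;

    public class BufferedArchiveContent {

        /** Hypothetical buffer size tuned to the cost of one archive read. */
        private static final int ARCHIVE_BUFFER_SIZE = 200 * 1024;

        /**
         * Wraps the raw archive stream so each underlying read fetches a
         * large chunk, regardless of the small buffer GetMethod copies with.
         */
        public static NodeRevisionContent toContent(InputStream rawArchiveStream) {
            NodeRevisionContent content = new NodeRevisionContent();
            content.setContent(
                new BufferedInputStream(rawArchiveStream, ARCHIVE_BUFFER_SIZE));
            return content;
        }
    }

The appeal of this approach is that it confines the memory cost to the one store that needs it, instead of raising the buffer size for every resource served over WebDAV.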
Hi James,

The performance improvement is definitely affected by how expensive a single read operation is on our distributed archive. I suppose that many clients pulling down many large files at the same time could use a lot of memory, but I wouldn't expect that to be an issue on modern machines. A configurable MAX would definitely be preferable.

There is also another fixed-size buffer in GetMethod.java which I fixed today, but I don't have the sources handy right now. I'll update the patch tomorrow with that fix too, if you like.

Thanks,
-John
John,

That would be good :). As for reading a configuration parameter: from the SlideToken class you can use getNamespaceConfig().getParameter("parameter-name") to access a parameter defined in the namespace (in Domain.xml). That would be a good place to put the buffer size.

-James
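A short sketch of that lookup as it might sit in GetMethod.java, where DEFAULT_BUFFER_SIZE and SlideToken are already in scope. The getNamespaceConfig().getParameter(...) chain is from the comment above, and the parameter name matches the rev-2 patch below; the helper name and the parse-with-fallback handling are assumptions:

    // Read the configured maximum buffer size from the namespace config
    // (Domain.xml), falling back to the small default when the parameter
    // is missing or unparsable.
    private int getMaxBufferSize(SlideToken slideToken) {
        String value =
            slideToken.getNamespaceConfig().getParameter("max-get-buffer-size");
        if (value == null) {
            return DEFAULT_BUFFER_SIZE;   // no parameter defined in Domain.xml
        }
        try {
            return Integer.parseInt(value);
        } catch (NumberFormatException e) {
            return DEFAULT_BUFFER_SIZE;   // bad value, fall back safely
        }
    }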
Created attachment 13689: Patch rev 2 with configurable max buffer size

Now uses a "max-get-buffer-size" config param to set the max buffer size. Defaults to the old 2KB size.
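The rev-2 attachment itself is not inlined in this thread, so the following is a hedged sketch only of how the revised sizing logic might combine the configured cap with the clamping from patch 1. The maxBufferSize field and where it gets initialized are assumptions:

    // Hedged reconstruction, not the actual rev-2 code. Assumes
    // maxBufferSize was read once from the "max-get-buffer-size"
    // namespace parameter and defaults to the old 2048.
    private int getBufferSize(long resourceLength) {
        if (resourceLength <= 0)
            return DEFAULT_BUFFER_SIZE;   // unknown length: keep the 2KB default

        // Allocate no more than the resource needs or the admin allows.
        return (int) Math.min(resourceLength, maxBufferSize);
    }

With a Domain.xml setting of 204800, for example, a 50KB resource would get a 50KB buffer, a 1GB resource would be capped at 200KB, and with no parameter set every resource keeps the old 2KB buffer.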
Patch committed. Thanks for making the changes, John.