Created attachment 25199
MD5 speedup patch

The attached patch eliminates unnecessary buffer copy and clean-up operations on little-endian machines (e.g. x86 & friends). Calling decode() converts the buffer 'block' from little-endian to machine-endian byte order in 'x'; that conversion is only necessary on big-endian machines. Net runtime savings: ~15%.

[[[
If we are on a little-endian machine, we can read directly from
the input buffer.

* crypto/apr_md5.c
  (MD5Transform): optimize by translating buffers & cleaning
  them up on big-endian machines only.

patch by stefanfuhrmann < at > alice-dsl.de
]]]
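For readers unfamiliar with the conversion step, here is a minimal sketch of what a decode()-style routine does, assuming it assembles each 32-bit word from four little-endian input bytes. The name decode_le32 and its signature are hypothetical and are not taken from crypto/apr_md5.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the APR function): build each 32-bit word
 * from 4 input bytes stored least-significant byte first. The result
 * is correct on any host, but on a little-endian host it is
 * byte-for-byte identical to the input, which is why the copy can
 * be skipped there. Trailing bytes that do not fill a word are ignored. */
static void decode_le32(uint32_t *out, const unsigned char *in, size_t len)
{
    size_t i, j;

    for (i = 0, j = 0; j + 3 < len; i++, j += 4) {
        out[i] = (uint32_t)in[j]
               | ((uint32_t)in[j + 1] << 8)
               | ((uint32_t)in[j + 2] << 16)
               | ((uint32_t)in[j + 3] << 24);
    }
}
```

Because the shifts are defined on values rather than on memory layout, this form needs no endianness or alignment checks; the patch trades that portability for speed on little-endian hosts.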
I don't think your patch is correct in general, because 'block' may be only 1-byte aligned whereas 'x' must be 4-byte aligned. It would work on x86, but it may fail on other architectures that don't support unaligned access. One has to check the alignment and do a memcpy if the input block is not properly aligned.
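The check described above could look roughly like the following sketch. It is not the code committed in r1460244; block_as_words and BLOCK_WORDS are hypothetical names, and the sketch assumes a little-endian host (a big-endian host would still need the byte-order conversion). The direct cast also assumes the caller's buffer may legitimately be read as 32-bit words.

```c
#include <stdint.h>
#include <string.h>

#define BLOCK_WORDS 16 /* one 64-byte MD5 block = 16 32-bit words */

/* Hypothetical helper, little-endian hosts only: return a pointer to
 * the block viewed as 16 words. If 'block' is suitably aligned it is
 * used in place; otherwise it is copied into the caller-supplied,
 * properly aligned 'scratch' array first. */
static const uint32_t *block_as_words(const unsigned char *block,
                                      uint32_t scratch[BLOCK_WORDS])
{
    if (((uintptr_t)block % sizeof(uint32_t)) == 0) {
        /* Aligned: no copy needed. */
        return (const uint32_t *)(const void *)block;
    }

    /* Unaligned: memcpy is safe for any source alignment. */
    memcpy(scratch, block, BLOCK_WORDS * sizeof(uint32_t));
    return scratch;
}
```

When the copy path is taken, the scratch array holds input data and should be cleared after use, as the original code does for 'x'.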
Committed a different version with alignment checks: r1460244
1.5 commit: r1460281
Fixed in 1.5.2