Bug 49822 - Add hash lb worker method
Summary: Add hash lb worker method
Status: NEW
Alias: None
Product: Tomcat Connectors
Classification: Unclassified
Component: Common (show other bugs)
Version: 1.2.30
Hardware: All All
: P2 enhancement (vote)
Target Milestone: ---
Assignee: Tomcat Developers Mailing List
URL:
Keywords: PatchAvailable
Depends on:
Blocks:
 
Reported: 2010-08-25 12:28 UTC by Ryan Fox
Modified: 2021-01-29 23:04 UTC (History)
1 user (show)



Attachments
Patch to add hash lb worker method (3.13 KB, application/octet-stream)
2010-08-25 12:28 UTC, Ryan Fox
Details
Patch to add hash lb worker method (3.95 KB, patch)
2010-08-25 17:23 UTC, Ryan Fox
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ryan Fox 2010-08-25 12:28:50 UTC
Created attachment 25938 [details]
Patch to add hash lb worker method

The attached patch adds a hash method to the LB worker.  This is useful for scenarios where back-end cache affinity is more desirable than other load balancing methods.

This works by taking a hash of the req_uri + query_string, seeding the random number generator with this value, and using the first random number returned mod num_of_workers.  This will consistently send a given uri/query string to a particular worker.  If that worker is unavailable, fall back to default method.
Comment 1 Ryan Fox 2010-08-25 17:23:39 UTC
Created attachment 25941 [details]
Patch to add hash lb worker method

Previous patch was incomplete
Comment 2 Christopher Schultz 2017-09-01 14:42:26 UTC
Note the age of the patch.
Comment 3 Mark Thomas 2021-01-29 16:44:39 UTC
Comment on attachment 25941 [details]
Patch to add hash lb worker method

Repair vandalism
Comment 4 Christopher Schultz 2021-01-29 23:04:44 UTC
I question the quality of this patch. There are a number of issues that I see:

1. The combining of the URI and query string is completely wrong. Only the query string is actually used. I'm not sure if s->query_string can be NULL (instead of simply empty-string), in which case this will crash the process.

2. There is no ensuring that the "req" "combined" URI+query-string is null-terminated. Out-of-bounds read is possible, given that the client controls the inputs.

3. There is a race-condition between seeding the PRNG (which is a process-wide change) and obtaining a random number from the PRNG. Two threads running at the same moment can step on each other and the random numbers may not be correct.

4. After hashing, why bother initializing a random-number generator and then generating a single random number? You already have a nice "random number" generated: the hash you just computed. Just use the hash directly.

Also, why not use a standard hashing algorithm? The one here appears to be the ELF hash algorithm[1] which is, I suppose, standard enough. I looked at libapr and it doesn't look like it has any specific hashing algorithms implemented, unless I am very much mistaken. I suppose ELF hash is as good as any out there. Suggestions are welcome.

[1] http://www.sco.com/developers/devspecs/gabi41.pdf pages 94-95, figure 5-12. Also https://en.wikipedia.org/wiki/PJW_hash_function
I wonder if the ELF hash algorithm requires some kind of attribution. It's such a short bit of code it can't really be implemented any other way.