LRUCache.CacheEntry uses String.length() to compute the size of a string in bytes. The specification states that String.length() returns the number of 16-bit Unicode code units (Java chars) in the string, and each char takes two bytes. Hence the real size is about twice as large as the stored one.

Another minor issue is that size was only recorded when lifetime was greater than zero.

Below is a modified version of the constructor:

    public CacheEntry(String value) {
        this.value = value;
        if (lifetime > 0) {
            this.expiration = (new Date()).getTime() + lifetime;
        }
        // Pitch's fix: String.length() returns the number of 16-bit Unicode
        // code units in the string, and each one takes two bytes. For Chinese
        // characters or other encodings, we would be closer to reality by
        // calling String.getBytes().length (which uses the platform default
        // charset).
        this.size = 2 * value.length();
    }
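To illustrate the gap between character count and byte count, here is a small sketch. The class StringSizeDemo and its three helper methods are made up for this example and are not part of LRUCache. It assumes UTF-8 as the encoding for the byte count, which is only one possible choice; String.getBytes() without an argument uses the platform default charset instead.

```java
import java.nio.charset.StandardCharsets;

public class StringSizeDemo {

    // What String.length() reports: the number of 16-bit UTF-16 code units.
    public static int charCount(String s) {
        return s.length();
    }

    // The estimate used in the modified constructor: two bytes per code unit.
    public static int utf16Bytes(String s) {
        return 2 * s.length();
    }

    // Encoded size in UTF-8 (an assumption for this sketch; the charset is
    // passed explicitly so the result does not depend on the platform).
    public static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "cache",          // ASCII only
            "caf\u00e9",      // one accented Latin letter
            "\u4e2d\u6587",   // two Chinese characters
            "\uD83D\uDE00"    // one emoji, stored as a surrogate pair
        };
        for (String s : samples) {
            System.out.println(s
                    + ": chars=" + charCount(s)
                    + ", 2*chars=" + utf16Bytes(s)
                    + ", utf8Bytes=" + utf8Bytes(s));
        }
    }
}
```

For the samples above, the character counts are 5, 4, 2 and 2, while the UTF-8 byte counts are 5, 5, 6 and 4. Neither the character count nor twice the character count matches the encoded size in every case, so either one is best treated as an approximation.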
Hi Pitch - Thanks for the note. I think the most sensible way to keep things consistent is simply to stop concerning ourselves with the actual size in bytes and instead count characters, which provide a convenient abstraction and one that is functional enough for most purposes. Accordingly, in true "it's a feature, not a bug" style, I have updated the documentation to make this clearer. I've addressed the other issue; we do indeed need to compute size in all cases, not just for caches that expire their elements by time. Thanks again, Shawn
Fair enough. Counting bytes or counting chars serves the same purpose, which is to limit the total size of the cache to a reasonable amount. --Pitch