"... so, what do you think?"

"It was cute, in a meta sort of way."

"Thanks. But there's something a little sad about resorting to meta cuteness so often, like I'm an n-trick pony for undisclosed but probably quite small n."

"Well, you're very good at meta cuteness."

"There is that."

Consistent Hashing

Dear reader, suppose you're a distibuted data storage system. Your soul (although some pedants would insist on the word program) is dispersed across a cluster of several networked computers. From time to time, your human patrons give you files, and your job—more than that, the very purpose of your existence—is to store these files for safekeeping and later retrieval.

The humans who originally crafted your soul chose a simple algorithm as the means by which you decide which file goes on which of the many storage devices that live in the computers you inhabit: you find the MD5 hash of the filename, take its residue modulo n where n is the number of devices you have—let's call the result i—and you put the file on the (zero-indexed) ith device. So when you had sixteen devices and the humans wanted you to store twilight.pdf, you computed md5("twilight.pdf") = 429eb07bb8a3871c431fe03694105883, saw that the lowest nibble was 3, and put the file on your 3rd device (most humans would say the fourth device, counting from one).

It's not a bad system, you tell yourself (some sort of pride or loyalty preventing you from disparaging your creators' efforts, even to yourself). At least it keeps the data spread out evenly. (A shudder goes down your internal buses as you contemplate what disasters might have happened if your creators had been even more naive and, say, had you put files with names starting with A through D on the first device, &c. What would have happened that time when your patrons decided they wanted to store beat00001.mp3 through beat18691.mp3?)

Continue reading