Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Given a cluster with an arbitrary number of nodes, we want to spread the data across those nodes so we reduce the amount of load that a single node bears. The idea is that, in a system with constant requests, load will be distributed throughout the system because data that is relevant to different requests are on different nodes. We also want to have data stored on multiple nodes so that we can continue to respond to queries in the event or arbitrary network/node failure.

Partition Factor

The partionpartition_factor (pf) is a number [1, 100] that determines the partition pool (pp), or the maximum number of nodes across which data for a particular locator are stored. This is the upper bound on the number of nodes we need to query for a read that touches the entire locator (i.e. audit(record)).

pp =  [ 1 , ceilor ceil(pf * n)], where n is the number of nodes.

Redundancy Factor

The redundancy_factor factor (rf) is a number [1, 100] that determines the redundancy pool (rp) or the number of nodes which data for a particular key/locator pair are stored. This is the exact number of nodes that are eligible to respond to a query for a read that touches a key in a record. The redundancy pool is a subset of the partition pool.

rp =  [ 1 , ceil(rf * pp)] or ceil(rf * pp)

Distribution

012345

Consider a cluster that has 6 nodes. We're also making a few assumptions that are worth calling out:

  • The server prefs file contains an order (comma separated) list of node addresses
  • The server reads that list of nodes on startup and caches the list somewhere on disk
  • If the server starts up and notices the list of nodes in the prefs file is different from the one it cached, then it needs to do a rebalance!

Look on http://en.wikipedia.org/wiki/List_of_hash_functions for a 64 bit hash function OR use a stronger hash function but only take the first 8 bytes out the output.

Code Block
ByteBuffer buffer = ByteBuffer.wrap(hashArray);
long hash = buffer.getLong();

Locator Hashing

Code Block
// We need to get pp distinct nodes for the pool, so we continually hash the hash of the locator until we have everything we need.
Integer[] partitionPool = new Integer[pp]
Long hash = null;
for(int i = 0; i < pp ; i++){
	int slot = -1;
	while(slot == -1 || partitionPool[slot] != null){
		slot = hash(hash == null ? locator : hash) % n; //n is number of nodes
	}
	partitionPool[i] = slot;
}
// We now have an array of node identifiers that form our partition pool.

Locator and Key Hashing

Code Block
// We need to get rp nodes from the partition pool, so we continually hash the hash of the locator and key until we have everything we need
Integer[] redundancyPool = new Integer[rp]
Long hash = null;
for(int i = 0; i < rp; i++){
	int slot = -1;
	while(slot == -1 || redundancyPool[slot] != null){
		slot = hash(hash == null ? locator + key : hash) % partitionPool.length;
	}
	redundancyPool[i] = partitionPool[slot];
}
//We now have an array of node identifiers that for our redundancy pool.