Thoughts on Optimistic Availability

The CAP Theorem is an important foundation for understanding distributed systems. Summarizing all the available literature on CAP and some modern interpretations of the theorem, we know that any distributed system must decide when dealing with latency or network failure, whether it will choose Consistency or Availability.
  • Consistency - The result of a read at T1 is the same at T2 if there were no writes between T1 and T2. Basically, any read can see any previously completed write.

  • Availability - The system can can respond to a request in a timely manner.

Wikipedia defines Consistency as all nodes see the same data at the same time, but that is incorrect because CAP is defined by the observed rather than actual state of the system.

Most NoSQL database choose availability over consistency (i.e. eventual consistency), which means they cannot provide ACID transactions. Since horizontal scalability and transactions are both important, we should create a system that chooses consistency over availability, but makes every effort to minimize latency and handle some network failure.

What is Optimistic Availability?

Optimistic availability is a characteristic of distributed systems that are coordinated by a master node which allows the system to tolerate arbitrary node failure while preserving availability as long as the master node and at least one relevant node agree on the state of the system. In laymen's terms: given sufficient replication, if a node fails, we're optimistic that the system will remain available.

  • The master node is a single point of failure. It is possible to have multiple masters and/or leader election in the event that the master goes down.
  • The master node is only required to coordinate state changes and does not, itself, need to store much data beyond system-wide state (i.e. latest version, etc).
  • If the master node is available, we can be optimistic that the system is available (the read or write will be serviced) as long as one relevant replica with the same state is available.
  • We can guarantee CID because a single operation:
    • will transition the database from one consistent state to another if the database is available or fail
    • be done in isolation because the master node will write lock its buffer of writes to coordinate concurrent operations (serializability) AND the WRITE PROTOCOL ensures that writes are only sent to nodes that were previously available (meaning they are the correct sequence of writes) and writes are only sent to previously unavailable nodes sequentially during the REPAIR PROCESS

    • will be held on the master node if there is at least one node that is not available to ensure that it is never lost

  • In order to get A (atomicity) and full transactions, the master node must store atomic groups of operations in a buffer and replay them against the current state of the system when its time to commit (TODO: think about this some more)