In eventually consistent systems, when a node failures or network partition occurs, we're presented with a trade-off: to execute a request and sacrifice consistency or reject execution and sacrifice availability. In such system, quorums, overlapping node subsets guaranteeing at least one node to hold the most recent value, can be a good middle-ground. We can tolerate failures and loss of connectivity for some nodes while still serving latest results.
Quorum-based replication schemes incur high storage costs: we have to store redundant values on several nodes to guarantee enough copies are going to be available in case of failure. It turns out that we do not have to store data on each replica. We can reduce storage and compute resources by storing the data only a subset of nodes, and only use the other nodes (Transient Replicas), for redundancy in failure scenarios.
In this talk, we discuss Witness Replicas, a replication scheme used in Spanner and Megastore, and Apache Cassandra implementation of this concept, called Transient Replication and Cheap Quorums.
Download presentation