January 30, 2018
Manuel Bravo
The problem of ensuring consistency in applications that manage replicated data is one of the main challenges of distributed computing. The observation that delegating consistency management entirely to the programmer makes application code error-prone, and that strong consistency conflicts with availability, has spurred the quest for meaningful consistency models that the data service can support effectively. Among the several invariants that may be enforced, ensuring that updates are applied and made visible in an order that respects causality has emerged as a key ingredient of the many consistency criteria and client session guarantees proposed and implemented over the last decade. Mechanisms to preserve causality can be found in systems across the consistency spectrum, from weaker to stronger guarantees. In fact, causal consistency is pivotal in that spectrum, given that it has been proven to be the strongest consistency model that does not compromise availability.
In this talk, I present a novel metadata service that geo-replicated data services can use to efficiently ensure causal consistency across geo-locations. Its design brings two main contributions:

• It eliminates the tradeoff between throughput and data freshness inherent to previous solutions. To avoid impairing throughput, our service keeps the metadata small and of constant size, independent of the number of clients, servers, partitions, and locations. By using clever metadata propagation techniques, we also ensure that the visibility latency of updates approximates that of weakly consistent systems, which are not required to maintain metadata or to causally order operations (see the first sketch below).

• It allows data services to fully benefit from partial geo-replication by implementing genuine partial replication, requiring each datacenter to manage only the metadata concerning the data items it replicates locally (see the second sketch below).
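To make the first contribution concrete, here is a minimal, hypothetical sketch of the core idea: each update carries a constant-size label rather than a vector clock, and a datacenter buffers remote updates and makes them visible in label order once the metadata service declares a label stable at that site. All names here (Label, Datacenter, make_visible_up_to) are illustrative assumptions, not the service's actual API.

```python
import heapq
from dataclasses import dataclass

@dataclass(order=True, frozen=True)
class Label:
    """Constant-size metadata: one scalar timestamp plus an origin id,
    regardless of the number of clients, servers, partitions, or sites."""
    timestamp: int
    origin: str

class Datacenter:
    def __init__(self, name: str):
        self.name = name
        self.clock = 0                                      # local Lamport-style clock
        self.store: dict[str, object] = {}                  # locally visible data
        self.pending: list[tuple[Label, str, object]] = []  # buffered remote updates

    def local_update(self, key: str, value: object) -> Label:
        """Apply a local write and tag it with a fresh constant-size label."""
        self.clock += 1
        label = Label(self.clock, self.name)
        self.store[key] = value
        return label

    def receive_remote(self, label: Label, key: str, value: object) -> None:
        """Buffer a remote update until it can be made visible; assuming the
        metadata service delivers labels in an order compatible with
        causality, applying updates in label order preserves causality."""
        heapq.heappush(self.pending, (label, key, value))

    def make_visible_up_to(self, stable: Label) -> None:
        """Apply every buffered update whose label the metadata service
        has declared stable at this site."""
        while self.pending and self.pending[0][0] <= stable:
            label, key, value = heapq.heappop(self.pending)
            self.clock = max(self.clock, label.timestamp)
            self.store[key] = value

# Usage: an update made at A becomes visible at B only once B learns,
# via the metadata service, that A's label is stable there.
dc_a, dc_b = Datacenter("A"), Datacenter("B")
lbl = dc_a.local_update("x", 1)
dc_b.receive_remote(lbl, "x", 1)
dc_b.make_visible_up_to(lbl)
assert dc_b.store["x"] == 1
```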
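The second contribution, genuine partial replication, can be illustrated with an equally hypothetical routing rule: the metadata service forwards an update's label only to the sites that replicate the written partition, so no datacenter stores metadata for data it does not hold. The placement table below is an assumption for illustration only.

```python
# Hypothetical sketch of genuine partial replication: metadata for an
# update is routed only to the datacenters that replicate its partition.

REPLICAS = {                       # assumed placement: partition -> sites
    "users":  {"EU", "US", "ASIA"},
    "orders": {"US", "EU"},
    "logs":   {"ASIA"},
}

def sites_to_notify(partition: str) -> set[str]:
    """Only sites replicating `partition` receive (and store) its labels,
    so each site's metadata overhead scales with its local data only."""
    return REPLICAS[partition]

# ASIA never receives or stores metadata for "orders".
assert "ASIA" not in sites_to_notify("orders")
```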