Key-CRDT Stores


This is a master's thesis from the CRDT group, presenting the design and implementation of a key-value CRDT store, named SwiftCloud. SwiftCloud extends the Riak Key-Value store in to a Key-CRDT store by incorporating CRDTs in the system's data-model, namely in the value of the key-value tuple. (By the way, Riak---by Basho inc.--- is a NoSQL database implementing the principles from Amazon's Dynamo paper. Enjoy the brief singing of Riak description here!)

SwiftCloud achieves automatic conflict resolution relying on properties of CRDTs, and provides strong eventual consistency. (This post will make more sense if you read this first) SwiftCloud uses state based replication. Strong eventual consistency between replicas is achieved by merging the states of the replicas. SwiftCloud employs versioned CRDTs to support transactions. Transactions never abort due to write/write conflicts, as the system leverages CRDT properties to merge concurrent transactions.

The Key-Value store interface

A K-V store stores and retrieves byte arrays from the database. In Riak, content is stored as binary data and is identified by (bucket, key) pairs. A (bucket, key) pair has an associated value. Riak provides the following configuration options:

  • # replicas for an object (default is 3)
  • # replicas that reply to a read operation (read quorum)
  • # replicas to write for a write operation (write quorum)
  • strategy for dealing with conflicts (last writer wins or keep multiversions) 

When using the keep multiversions conflict resolution, it is important to always fetch an object before storing it, even when you want to overwrite it. This way the system internally stores a version vector to order the operations; this is to detect that the new value is newer than the previous one and to store both.

SwiftCloud architecture

The interface of the SwiftClient works as a wrapper for Riak Java Client methods. Thus, it allows to fetch and store CRDT objects and do automatic merge of conflicting CRDTs. Fetch and Store operations access stored CRDTs and serialize/deserialize them automatically. To deliver a consistent version of the CRDT to the application, the fetch operation automatically merges the conflicting updates. The clocks associated to CRDTs are used during the merge operation. Clocks are also merged and the resulting clock is associated to the CRDT, before delivering it to the client.

Transactions

SwiftCloud supports transactions with the following properties: Inside a transaction, the application accesses a snapshot of the database, which includes all CRDTs in the system. All updates of a transaction are executed atomically. Transactions never abort --concurrent updates are merged using CRDT rules.

The transactional system of SwiftCloud builds on the convergence properties of CRDTs and the multi-versioning support provided by the Versioned CRDTs. The unique identifiers stored as part of multiversioned CRDTS enables the system to reconstruct the evolution of a CRDT or undo operations.

The evaluations in the thesis show that the non-transactional SwiftCloud has little impact on performance. The transactional SwiftCloud imposes only a small overhead due to the extra communication steps in the protocol for executing transactions.

Conclusions

Since this works builds heavily on the CRDT idea, it is prone to the same drawback of not providing meaningful invariants across replicas. Current CRDT designs are not suitable to some applications, as they cannot preserve data invariants, such as not allowing an integer to become negative. Future work will try to address that question.

Comments

Popular posts from this blog

The end of a myth: Distributed transactions can scale

Hints for Distributed Systems Design

Foundational distributed systems papers

Learning about distributed systems: where to start?

Metastable failures in the wild

Scalable OLTP in the Cloud: What’s the BIG DEAL?

SIGMOD panel: Future of Database System Architectures

The demise of coding is greatly exaggerated

Dude, where's my Emacs?

There is plenty of room at the bottom