Monday, September 10, 2018

Non-blocking Two-phase commit in NDB Cluster

Non-blocking 2PC protocol

Many of the new DBMSs developed in the last 10 years have abandoned the
two-phase commit protocol and instead relied on replication protocols.

One of the main reasons for this has been the notion that two-phase commit
protocol is a blocking protocol. This is true for the classic version of the
two-phase commit protocol.

When NDB Cluster was developed in the 1990s we had requirements that
the replication protocol could not be blocking. A competitor at the time,
ClustRa, solved this by using a backup transaction coordinator. Given that
NDB Cluster had requirements to survive multiple simultaneous node failures,
this wasn't sufficient.

Thus a new two-phase commit protocol was developed that is completely
non-blocking. The main idea is that one uses a take-over protocol, this means
that any number of nodes can crash and we can still handle it as long as there
is enough nodes to keep all data available.

In addition NDB Cluster is designed both for Disk Durable transactions
and Network Durable transactions. Disk Durable transactions requires
data to be durable on disk when the transaction have committed and
Network Durable requires that the transaction is on at least 2 computers
when the transaction is committed.

Due to the response time requirements for applications that NDB Cluster
was designed for, we implemented it such that when applications received
the response the transaction was Network Durable.

The Disk Durability is handled in a background phase where data is
consistently flushed to disk such that we can always recover a consistent
version of the data even in the presence of a complete failure of the
cluster.

This part is handled by the Global Checkpoint protocol. The PDF above
describes the transaction protocol and the global checkpoint protocol that
together implement the Network Durability and Disk Durability of NDB
Cluster.

No comments: