This blog post refers to the uploaded slides.
DBT2 is based on the standard benchmark TPC-C. This benchmark is a mix of read
and write activity. It is much more write intensive than many other benchmarks.
As a result, in previous versions NDB performance only scaled to data nodes
with around 6 LDM threads. With the introduction of multiple sockets for
communication between nodes we have been able to remove this limitation.
Thus DBT2 serves as a good tool to verify that MySQL Cluster 8.0.20 has improved
its capability to scale writes.
At our disposal for these tests we had a set of bare metal servers in the
Oracle Cloud (OCI). The data nodes used DenseIO2 bare metal servers with
52 CPU cores, 768 GByte of memory and 8 NVMe drives each. We had 6 such
servers, spread across 3 different availability domains.
The queries were executed by MySQL Servers; we used 15 bare metal servers with
36 CPU cores each to run the MySQL Servers. These servers were spread such that
5 servers were located in each availability domain.
The actual benchmark driver and client were executed on their own bare metal server.
DBT2 has five different transaction types. Each transaction is a complex
transaction consisting of a set of SQL queries, on average around 30-35 SQL
queries per transaction. Only NewOrder transactions are counted in the results.
These constitute about 45% of the transactions. This means that if the report
says 45.000 TPM (transactions per minute), around 100.000 transactions were
actually executed.
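The relation between reported TPM and the total transaction rate can be sketched with a small calculation (the 45% NewOrder share comes from the text above; the helper function is just for illustration):

```python
# DBT2 only counts NewOrder transactions, roughly 45% of the mix.
NEW_ORDER_SHARE = 0.45

def total_transactions(new_order_tpm):
    """Estimate total transactions per minute from the reported NewOrder TPM."""
    return new_order_tpm / NEW_ORDER_SHARE

# 45,000 reported TPM corresponds to roughly 100,000 total transactions
print(round(total_transactions(45_000)))  # 100000
```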
DBT2 doesn't follow the standard TPC-C, since each terminal starts a new
transaction immediately without waiting. This makes it possible to also use
DBT2 to test in-memory DBMSs. Interestingly, however, with persistent memory a
server can have 6 TByte of memory, which means that more than 50.000 warehouses
can be used and thus results beyond 500.000 TPM can be handled even with an
in-memory DBMS.
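A back-of-the-envelope check of these numbers, assuming the roughly 12.86 tpmC-per-warehouse throughput cap from the standard TPC-C specification (a figure not stated in this post):

```python
# Rough sanity check of the 6 TByte / 50,000 warehouses / 500,000 TPM figures.
# The 12.86 tpmC-per-warehouse cap is an assumption taken from the TPC-C spec.
MEMORY_TB = 6
WAREHOUSES = 50_000
TPMC_PER_WAREHOUSE = 12.86

mem_per_warehouse_mb = MEMORY_TB * 1024 * 1024 / WAREHOUSES
max_tpm = WAREHOUSES * TPMC_PER_WAREHOUSE

print(f"{mem_per_warehouse_mb:.0f} MB of memory per warehouse")
print(f"{max_tpm:,.0f} TPM supported at the spec's warehouse scaling rule")
```

So 50,000 warehouses leaves roughly 126 MB of memory per warehouse and, under the spec's scaling rule, supports well beyond the 500,000 TPM mentioned above.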
Each SQL query is created in the DBT2 driver, sent to the DBT2 client, and from
there sent to a MySQL Server, which accesses a set of NDB data nodes to perform
the actual query.
Since the servers are located in different availability domains the main cause of
latency comes from the communication between availability domains. When executing
DBT2 in a local network one can achieve latency down to 5-10 ms for a transaction
that contains around 30 SQL statements. In this environment we get latencies of
around 30 ms. Most of this time is spent communicating from the DBT2 client to the
MySQL Servers when they are located in a different availability domain.
NDB has a configuration option that ensures that reads always stay local to
the availability domain of the MySQL Server they were sent to. Updates need to
update all replicas and this might require communication over availability domains.
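A minimal sketch of how such read locality could look in the cluster configuration, assuming the LocationDomainId parameter is the mechanism used here (node IDs and hostnames are made up for illustration):

```ini
# Illustrative config.ini fragment (node IDs and hostnames are assumptions).
# LocationDomainId tags each node with its availability domain so that
# reads can be served by a replica in the same domain when possible.
[ndbd]
NodeId=1
HostName=datanode-ad1.example.com
LocationDomainId=1

[ndbd]
NodeId=2
HostName=datanode-ad2.example.com
LocationDomainId=2

[api]
NodeId=51
HostName=mysqld-ad1.example.com
LocationDomainId=1
```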
We have selected 3 different configurations to execute DBT2 in.
The first configuration is shown in slide 5. It uses 2 data nodes, both
located in the same availability domain. They are, however, located in
different failure domains and thus only share the electricity network. We use
10 MySQL Servers from 2 different availability domains.
The second configuration is a setup where we still have 2 replicas, but now
with one "shard" per availability domain. Thus both replicas of a shard are
still in the same availability domain. This means that we use all 6 data node
servers. We also use all 15 MySQL Servers in 3 different availability domains.
In the third configuration we use 3 replicas, with the replicas placed in
different availability domains. In this case we get 2 node groups using the
6 data nodes. There are still 15 MySQL Servers in 3 different availability
domains.
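As an illustration of the third configuration, a config.ini fragment could look roughly like this (hostnames are assumptions; node groups are formed from the data nodes in order unless set explicitly):

```ini
# Illustrative sketch of the third configuration (hostnames are made up).
# With NoOfReplicas=3, the six data nodes form 2 node groups of 3 replicas.
[ndbd default]
NoOfReplicas=3

# Node group 0: one replica per availability domain
[ndbd]
HostName=datanode1-ad1.example.com
[ndbd]
HostName=datanode2-ad2.example.com
[ndbd]
HostName=datanode3-ad3.example.com

# Node group 1: one replica per availability domain
[ndbd]
HostName=datanode4-ad1.example.com
[ndbd]
HostName=datanode5-ad2.example.com
[ndbd]
HostName=datanode6-ad3.example.com
```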
The first configuration reaches almost 1.4M TPM at 2560 connections. The
second configuration, with 3 node groups, scales almost linearly and reaches
a bit beyond 4M TPM with 12000 connections.
Finally, the last configuration reaches around 2.5M TPM at 8400 connections.
Thus we can see that performance per node group is fairly constant. The
latency differs, and the number of connections required to reach optimal
throughput differs a bit.
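The per-node-group observation can be checked against the numbers above (configuration 1 has 1 node group, configuration 2 has 3, and configuration 3 has 2):

```python
# Throughput per node group for the three configurations
# (TPM figures taken from the results above, rounded).
results = {
    "config 1 (1 node group)":  (1_400_000, 1),
    "config 2 (3 node groups)": (4_000_000, 3),
    "config 3 (2 node groups)": (2_500_000, 2),
}

for name, (tpm, groups) in results.items():
    print(f"{name}: {tpm / groups:,.0f} TPM per node group")
# All three land in the roughly 1.25M-1.4M TPM per node group range.
```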
It is very clear that the multi-socket solution has improved the scalability
of writes, ensuring that NDB can reach very high numbers using DBT2.
In this benchmark we showed NDB scaling to more than 1000 CPUs across the
MySQL Servers and the NDB data nodes. MySQL NDB Cluster 8.0 can scale to 144
data nodes and thus can scale far beyond these numbers.