Mikael Ronstrom: TPC-H Improvements in NDB Cluster

This blog is a presentation based on the uploaded slides.

We have focused a lot of energy on improving our capabilities to execute complex

queries in an efficient manner. The first step on these improvements we took in

MySQL Cluster 7.6, but even more improvements have been made in

MySQL Cluster 8.0.

There are four types of improvements that we have made. One is that we have added

the option to use a shared memory transporter between the MySQL Server and the

NDB data nodes when they are located on the same machine.

The second improvement is the adaptive CPU spinning described in blog post 1.

The third improvement is that we made Read Backup the default, this means that in

complex queries any replica can be used for reading.

Finally we have added a number of algorithmic improvements, these mainly focus on

enabling more parts of the queries to be pushed down into the NDB data nodes.

We described the improvements in 8.0.20 in blog post 2.

Slide 4 shows a 2-node setup optimised for optimal query latency.

In slide 5 we show the improvements in TPC-H comparing MySQL Cluster 8.0.20 vs

MySQL Cluster 7.6.12. 15 out of 22 queries have improved by more 100% and most

of those even a lot more than 100%. 3 queries have improved betwen 50% and 100%.

Only 4 queries have been unaffected by these improvements.

In slide 6 we see the latency improvements that comes from colocating the MySQL

Server and the NDB data nodes using a shared memory transporter. 14 out of 22

queries improve in this scenario and some of them almost 100% and even above

100%.

Slide 7 shows the improvements based on using LatencyOptimisedSpinning as

SpinMethod as described in blog post 3. 13 out of 22 queries sees an improvement

in this scenario. Most of them around 15-20% with a few that improves up to 60%.

Slide 8 shows the sum of the improvements of colocation and CPU spinning.

Slide 9 shows that some queries can benefit from more LDMs, but some also get a

negative impact from this. Thus we conclude that there is a limit to have many

fragments we split tables into for optimal latency. This is an important thing

to consider in any distributed DBMS.

Mikael Ronstrom