This blog is a presentation based on the uploaded slides.
We have focused a lot of energy on improving our capabilities to execute complex
queries in an efficient manner. The first step on these improvements we took in
MySQL Cluster 7.6, but even more improvements have been made in
MySQL Cluster 8.0.
There are four types of improvements that we have made. One is that we have added
the option to use a shared memory transporter between the MySQL Server and the
NDB data nodes when they are located on the same machine.
The second improvement is the adaptive CPU spinning described in blog post 1.
The third improvement is that we made Read Backup the default, this means that in
complex queries any replica can be used for reading.
Finally we have added a number of algorithmic improvements, these mainly focus on
enabling more parts of the queries to be pushed down into the NDB data nodes.
We described the improvements in 8.0.20 in blog post 2.
Slide 4 shows a 2-node setup optimised for optimal query latency.
In slide 5 we show the improvements in TPC-H comparing MySQL Cluster 8.0.20 vs
MySQL Cluster 7.6.12. 15 out of 22 queries have improved by more 100% and most
of those even a lot more than 100%. 3 queries have improved betwen 50% and 100%.
Only 4 queries have been unaffected by these improvements.
In slide 6 we see the latency improvements that comes from colocating the MySQL
Server and the NDB data nodes using a shared memory transporter. 14 out of 22
queries improve in this scenario and some of them almost 100% and even above
Slide 7 shows the improvements based on using LatencyOptimisedSpinning as
SpinMethod as described in blog post 3. 13 out of 22 queries sees an improvement
in this scenario. Most of them around 15-20% with a few that improves up to 60%.
Slide 8 shows the sum of the improvements of colocation and CPU spinning.
Slide 9 shows that some queries can benefit from more LDMs, but some also get a
negative impact from this. Thus we conclude that there is a limit to have many
fragments we split tables into for optimal latency. This is an important thing
to consider in any distributed DBMS.