Wednesday, June 13, 2018

Node restart improvements in MySQL Cluster 7.6

A simple way to see how much faster restarts are in 7.6 compared to 7.5 is to
load a set of DBT2 warehouses and then perform a node restart of one of the
data nodes.

In the test we load 600 warehouses. This means about 60 GByte of data is
inserted into NDB in each data node.

A node restart of 7.5.10 using this data set takes 19 minutes and 11 seconds.
In 7.6.6 this restart takes 6 minutes and 49 seconds.

The difference can vary, especially since the restart time in 7.5 has a fairly high
variance. The restart time in 7.6.6 has much smaller variance.

We will look into the different restart phases and see how those are affected.

The first phase is to allocate the memory and ensure that the memory is allocated
to our process by touching the memory. This phase roughly allocates 2.7 GByte
per second. In this case it takes 26 seconds in both 7.5 and in 7.6.
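As a rough sanity check on this phase, the allocation time is simply the
memory size divided by the touch rate. The sketch below only uses the two
numbers quoted above (2.7 GByte per second and 26 seconds); the function names
are made up for illustration.

```python
def allocation_seconds(memory_gbyte, rate_gbyte_per_sec=2.7):
    """Estimate the time needed to allocate and touch the given memory."""
    return memory_gbyte / rate_gbyte_per_sec


def memory_touched_gbyte(seconds, rate_gbyte_per_sec=2.7):
    """Estimate how much memory is touched in the given time."""
    return seconds * rate_gbyte_per_sec


# 26 seconds at 2.7 GByte per second
print(round(memory_touched_gbyte(26), 1))  # 70.2
```

About 70 GByte is a plausible figure for a data node that holds about
60 GByte of data plus its other memory areas.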

After that we have some preparatory steps. These are normally very quick and in
this test take about 1-2 seconds. These steps involve preparing the metadata to
use in the restart.

The next step is the restore phase where we recover data from a local checkpoint.
Given the introduction of Partial LCP in 7.6 this phase actually takes longer,
since there is some duplication in the restore where one row can be restored
multiple times. This phase can be made faster by setting RecoveryWork to a
smaller value, but this will instead increase the work done on checkpointing
during normal operation.

In this test 7.5 took 1 minute and 51 seconds to restore and 7.6.6 used 2 minutes
and 15 seconds.

The next phase is the REDO log execution phase. Since partial LCP means that
LCPs are shorter, this phase is shortened. In this particular benchmark the
phase doesn't shorten as much as it could, since there is heavy insert activity
right before the restart. But it still decreased from 51 seconds in 7.5 to
35 seconds in 7.6.

After this we execute the UNDO log. This particular benchmark has no disk data,
so both 7.5 and 7.6 take no time in this phase. But this phase has become about
5 times faster with 4 LDM threads, and this benchmark uses 8 LDM threads. So
this phase is dramatically faster in 7.6 when disk data is used.

The next phase is the rebuilding of the ordered indexes. This phase executes the
same code as in 7.5, but we have changed the default configuration to ensure
that the rebuild is fully parallelised. This means that we have 16 CPUs
working on the rebuild instead of 8 CPUs as in 7.5. This gives a rebuild time
of 1 minute and 17 seconds in 7.6.6 compared to 2 minutes and 4 seconds in 7.5.
The reason it isn't twice as fast is that we make use of hyperthreading to
speed things up.
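To put a number on the index rebuild gain, the small sketch below computes the
speedup from the two rebuild times quoted above. The helper name is just for
illustration.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + seconds


rebuild_75 = to_seconds(2, 4)    # 124 seconds with 8 CPUs
rebuild_76 = to_seconds(1, 17)   # 77 seconds with 16 CPUs

speedup = rebuild_75 / rebuild_76
print(f"{speedup:.2f}x")  # 1.61x
```

So doubling the number of CPUs gave roughly a 1.6 times faster rebuild in this
test.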

The copy fragment phase is more or less empty in both 7.5 and 7.6 since we
didn't perform any updates during the restart. We don't expect any major
differences in this phase between 7.5 and 7.6.

Next we come to the biggest gain in restart times in 7.6. This is the phase where
we wait for the local checkpoint to complete. In 7.5 we have to wait between
1 and 2 checkpoint times. In 7.5 in this benchmark the checkpoint takes about
11 minutes although we have increased the disk write speed compared to the
default. In this execution 7.5 took 13 minutes and 48 seconds in this phase.
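The "between 1 and 2 checkpoint times" rule gives simple bounds on the wait in
7.5. The sketch below applies it to the roughly 11 minute checkpoint in this
benchmark and checks the measured wait against the bounds; the function name
is hypothetical.

```python
def lcp_wait_bounds(checkpoint_seconds):
    """Bounds on the 7.5 wait: between one and two full checkpoint times."""
    return checkpoint_seconds, 2 * checkpoint_seconds


low, high = lcp_wait_bounds(11 * 60)   # checkpoint takes about 11 minutes
observed = 13 * 60 + 48                # measured wait in this test

print(low, observed, high)  # 660 828 1320
```

The measured 13 minutes and 48 seconds falls inside the expected range, closer
to the lower bound.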

In 7.6 we execute a checkpoint that is local to the starting node. This takes
2 minutes and 4 seconds. Finally we participate in a checkpoint where all
nodes participate. This is now extremely fast, it takes only 4 seconds since
no activity is ongoing. So the total time for this phase is 2 minutes and
8 seconds.

This wait for checkpoint phase is fairly constant in time in 7.6. In 7.5 it grows
with a growing database size and is dependent on the disk write speed we have
configured. Thus the gain in restart time in 7.6 is a bit variable, but this
experiment has been done with fairly conservative numbers.

The final restart phase is handing over the responsibility of event handling
between connected MySQL servers. This phase takes 5-6 seconds in both
7.5 and 7.6. It can take longer if some MySQL Server is down during the
restart.
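Putting the per-phase numbers above together, the sketch below prints them side
by side and sums them. Three entries are assumptions: the preparatory steps and
the event handover are taken at the upper end of the quoted ranges (2 and
6 seconds), and the copy fragment phase is counted as 0 seconds since it was
more or less empty.

```python
# (phase, seconds in 7.5.10, seconds in 7.6.6)
PHASES = [
    ("allocate memory", 26, 26),
    ("preparatory steps", 2, 2),          # assumed: upper end of 1-2 s
    ("restore from LCP", 111, 135),
    ("REDO log execution", 51, 35),
    ("UNDO log execution", 0, 0),         # no disk data in this benchmark
    ("rebuild ordered indexes", 124, 77),
    ("copy fragment", 0, 0),              # assumed: more or less empty
    ("wait for LCP", 828, 128),
    ("event handover", 6, 6),             # assumed: upper end of 5-6 s
]


def fmt(seconds):
    """Format seconds as m:ss."""
    return f"{seconds // 60}:{seconds % 60:02d}"


total_75 = sum(t75 for _, t75, _ in PHASES)
total_76 = sum(t76 for _, _, t76 in PHASES)

for name, t75, t76 in PHASES:
    print(f"{name:<26}{fmt(t75):>7}{fmt(t76):>7}")
print(f"{'total':<26}{fmt(total_75):>7}{fmt(total_76):>7}")
```

With these assumptions the phases sum to 19:08 for 7.5 and 6:49 for 7.6, which
is within a few seconds of the measured totals of 19:11 and 6:49.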

As can be seen we have significantly brought down the restart times for a
data size that is common in 7.5. With larger sizes the difference is much bigger.
