Mikael Ronstrom: Calculating parameters for local checkpoints in MySQL Cluster

There are a couple of parameters in the configuration of local checkpoints
that have an intimate relationship between each other. All the parameters
are documented individually but the following example can serve to
improve the understanding of how they can be used together.

The example uses an application that does
50.000 selects per hour
15.000 updates per hour
15.000 deletes per hour
and 15.000 inserts per hour

(LCP = Local Checkpoint)
What I usually do is that I decide to fix the time an LCP is to take. Usually I use
5 minutes as a "thumb-rule". This means that a system restart will take
maximum 2-3 minutes to execute the REDO log.

Then I take the maximum size of the data in a node. For simplicity one could
use DataMemory here.

Then I take DataMemory / 300 as the amount of data to checkpoint per
second. Say I had 2 GB DataMemory => 6.6MB/sec => 660 kB/100 msec =>
83 * 8 kB/100 msec.
The parameter to set for this is NoOfDiskPagesToDiskAfterRestartTUP.

If the data to checkpoint to disk is close to the disk bandwidth available I
would ease down on the 5 minutes above and go for a longer checkpoint
time. Remember that the disk is also needed to handle REDO log writes and
UNDO log writes.

Then a similar calculation for IndexMemory.

The parameter to set for this is NoOfDiskPagesToDiskAfterRestartACC.

After performing these calculations I have come to a position where I know
the LCP time. The next step is to go about calculating the number of
fragment log files.

The parameter to set for this is NoOfFragmentLogFiles.

The fragment log files (REDO log files) are needed to keep at least log records
during 3 LCP's. Since there is a number of uncertainties in for example whether
the disk will actually go at full configured speed always and so forth I usually
go for a conservative figure and estimating a size for 6 LCP's.

If I choose 300 secs (5 minutes) as the time of a LCP, this means that I need to
support writing log records at full speed for 6 * 300 secs = 1800 secs.

The size of a REDO log record is:
72 bytes + 4 bytes * number of fields updated + max size of all fields updated
There is one such record for each record updated in a transaction in each node
where the data resides.

Using your case above we get:
50.000 selects/hour => 0 log records since SELECT's are not REDO logged
15.000 deletes/hour => ~ 5 deletes per second = 5 * 72 = 360 bytes/second
15.000 updates/hour =>
~ 5 updates per second = 5 * 72 + 5 * 4 * 5 + 5 * 32 = 620 bytes/second
(Assumed here 5 fields of 32 bytes updated)
15.000 inserts/hour =>
~ 5 inserts per second = 5 * 72 + 5 * 4 * 40 + 5 * 32 * 40 = 7560 bytes/second
Assuming a table with 40 fields of 32 bytes each and no NULL's in insert.

Thus a total of 7560 + 620 + 360 = 8540 bytes per second

=> 8540 * 1800 = 15 372 000

Thus 16 MByte of REDO log file would suffice in this case. The minimum
setting on number of fragment log files is 3 (where each file is 64 MByte)
and thus the default setting of 8 is quite ok here and no need to change.

(With an application that does 15.000 updates/deletes/inserts per second
the figures obviously change quite a lot).

The UNDO log file I usually don't worry about at all, only in the sense that
there needs to be sufficient disk space for it. In this case a few Mbytes
should suffice, but given sizes of disks today I can't see why one would
allocate less than a GByte for it.

Mikael Ronstrom

Thursday, October 20, 2005

Calculating parameters for local checkpoints in MySQL Cluster

2 comments: