Thursday, August 09, 2018

Optimising scan filter for checkpoints in NDB

While loading massive amounts of data into NDB to test the new
adaptive checkpoint speed, I noted that checkpoints slowed down as the
database size grew.

In the debug logs I could see that the amount of checkpoint writes
dropped significantly at times. After some investigation I discovered
the root cause.

The checkpoint algorithm in NDB requires all changed rows to be written
to the checkpoint, even rows in parts that are not fully checkpointed
in this checkpoint. This means that every row has to be scanned to
discover whether it has been changed.
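
To make the cost concrete, here is a much simplified C++ sketch of the
baseline behaviour, using hypothetical types rather than the actual NDB
data structures: each row entry carries change information, and the
checkpoint scan has to visit every single row to find the changed ones.

#include <cstddef>
#include <vector>

struct RowEntry {
  bool changed_since_last_checkpoint;  // hypothetical per-row change bit
  // ... row header and fixed size columns ...
};

struct FixedSizePage {
  std::vector<RowEntry> rows;          // around 300 row entries per page in DBT2
};

// Baseline scan: work proportional to the total number of rows,
// even when almost nothing has changed.
std::size_t countRowsToCheckpoint(const std::vector<FixedSizePage>& pages)
{
  std::size_t rows_to_write = 0;
  for (const FixedSizePage& page : pages)
    for (const RowEntry& row : page.rows)
      if (row.changed_since_last_checkpoint)
        ++rows_to_write;               // this row would go into the checkpoint
  return rows_to_write;                // two billion rows means two billion checks
}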

When loading 600 GByte of DBT2 data we have more than two billion rows
in the database. Scanning two billion rows takes around 15-20 seconds
while simultaneously handling lots of inserts.

This slowed down the checkpoints and, in addition, used a lot of CPU.
Thus we wanted a more efficient scanning algorithm for this case.

The solution is based on dividing the database into larger segments.
When updating a row, we also update a flag on the larger segment that
the row belongs to. A simple first approach is to implement this at the
page level for our fixed size pages. Every row has an entry in the
fixed size part, which contains the row header and all fixed size
columns that are not defined as using DYNAMIC storage.

In DBT2 this means that most fixed size pages have around 300 row
entries. Thus we can check one page flag, and if no row on the page has
been changed we can skip checking 300 row entries.
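
Here is a minimal sketch of this idea, again with hypothetical C++
types rather than the real NDB code: the update path sets one flag per
page, and the checkpoint scan skips a whole page, and thereby around
300 row entries, with a single check whenever that flag is not set.

#include <cstddef>
#include <vector>

struct RowEntry {
  bool changed_since_last_checkpoint = false;  // hypothetical per-row change bit
  // ... row header and fixed size columns ...
};

struct FixedSizePage {
  bool any_row_changed = false;        // hypothetical page-level change flag
  std::vector<RowEntry> rows;          // around 300 row entries per page
};

// Update path: one extra flag write per row update.
void markRowChanged(FixedSizePage& page, std::size_t row_index)
{
  page.rows[row_index].changed_since_last_checkpoint = true;
  page.any_row_changed = true;
}

// Checkpoint scan: a clean page is skipped with a single check.
std::size_t scanForCheckpoint(std::vector<FixedSizePage>& pages)
{
  std::size_t rows_to_write = 0;
  for (FixedSizePage& page : pages) {
    if (!page.any_row_changed)
      continue;                        // skip roughly 300 row entries at once
    for (RowEntry& row : page.rows) {
      if (row.changed_since_last_checkpoint) {
        ++rows_to_write;               // this row would go into the checkpoint
        row.changed_since_last_checkpoint = false;
      }
    }
    page.any_row_changed = false;      // page is clean until the next update
  }
  return rows_to_write;
}

The extra cost on the update path is a single flag write per row
update, while the saving on the checkpoint path grows with the number
of clean pages.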

When the data size grows to TBytes and we checkpoint every 10-20
seconds, the risk that any row on a given page has been updated is
actually fairly low. With around 300 rows per page, two billion rows
are spread over roughly seven million pages, so unless a large fraction
of the database is updated between two checkpoints, most pages are
skipped with a single check. Thus this simple optimisation brings the
checkpoint slowdown down to a small fraction of a second.

Obviously it is possible to use smaller or larger regions to control
this trade-off if required.

This is an important improvement of the checkpointing in
MySQL Cluster 7.6.7.
