Tuesday, May 01, 2018

Linux configuration for MySQL Cluster

NDB Cluster was designed from the ground up for real-time operations.
It has its origins in the telecom industry where predictability of performance
and latency is absolutely critical. In addition the telecom vendors are
competing very much on pricing of their product.

This leads to that it is important to get the most performance from each
telecom server. Increasing the performance by 10% means that you need 10%
less network equipment. If many servers are shipped this can be a substantial
cost that is worth spending valuable engineering time to achieve.

Thus if you are using or are planning to MySQL Cluster on many servers it
is a good idea to spend some time ensuring that one achieves optimal
performance. If you are operating MySQL Cluster in just a few servers the
payback on your time investment might not be as high.

Another reason to take a deep interest in performance of MySQL Cluster is
of course that you want to learn more about Linux and MySQL Cluster.

In this blog post I will mainly discuss one specific optimisation that can have
large impact on your performance. This is placing the execution threads of
MySQL Cluster and Linux on to the proper CPUs.

In my book MySQL Cluster 7.5 inside and out I devote 3 chapters to this
topic. One chapter on the usage of hyperthreading in data nodes, one
chapter on configuration of CPU locking for data nodes in NDB and
one chapter on configuring Linux interrupt handling.

In this blog post I will mainly discuss the Linux configuration.

NDB is a distributed DBMS where all operations pass through the network,
thus network operations is an essential part of the setup of a MySQL Cluster.

So how does a network packet arrive from the wire into MySQL Cluster and
out again.

The first thing that happens is that the network card receives a packet. It now
needs to inform a CPU about this packet such that Linux TCP/IP code can
handle the packet.

Now the Linux TCP/IP is divided into 3 parts. The first part is handling the
HW interrupt from the network card, the second part is handling the
soft interrupt. The final part is a function call interrupt from the soft interrupt
to the device driver to handle most of the interrupt handling.

The function call interrupt part was introduced in Linux 2.6.35, thus in older
Linux kernels this concept doesn't exist.

Modern Linux kernels also use the NAPI mechanism, this means that the
HW interrupt is disabled during the time we process a set of HW interrupts.
This mechanisms avoids overloading the CPUs with interrupt handling and
takes care of interrupts in batches at high load.

Now Linux interrupt setup is highly configurable and the defaults depends
among other things on the Linux distribution.

There are three main areas that can be configured for Linux. These are
RSS (Receive Side Scaling), RPS (Receive Packet Steering) and
RFS (Receive Flow Steering).

RSS handles setup of the HW interrupt and the soft interrupt. Often the
default is to spread these interrupts on all CPUs. This is usually not a
good idea for MySQL Cluster since some of the CPUs we want to
protect as much as possible from all OS activity (these are in particular
the LDM threads).

RPS is the new mechanism introduced in Linux 2.6.35 and one can configure
the CPUs where the function call interrupts are handled.

Finally RFS tries to ensure that the function call interrupt is executed
on the same CPUs where the application calls recv from. In NDB data nodes
this happens in the recv threads. In a MySQL Server the communication
from the data nodes arrives in the NDB API receive threads (this can be
configured to be locked to a specific CPU). The communication from MySQL
clients arrive in the connection thread where they are executed, so these are
spread on all CPUs the MySQL server is executing on.

When the NDB data node has processed the messages arriving on the network
it will send some responses onto the network. In the NDB data nodes this
happens either in all the threads or it happens in a send thread or a combination
of the two.

Linux has the possibility to configure transmit interrupts as well through
something called XPS (Transmit Packet Steering). The optimal behaviour is
achieved if the transmit interrupts are handled by the same CPU as the send
call is done from.

Configuring Linux and MySQL Cluster together can have a major positive
impact on performance. To achieve the best performance it is important to
consider the flow of data through the servers to process the database requests
from NDB clients (e.g. the MySQL Server).