Tuesday, May 14, 2013

MySQL Cluster 7.3 Improvements - Connection Thread Scalability

As many have noted, we have released another milestone release of MySQL Cluster 7.3. One of the main features of 7.3 is obviously foreign keys. In this post I am going to describe another feature added to MySQL Cluster in the second milestone release, called Connection Thread Scalability.


Almost all software designed for multithreaded use cases in the 1990s has some sort of big kernel mutex; as a matter of fact, this is also true for some hyped new software written in this millennium, and even in this decade. Linux had its big kernel lock, InnoDB had its kernel mutex, and the MySQL server had its LOCK_open mutex. All of these mutexes are characterized by the fact that they protect many things that often have no connection with each other. Most of them have been fixed by now: the Linux big kernel lock is almost gone, LOCK_open has more or less been removed as a bottleneck, and the InnoDB kernel mutex has been split into ten different mutexes.

In MySQL Cluster we have had two types of kernel mutexes. In the data nodes the "kernel mutex" was actually a single-threaded execution model. This model was very efficient but limited scalability. In MySQL Cluster 7.0 we extended the single-threaded data nodes to run up to 8 threads in parallel instead. In MySQL Cluster 7.2 we extended this to support 32 or more threads.

The "kernel mutex" in the NDB API is what we call the transporter mutex. This mutex meant that all communication from a process, through a specific API node, went through a single protected region covering communication with all data nodes. This mutex could in some cases be held for substantial periods of time.

This has meant that there was a limit on how much throughput could be processed through a single API node. It has still been possible to achieve high throughput
by using multiple API nodes per process (configured via the ndb-cluster-connection-pool parameter).
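The connection-pool workaround can be set in my.cnf; a minimal sketch follows, where the hostname and the pool size of 4 are purely illustrative:

```ini
# Hypothetical my.cnf fragment: open four cluster connections
# from this mysqld process instead of one
[mysqld]
ndb-connectstring=mgmhost:1186
ndb-cluster-connection-pool=4
```

Note that each connection in the pool counts as a separate API node, so the cluster configuration must define enough API node slots for the process.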

What we have done in MySQL Cluster 7.3 is fix this bottleneck. We have split the transporter mutex and replaced it with mutexes that protect sending to a specific data node, mutexes that protect receiving from a specific data node, mutexes that protect memory buffers, and mutexes that protect execution on behalf of a specific NDB API connection.
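The idea behind the split can be sketched with standard C++ primitives. This is an illustrative model only, not the actual NDB API internals; the class and member names are invented:

```cpp
#include <mutex>
#include <vector>

// Instead of one "transporter mutex" serializing all communication,
// each data node connection gets its own send and receive mutex, so
// threads talking to different data nodes no longer contend.
struct NodeChannel {
    std::mutex send_mutex;   // protects sending to this data node
    std::mutex recv_mutex;   // protects receiving from this data node
    long bytes_sent = 0;
};

class ApiConnection {
public:
    explicit ApiConnection(int num_data_nodes) : channels_(num_data_nodes) {}

    void send(int node_id, long nbytes) {
        // Lock only the channel for this data node,
        // not the whole API connection.
        std::lock_guard<std::mutex> guard(channels_[node_id].send_mutex);
        channels_[node_id].bytes_sent += nbytes;
    }

    long bytes_sent(int node_id) {
        std::lock_guard<std::mutex> guard(channels_[node_id].send_mutex);
        return channels_[node_id].bytes_sent;
    }

private:
    std::vector<NodeChannel> channels_;
};
```

With the coarse design, a thread sending to data node 1 would block a thread receiving from data node 2; with per-channel mutexes those two operations proceed in parallel.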

This means a significant improvement in throughput per API node. Running the flexAsynch benchmark against just one data node, where one API node handles around 300-400k transactions per second, this improvement increases throughput by around 50%. A Sysbench benchmark with one data node improves by a factor of 3.3x. Finally, a DBT2 benchmark with one data node improves by a factor of 7.5x.

The bottleneck for an API node is that only one thread can process incoming messages for a given connection between an API node and a data node. flexAsynch does a lot of message processing per node connection; it is much smaller in Sysbench and smaller still in DBT2, which is why those benchmarks see a much larger improvement from this new feature.

If we run with multiple data nodes the improvement increases even more since the connections to different data nodes from one API node are now more or less independent of each other.

The feature will improve application performance without any changes to the application: the changes are made entirely inside the NDB API and thus benefit both NDB API applications and MySQL applications.

It is still possible to use multiple API nodes for a given client process, but the need to do so is much smaller and in many cases eliminated.


vishnu rao said...

hi mikael,

does 7.3 work with mysql5.5?

i.e. is FK support there with 5.5 Mysql?


Mikael Ronstrom said...

MySQL Cluster 7.3 is based on MySQL 5.6, and 7.3 does contain FK support. FK support has existed for InnoDB for quite some time, so with MySQL Cluster 7.3, FK support is there for NDB and InnoDB, not for other storage engines. Hope this helps.

noemi said...

Hi Mikael, could you give me any information about benchmark environment ( #CPU, CPU Type, RAM)?

Mikael Ronstrom said...

Most of the benchmarks I run on an x86 server with 8 sockets of 2.0 GHz Xeon processors, with 6 cores and 12 CPU threads per socket. There is 512 GB of RAM in the machine.

Unknown said...

Hello Mikael,

Are the additions of the foreign keys in any way what helped increase the scalability with regards to the transactions per minute? I am trying to figure out if we would need to add in the foreign keys to fully realize the increase in transactions, or if all of the increase in transactions from 7.2 to 7.3 is attributed to the mutex changes.

Thank you

Mikael Ronstrom said...

Hi Jeff,
The benchmarks have no foreign keys involved, so the reported differences are completely due to mutex improvements.

I am sure that foreign keys can affect performance as well, but this was not investigated in this benchmark.