Mikael Ronstrom: October 2011

Thursday, October 27, 2011

MySQL Thread Pool: Summary

A number of blogs have been written with the intent of describing how
the thread pool manages to solve the requirements of the thread pool.

These blogs are:
MySQL Thread Pool: Problem Definition
MySQL Thread Pool: Scalability Solution
MySQL Thread Pool: Limiting number of concurrent statement executions
Automated benchmark tool for DBT2, Sysbench and flexAsynch
MySQL Thread Pool: Limiting number of concurrent transactions
MySQL Thread Pool: When to use?
MySQL Thread Pool vs. Connection Pool
MySQL Thread Pool: Optimal configuration
MySQL Thread Pool: Benchmarking

There are some interesting discussions in the comments on the scalability solution blog
and on the blog about limiting number of concurrent statement executions
and finally also on the blog about when to use.

These discussions are around when to use it, what other features might be worth
considering and some remarks on the type of benchmarks that could be used to
evaluate solutions.

The requirements we had on the thread pool solution and the solutions were:
1) Split threads into groups individually handled to avoid making the
solution a problem in itself, aim is to manage one active thread per
group.

Solution:
Connections are put into a thread group at connect time by round robin.
Configurable number of thread groups. This ensures that the thread pool
itself isn't a scalability hog.

2) Wait for execution of a query until the MySQL Server has sufficient
CPU and memory resources to execute it.

Solution:
Each thread group tries to keep the number of executing queries to one or
zero. If a query is already executing in the thread group, put connection
in wait queue.

3) Prioritize queries on connections that have an ongoing transaction.

Solution:
Put waiting connections in high priority queue when a transaction is
already started on the connection.

4) Avoid deadlocks when queries are stalled or execute for a long time.

Solution:
Allow another query to execute when the executing query in the thread
group is declared as stalled (after a configurable time).

Wednesday, October 26, 2011

MySQL Thread Pool: Benchmarking

We have executed a number of benchmarks using the thread pool to
see how it operates in various workloads. A thorough study on this
can be found in Dimitri's blog here.

Optimal number of active connections is the number of active connections
needed to achieve the best throughput for the MySQL Server. For an InnoDB
workload this is usually around 32-128 active connections.

From all our benchmarks we've seen that the performance of the thread pool
when operated with less than the optimal number of active connections is
about 1-3% slower than without thread pool since the behaviour is the same
and the thread pool adds a little bit more overhead. More or less all of
this overhead is to handle KILL query correctly.

When operated in the region of the optimal number of active connections
the performance is very similar. We have seen though that the thread pool
benefits very much from locking the MySQL Server to a number of CPUs
equal to the setting of the thread_pool_size configuration parameter.
When not locked to CPUs the performance is similar, when locked to CPUs
the thread pool gives 10-15% higher performance when using the optimal
number of active connections. The MySQL Server operated without thread
pool and locked to CPUs have no significant change of throughput compared
to not locking to CPUs.

When operating above optimal number of connections the thread pool
provides a great benefit, we've seen numbers all the way up to 100x
better performance when operating with a few thousand concurrently
active connections.

MySQL Thread Pool: Information Schema Tables

The thread pool have three information schema tables. These are TP_THREAD_STATE,
TP_THREAD_GROUP_STATE and TP_THREAD_GROUP_STATS.

The TP_THREAD_STATE table contains one row per thread that is currently
maintained by the thread pool. This row contains only interesting information
if the thread is actively executing a statement. In this case it contains information
how many 10 milliseconds slots the query has consumed, if the thread is blocked by
some event, the event is listed. Both of those information items are current state
and will change for each new query.

The TP_THREAD_GROUP_STATE table contains one row per thread group. It contains
information about number of threads of various types. The first type is consumer
threads, this is a thread not used for the moment, at most 1 such thread will
exist at any point in time. This is the next thread to use if the current threads
used are not enough and a new thread is needed.

The second type of threads are reserved threads, these are also threads not currently
used. They will be used when there is no consumer thread and a new thread needs to be
started.

It contains information about the current number of connections handled in this thread
group. It contains current information about the number of queued low priority
statements (QUEUED_QUERIES) and queued high priority statements (QUEUED_TRANS).

It contains information about configuration, thus state of stall limit, priority
kickup timer, algorithm used. Also information about current number of threads in
the thread group, current number of threads actively executing a statement in the
thread group and current number of stalled statement executions.

Finally it contains some useful information about thread number of a possible
waiter thread (the thread that listens to incoming statements), information about
the oldest query that is still waiting to be executed.

The last table is the TP_THREAD_GROUP_STATS that contains statistics about the
thread group.

There are statistics about number of connections, number of connections closed,
number of queries executed, number of queries stalled, number of queries queued,
number of queries that was kicked up in priority from low priority to high priority.

There is also statistics on threads, how many threads have been started, how many
threads have become consumer threads, become reserve threads, become waiter threads.
How many times the thread that checks for stalled threads decided to start a thread
to handle the possibility of executing a query.

Finally there is statistics about each blocking event coming from the MySQL Server
(meta data locks, row locks, file IO, sleeps and so forth).

One of the most important information here is the number of stalled queries
(STALLED_QUERIES_EXECUTED in TP_THREAD_GROUP_STATS), this counter
gives a good idea if we have many stalled queries, if there are too many such
queries, it is a good indication that one should consider increasing the
thread_pool_stall_limit.

Another very important information is the number of priority kickups
(PRIO_KICKUPS in TP_THREAD_GROUP_STATS). If this counter
grows too quick it is an indication that the thread_pool_prio_kickup_timer
might need to be higher.

It might at times be important to check the number of threads started
(THREADS_STARTED in TP_THREAD_GROUP_STATS) as well.
If the threads are started too often, it's a good indicator that we should
not be so aggressive in stopping threads and thus set thread_pool_max_unused_threads
a bit higher.

The current oldest waiting query might also be a good idea to track to ensure that
we don't get longer waits than what is acceptable. If we get too long waits here,
one can either change some configuration variable, but it might also be an indicator
that the MySQL Server is constantly overloaded and that some action should be done
to remedy this.

Tuesday, October 25, 2011

MySQL Thread Pool: Optimal configuration

The thread pool plugin has a number of configuration parameters that will affect
its performance. These are documented in the MySQL manual here.

To configure the thread pool for optimal operation the most important parameter is
the --thread_pool_size. This parameter specifies the number of thread groups that
the thread pool will create.

The default value of thread_pool_size=16 is very often a good starting point. We have
seen that for InnoDB Read Workloads it is sometimes possible to achieve even better
results when it is set to around 30-40. For write intensive InnoDB workloads the
optimum can be in the range 12-30. MyISAM workloads usually have an optimum a bit
lower in the range of 6-8. The default value of 16 will work well also for most
MyISAM workloads.

The next parameter to consider for optimum operation is --thread_pool_stall_limit.
This is set to 6 (=60ms) by default. This number is set very low for good operation
in most cases. In most cases with workloads that don't have very many long queries
it is ok to set this much higher. Setting it to 100 (=1 second) should be ok in most
cases.

In the information schema one can see how many queries are stalled, if there are too
many queries stalled, then it is a good idea to increase this parameter since stalled
queries lead to increased context switching activity and more threads to manage for the
operating system.

The next parameter --thread_pool_prio_kickup_timer is set rather high to 1000
(=1 second). This setting should be ok for most cases, in extremely loaded environments
where thousands of connections want to execute at the same time it's necessary to
increase this variable to ensure that queries aren't moved too early. At the same time
setting it too high means that long-running transactions can block out short transactions
too much. But settings up to 10000 (=10 seconds) should in most cases be ok.

There is a parameter which isn't supported --thread_pool_algorithm. This parameter
makes it possible to use a bit more aggressive scheduling algorithm in the thread pool.
In most cases this has no benefits other than in some cases achieving better results.
It has been left accessible if someone wants to experiment with it and give us feedback
about it.

The last parameter is --thread_pool_max_unused_threads. This parameter specifies the
maximum amount of unused threads we will keep per thread group. It's possible to have
quite a few unused threads and to ensure that we give back memory to the operating
system one can use this parameter. By default it's 0 which means that threads are never
released and kept around for future use. Setting to a nonzero value means that the server
will use less memory but can also attribute to a higher CPU overhead to create new
threads again later on.

Monday, October 24, 2011

MySQL Thread Pool vs. Connection Pool

Given that thread and connections in the MySQL Server
have been so intertwined, it is easy to confuse the
purpose of the MySQL Thread Pool and the purpose of
a Connection Pool.

The aim of a Connection Pool is that the MySQL
clients should not be forced to constantly do connect and
disconnect. Thus it is possible to cache a connection in
the MySQL client when a user of the connection no longer
needs it. Thus another user that needs a connection to the
same MySQL Server can reuse this cached connection later on.

This saves execution time in both the client and the server.
It does however not change the dynamics of how many queries
are executed in parallel in the MySQL Server. This means that
the likelihood of too many concurrent queries to execute in
the MySQL Server is the same with or without a Connection
Pool.

Also a Connection Pool operates on the client side. This
means that it doesn't see the state of the MySQL Server when
deciding whether to send a query to the MySQL Server or not. Thus
it doesn't have the required information to decide whether to
queue a query or not. Only the MySQL Server have this information
and thus the MySQL Thread Pool has to operate in the MySQL Server.
It cannot perform its task on the client side.

Thus it is easy to see that the MySQL Thread Pool and a
Connection Pool are orthogonal and can be used independent of
each other.

MySQL Thread Pool: When to use?

The most important variable to monitor is threads_running. This
variable keeps track of the number of concurrent statements
currently executing in the MySQL Server.

If this variable has spikes that put it in a region where the
server won't operate optimally (usually going beyond 40 for
InnoDB workloads) and most particular if it goes well beyond
this into the hundreds or even thousands of concurrent
statements then the thread pool will be something beneficial
to protect the MySQL Server from troubles in overload
situations.

Another indicator that you will benefit from use of the thread
pool is when you already now use the --innodb-thread-concurrency
variable. This variable tries to solve a similar problem and the
thread pool solves it at a better place even before query execution
has started and also provides additional benefits.

Also if your workload is mainly short queries then the thread
pool will be beneficial, long queries isn't bad for the thread
pool but will decrease its positive impact.

Friday, October 21, 2011

MySQL Thread Pool: Limiting number of concurrent transactions

There are hot spots in the MySQL Server that become hotter when many
transactions are handled concurrently. This means that it is imperative
to avoid having too many concurrent transactions executing in parallel.

The thread pool handles this by prioritizing queued queries according
to whether they have already started executing a transaction or not.
It is also possible for the user to decide that a connection will be of
high priority independent of whether a transaction is started or not.

Such a prioritization can have issues with livelock if there are transactions
that are very long. To avoid this problem a query will be moved to the high
priority queue after a configurable time have expired. This time is set in the
configuration parameter --thread_pool_prio_kickup_timer
(number of milliseconds before a query is kicked up).

However to avoid too many movements in a short time, the thread pool will
at most move one query per 10 milliseconds per thread group.

It is possible for the user to define his connection as always being of
high priority to ensure queries from that connection always moves faster
through the server.

Automated benchmark tool for DBT2, Sysbench and flexAsynch

A new benchmark tool is available here to enable automated
benchmark runs of DBT2, Sysbench and flexAsynch using MySQL
and MySQL Cluster.

The benchmark tool is based on dbt2-0.37 for DBT2, sysbench-0.4.12
for sysbench benchmarks and a flexAsynch program that is available in
MySQL Cluster source releases (the version needed for the automated
flexAsynch tests requires an updated version of flexAsynch.cpp which
hasn't been included in a MySQL Cluster release yet, a blog post
notifying when it arrives will be written).

The automation scripts are part of the dbt2-0.37.50.tar.gz package.
This package is needed to run all benchmarks. In addition a gzipped
source or binary tarball of MySQL or MySQL Cluster is required to
run the benchmarks. Finally to run sysbench benchmarks one also needs
to download the sysbench-0.4.12.5 tarball.

So assuming you have downloaded all those tarballs, how does one
run a sysbench benchmark on your local machine?

The first step is to create a benchmark directory, I usually use
$HOME/bench or /data1/bench. In this directory create a directory
tarballs. Place all three tarballs in this directory. Go into this
directory and unpack the dbt2 tarball through tar xfz dbt2-0.37.50.tar.gz.
Then copy the benchmark start script into the $HOME/bench directory
through the command:
cp $HOME/bench/tarballs/dbt2-0.37.50/scripts/bench_prepare.sh $HOME/bench/.

Then copy the example configuration file in the same manner using the
command:
cp $HOME/bench/dbt2-0.37.50/examples/autobench.conf $HOME/bench/.

Edit the autobench.conf to be conformant to your file system environment.
The example configuration file assumes the use of /data1/bench as the
directory to use.

Now it is time to prepare to run a benchmark, create a directory under
$HOME/bench for the test run. So for example if you want to call it
test_sysbench then run the command:
mkdir $HOME/bench/test_sysbench

Next step is to copy the autobench.conf file into this directory and
edit it.
cd $HOME/bench
cp autobench.conf test_sysbench/.

Now there are two way to go about editing this configuration file. If
you want to go fast and unsafe, then go ahead and edit the file
directly, there is a fair amount of explanations of the various
parameters in this file. If you want more help, then read the
dbt2-0.37.50/README-AUTOMATED to get more directions about how to
set-up the configuration file properly.

Now everything is ready to run the benchmark, this is done through
the commands:
cd $HOME/bench
./bench_prepare.sh --default-directory $HOME/bench/test_sysbench

If you want to follow the development of the benchmark in real-time
you can do this by issuing tail -f on the proper file. For sysbench
RO benchmarks there will be a file called
$HOME/bench/test_sysbench/sysbench_results/oltp_complex_ro_1.res
for the first test run (you can tell the benchmark to do several
runs). Do tail -f on this while the benchmark is running and you'll
get printouts from the sysbench program written on your console.
Among other things you'll get a string like this:
Intermediate results: 128 threads, 3564 tps
if the current running test uses 128 concurrent connections. By
default the intermediate results are reported every 3 seconds.

The final result will be reported in the file
$HOME/bench/test_sysbench/final_result.txt

An additional note is that if you want to test the flexAsynch
tests, then it is necessary to use a source tarball of the
MySQL Cluster 7.x series. This is simply because the flexAsynch
program isn't distributed in binary tarballs.

The benchmark script will take care of the build process for all
source tarballs, all important parameters you need to handle
is part of the autobench.conf script. You will however need to
install the proper compilers and build tools to enable builds of
MySQL, sysbench and DBT2 programs.

If you want to benchmark a MySQL Server using the thread pool, then
it is necessary to download a MySQL Enterprise Edition of the MySQL
Server. If you already have a commercial license with Oracle, then
simply use this to download the MySQL binary tarball through the
edelivery.oracle.com. If you don't have a commercial license, you
can use the Oracle Software Delivery Cloud Trial License Agreement
which gives you a 30-day trial license. So to get the binary tarball
go to edelivery.oracle.com, register if necessary, log in, answer
all required license agreements.

Next step is to select MySQL Database as the Product Pack and as
Platform select Linux x86-64. Finally download the TAR file for
generic Linux2.6 x86-64 platforms. When this download is completed
then unzip the file and you'll get the gzipped tarball you need
to run the thread pool benchmark.
Linux x86_64.

The sysbench contains a few extra features compared to the
sysbench-0.4.12 version. It contains support for intermediate
result reporting, support for multiple tables in the sysbench
benchmark, support for partitioned tables, support for using
secondary indexes, support for using HANDLER statements instead
of SELECT statements, and also support for running sysbench at
fixed transaction rates with a certain jitter.

DBT2 can in addition to running with a single MySQL Server also
run with multiple MySQL Servers when used with MySQL Cluster.
It contains a few new features here to control partitioning,
possibility to place the ITEM table in each MySQL Server and
so forth.

All scripts and many programs have updated parameters and all
scripts have extensive help outputs to make it easy to understand
what they can do.

It is fairly easy to extend the benchmark scripts. As an example
if you need to change a parameter which isn't included in
autobench.conf then first add it in bench_prepare.sh and then
add code to handle it in start_ndb.sh. Also update the
autobench.conf example file if you want to keep the feature
for a longer time. If you want to suggest changes to the
scripts please report it in My Oracle Support (support.oracle.com)
or in bugs.mysql.com and assign it to Mikael Ronstrom.

MySQL Thread Pool: Limiting number of concurrent statement executions

The main task of the thread pool is to limit the number of
concurrent statement executions. The thread pool achieves
this by trying to always operate a thread group such that
only one or zero queries are concurrently executed per
thread group.

There is however a livelock issue to consider. A long-running
query in a thread group will in this manner block out all
other queries in this thread group until the query is completed.

To resolve this issue there will be a configurable timer that
decides when a statement execution is declared as stalled. When
a query is declared as stalled, it is allowed to continue
executing until completed. The thread group will handle the
connection as stalled and not count it as an active connection.
Thus new queries can be executed in the thread group again when a
query have been declared as stalled.

Another issue is when a statement execution is blocked for some
reason. Queries can be blocked e.g. by Row Locks, File IO, Table
Locks, Global Read Locks and so forth. If it is likely that the
blockage will continue for at least a millisecond or so, then it
makes sense to start up another statement execution in the thread
group to ensure that we continue to keep the number of concurrent
active connections at the right level.

To enable this the MySQL Server will make callbacks to the thread
pool stating when a block begins and when it ends. The thread
pool will use this to keep track of number of active statement
executions and this is used to decide when to start a new query
and when to allow an incoming query to start.

It is important that the wait is sufficiently long since it is
necessary to immediately continue executing the query when the
blockage ends.

MySQL Thread Pool: Scalability solution

When implementing a thread pool or any other means of limiting concurrency in the MySQL Server, careful thought is required about how to divide the problem to ensure that we don't create any unnecessary new hot spots. It is very easy to make a design that manages all connections and threads in one pool. This design does however very quickly run into scalability issues due to the need to lock the common data structures every time a connection or thread needs to change its state.

To avoid this issue we decided to implement the thread pool using a set of thread groups. Each of those thread groups are independent of the other thread groups. Each thread group manages a set of connections and threads. It also handles a set of queues and other data structures required to implement the thread group operations. Each thread group will contain a minimum of one thread, connections will be bound to a thread group at connect time using a simple round robin assignment. The thread pool aims to ensure that each thread group either has zero or one thread actively executing a statement. This means that the interactions between threads within one thread group is extremely limited. Also the interactions won't grow as the MySQL Server gets more statements to process. Thus it is very hard to see this model become a scalability issue in itself.

So we solved the scalability problem using a Divide-and-Conquer technique.

Tuesday, October 18, 2011

MySQL Thread Pool: Problem Definition

A new thread pool plugin is now a part of the MySQL Enterprise Edition.
In this blog we will cover the problem that the thread pool is solving
and some high-level description of how it solves this problem.

In the traditional MySQL server model there is a one-to-one mapping between
thread and connection. Even the MySQL server has lots of code where thread
or some abbreviation of thread is actually representing a connection.
Obviously this mapping has served MySQL very well over the years, but there
are some cases where this model don't work so well.

One such case is where there are much more connections executing queries
simultaneously compared to the number of CPUs available in the server. The
MySQL Server also have scalability bottlenecks where performance suffers
when too many connections execute in parallel.

So effectively there are two reasons that can make performance suffer in
the original MySQL Server model.

The first is that many connections executing in parallel means that the
amount of data that the CPUs work on increases. This will decrease the
CPU cache hit rates. Lowering the CPU cache hit rate can have a significant
negative impact on server performance. Actually in some cases the amount
of memory allocated by the connections executing in parallel could at times
even supersede the memory available in the server. In this case we enter a
state called swapping which is very detrimental to performance.

The second problem is that the number of parallel queries and transactions
can have a negative impact on the throughput through the "critical sections"
of the MySQL Server (critical section is where mutexes are applied to
ensure only one CPU changes a certain data structure at a time, when such
a critical section becomes a scalability problem we call it a hot spot).
Statements that writes are more affected since they use more critical
sections.

Neither of those problems can be solved in the operating system scheduler.
However there are some operating systems that have attempted solving this
problem for generic applications on a higher level in the operating system.

Both of those problems have the impact that performance suffers more and
more as the number of statements executed in parallel increases.

In addition there are hot spots where the mutex is held for a longer time
when many concurrent statements and/or transactions are executed in
parallel. One such example is the transaction list in InnoDB where each
transaction is listed in a linked list. Thus when the number of concurrent
transactions increases the time to scan the list increases and the time
holding the lock increases and thus the hot spot becomes even hotter
as the concurrency increases.

Current solutions to these issues exist in InnoDB through use of the
configuration parameter --innodb-thread-concurrency. When this parameter
is set to a nonzero value, this indicates how many threads are
able to run through InnoDB code concurrently. This solution have its
use cases where it works well. It does however have the drawback that
the solution itself contains a hot spot that limits the MySQL server
scalability. It does also not contain any solution to limiting the
number of concurrent transactions.

In a previous alpha version of the MySQL Server (MySQL 6.0) a thread
pool was developed. This thread pool solved the problem with limiting
the number of concurrent threads executing. It did nothing to solve
the problem with limiting the number of concurrent transactions.
It was also a scalability bottleneck in itself. Finally it didn't
solve all issues regarding long queries and blocked queries.
This made it possible for the MySQL Server to become completely
blocked.

When developing the thread pool extension now available in the MySQL
Enterprise Edition we decided to start from a clean plate with the
following requirements:

1) Limit the number of concurrently executing statements to ensure
that each statement execution has sufficient CPU and memory resources
to fulfill its task.

2) Split threads and connection into thread groups that are
independently managed. This is to ensure that the thread pool
plugin itself doesn't become a scalability bottleneck. The
aim is that each thread group has one or zero active threads
at any point in time.

3) Limit the number of concurrently executing transactions
through prioritizing queued connections dependent on if
they have started a transaction or not.

4) Avoid deadlocks when a statement execution becomes long or
when the statement is blocked for some reason for an extended
time.

If you are interested in knowing more details of how the new
thread pool solves these requirements there will be a
webinar on Thursday 20 Oct 2011 at 9.00 PDT. Check here
for details on how to access it.

If you want to try out the thread pool go here.

Mikael Ronstrom