Wednesday, June 03, 2009

Some ideas on InnoDB kernel_mutex

I've noted that one reason InnoDB can run into difficulties
when there are many concurrent transactions in the MySQL Server
is that the time the kernel_mutex is held often increases
linearly with the number of active transactions. One such
example is in trx_assign_read_view, where each transaction
that does a consistent read creates a copy of the transaction
list to be able to deduce the read view of the transaction or
statement.

This means that every active transaction is copied to the local
transaction list while the critical kernel_mutex is held.
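
To make the cost concrete, here is a minimal sketch of the pattern in
Python. It is not the InnoDB code: the TrxSys class, its fields and its
method names are made up for illustration, and a plain threading.Lock
stands in for the kernel_mutex. The point is only that the copy is
O(n) work done inside the critical section.

```python
import threading


class TrxSys:
    """Toy stand-in for the transaction system (hypothetical names)."""

    def __init__(self):
        self.kernel_mutex = threading.Lock()  # stand-in for kernel_mutex
        self.active_trx_ids = []              # stand-in for the trx list
        self.next_id = 1

    def begin(self):
        """Register a new active transaction and return its id."""
        with self.kernel_mutex:
            trx_id = self.next_id
            self.next_id += 1
            self.active_trx_ids.append(trx_id)
            return trx_id

    def commit(self, trx_id):
        """Remove a transaction from the active list."""
        with self.kernel_mutex:
            self.active_trx_ids.remove(trx_id)

    def assign_read_view(self):
        """Snapshot the active transactions for a consistent read.

        The whole copy happens while the mutex is held, so the hold
        time grows linearly with the number of active transactions.
        """
        with self.kernel_mutex:
            return tuple(self.active_trx_ids)
```

With 16 active transactions each read view copies 16 entries under the
mutex; with 256 it copies 256, and every other thread that needs the
mutex waits for that copy to finish.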

Another such case is that most operations will set some kind of
intention lock on the table. The lock code walks through all
locks on the table to check that they are compatible with the
requested lock, and the first time it even does so twice. Thus
if all threads use the same table (as they do in e.g. sysbench),
the number of locks on the table will be more or less equal to
the number of active transactions.
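
A rough sketch of such a walk, again in Python and again with made-up
names (find_conflict and the list-of-tuples representation are only
for illustration, not InnoDB's data structures). The compatibility
table is the usual one for IS, IX, S and X table locks. The sketch
does a single pass and does not model the double walk mentioned above.

```python
# For each requested mode, the set of held modes it can coexist with.
COMPATIBLE = {
    "IS": {"IS", "IX", "S"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "X": set(),
}


def find_conflict(table_locks, trx_id, mode):
    """Walk the locks on one table looking for a conflicting lock.

    table_locks is a list of (owner_trx_id, held_mode) pairs.
    Returns (conflicting_lock_or_None, number_of_locks_visited).
    """
    visited = 0
    for owner, held_mode in table_locks:
        visited += 1
        # Locks held by the requesting transaction never conflict.
        if owner != trx_id and held_mode not in COMPATIBLE[mode]:
            return (owner, held_mode), visited
    return None, visited
```

In the common case nothing conflicts (everybody holds IX and asks for
IX), so the walk visits every lock on the table: 256 transactions on
one table means 256 list entries checked per request.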

Thus, as an example, when running with 256 threads compared to 16
threads, the kernel_mutex will be held for 16 times longer,
and possibly even more, since with more contention the mutex is
also needed for a longer time to start up waiting transactions.

So this is an obvious problem; what then is the solution?
It is not extremely easy, but one thing one can do is to make the
kernel_mutex into a read-write lock instead of a mutex. Then
many threads can traverse those lists in parallel. It will
still block others needing write access to the kernel_mutex,
but it should hopefully improve things.
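
As an illustration of the idea only, here is a minimal read-write lock
in Python built on threading.Condition. The RWLock class and its
method names are my own for this sketch and say nothing about how
InnoDB would implement it. Note that this simple version prefers
readers, so a steady stream of readers can starve a writer; a real
implementation would have to deal with that.

```python
import threading


class RWLock:
    """Minimal reader-preferring read-write lock (illustrative only)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # number of threads holding a read lock
        self._writer = False   # True while a thread holds the write lock

    def acquire_read(self):
        with self._cond:
            # Readers only wait for an active writer, not for each other.
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            # A writer needs the lock entirely to itself.
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()
```

Threads that only traverse the transaction list or the table lock list
would take the read side and proceed in parallel; threads that add or
remove entries would take the write side and still serialize.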

Another solution that should also mitigate the problem is
to use thread pools. Thread pools ensure that not as many
threads are active at a time. However, we still have the problem
that there can be as many transactions active in parallel as
there are connections (although InnoDB has a limit of 1024
concurrent active transactions). So the thread pool needs
to prioritize connections with active transactions in cases
where there are too many threads active at a time.

This type of load regulation is often used in telecom systems,
where it is more important to give priority to those that have
already invested time in running the activity. Newcomers
are let in when there are empty slots not taken by
already running activities.
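
A small sketch of this kind of admission control, in Python with
hypothetical names (AdmissionQueue, submit, admit, finish), assuming
the pool can tell whether a connection already has an active
transaction. Waiting work from connections inside a transaction is
admitted first; within the same priority it is first come, first
served.

```python
import heapq
import itertools


class AdmissionQueue:
    """Admit waiting connections, in-transaction ones first (sketch)."""

    def __init__(self, max_running):
        self.max_running = max_running  # threads allowed to run at once
        self.running = 0
        self._heap = []
        self._seq = itertools.count()   # tie-breaker keeps FIFO order

    def submit(self, conn_id, in_transaction):
        """Queue a connection that has work to do."""
        priority = 0 if in_transaction else 1  # lower value runs first
        heapq.heappush(self._heap, (priority, next(self._seq), conn_id))

    def admit(self):
        """Fill free slots and return the admitted connection ids."""
        admitted = []
        while self._heap and self.running < self.max_running:
            _, _, conn_id = heapq.heappop(self._heap)
            self.running += 1
            admitted.append(conn_id)
        return admitted

    def finish(self):
        """Called when an admitted piece of work completes."""
        self.running -= 1
```

With two slots and four waiting connections, of which two are already
inside a transaction, those two are admitted first and the newcomers
get a slot only as the running work finishes.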

1 comment:

midom said...

Mikael,

The 1024 transaction limit has its origins in undo slots, so you can have many more transactions (repeatable read snapshots), as long as they don't grab undo slots.

In our app, every connection establishes a transaction, so indeed, increasing costs are costly.