Tuesday, June 05, 2012

LOCK_open finally removed as a bottleneck in MySQL 5.6

From the very first day at MySQL, the LOCK_open mutex have been an infamous bottleneck. The road to removing it as a bottleneck has been a long one. Already in the MySQL 5.4 beta release we had some first improvements of the LOCK_open. Other preparations were done in the solution of the bug #901 with the introduction of the MDL locking (metadata locking). Finally some preparations was done in early MySQL 5.6 DMR's where LOCK_open was removed as a mutex for a lot of activities where it wasn't really needed in opening and closing of files.

During an internal meeting in Trondheim at the end of 2010 I sat down with Dmitry Lenev and we came up with a solution where the LOCK_open is removed as a bottleneck. So finally this year we finalised this development and made it available as part of the MySQL 5.6 june 2012 labs release.

LOCK_open was used to protect a number of data structures related to the TABLE_SHARE object that contains a number of attributes about the table. It was also used to protect the cache of TABLE objects and a number of data structures related to it.

So what to do in order to fix this bottleneck. The first analysis is that most queries only need read access to the TABLE_SHARE object and they need one instance of a TABLE object. So if we have a set of TABLE objects then this can be used in parallel and their is no reason to have global data structures protecting the caches of the TABLE instances.

So the first step to fixing LOCK_open is to introduce a new set of TABLE cache instances. Each such instance will contain a list of cached TABLE objects. Each such TABLE cache instance also have a reference to the TABLE_SHARE object. So this means that we can find both a TABLE_SHARE object and a TABLE object by using one of those TABLE cache instances.

So through these set of TABLE cache instances it isn't necessary to use the LOCK_open if there are free TABLE objects in the TABLE cache instance used. LOCK_open is still needed to create new TABLE and TABLE_SHARE objects, but these objects can be reused any number of times once it's created. When performing DDL operations it's necessary to lock both LOCK_open and all TABLE cache instances.

So this means that we have replaced the LOCK_open with a set of mutexes protecting one TABLE cache instance. The number of TABLE cache instances are configurable, we used 16 instances in the published benchmarks. We also added a number of statistics counters to how many accesses of TABLE cache hits and misses we have in a MySQL Server.

We have seen at least 70% improvement based on just adding this new feature and in some cases the gain can probably be even bigger. By using 2 tables in Sysbench we can grow performance even further which would not make any difference when the LOCK_open was still a bottleneck and with this we can get the performance impact of removing LOCK_open to be even larger than 100%.

3 comments:

Anonymous said...

Nice!

Can you detail a bit more
- which operations are typically affected by this lock
- which benchmark got a 70% improvement?

Anonymous said...

Anonymous,

This is a great improvement . one of the worst case is dropping a huge partition or a big table on a slow file system. A less seen isl scaleup issue when 100K queries all asking for a file descriptor in the various table caches.

Thanks Michael and your team for taking that topic to a source code implementation.

Mikael Ronstrom said...

The 70% speedup was a standard Sysbench RO and RW benchmark using 2 tables.

Given that each query in MySQL 5.5 grabs the LOCK_open once on start of query and once on end of query (actually once per table in the query). The operations that are affected are mainly queries that are short and where the server is hit by very many small queries. When the number of queries goes beyond 100k per second the lock becomes hotter and hotter. In previous generations of MySQL one could also get the problem very easily by setting the table cache to be too small.