Thursday, October 04, 2012

My personal list of new features in MySQL 5.6

This blog post simply lists and describes a few of the most interesting new features in MySQL 5.6. The list is simply my preferences based on my development interest and what I think is very interesting feaures in general.

In MySQL we develop new features using worklogs, the development is divided into milestones and the MySQL 5.6 development was divided into 6 different milestones. In total there was around 200 worklogs developed in MySQL 5.6 which is a quite impressive set of new features. Naturally there is also a ton of bug fixes in MySQL 5.6 as well but I won't go into those here.

In this blog I will only mention 30-40 of those worklogs, but even this is a fairly large number of interesting features.

I will discuss features in areas of interest.

Partitioning

Let's start with partitioning. We developed partitioning originally as a 5.1 feature. Now development is mostly handled by Mattias Jonsson and we meet regularly to discuss new bugs and features in partitioning. MySQL 5.6 saw a couple of significant new partitioning features. With MySQL 5.6 it is now possible to import and export partitions to and from tables as a very quick operation (except when you import and want to check rows for correctness as part of the import). This means you can insert your data into a single table and then decide to move the table into a partitioned table after a while. This feature can be used for a variety of things.

We have also worked hard on improving performance with many partitions. We have fixed a number of performance issues in the algorithms, we have improved the locking part such that we only lock those partitions that are actually used in the query. This improvement required some reengineering effort in the MySQL Server which is beneficial to many other parts other than partitioning. Another thing we added is that it is now possible to specify in queries which partitions one want to use, thus partitions can in a sense be treated as tables in queries if need be.

Optimizer

The next area is the optimizer, many new things have happened here and I am not the main expert in this area, but even from my view the improvements in subquery handling, index condition pushdown, persistence of optimizer stats (both in server and in InnoDB), new algorithms for multi-range reads focused on disk sweeps and an improved sort function is very interesting and many of them gives extreme performance improvement for certain queries. A feature I also think is very important is the optimizer trace, this feature came up a while ago in a reengineering project and it's nice to see it completed now. It gives anyone the ability to understand exactly how the optimizer makes its decisions and thus also making it possible for both users and developers to understand how complex queries can be improved.

The new ability to use EXPLAIN also on UPDATE/DELETEs is obviously a very important addition to users of MySQL.

Replication

The next area is replication, it contains a large amount of functional features that will make it easier to use. There is a number of improvements that I consider very interesting, the first is the improved scalability of binlog writing which makes the scalability balanced such that we're scalable both for replicated and non-replicated scenarios. The introduction of GTIDs is also a very important feature that will make life much easier for developers of various sharding and HA solutions based on MySQL replication. Finally also of course parallel slave applier is another important addition to the scalability of MySQL replication. Many new MySQL 5.6 features makes it easier to use MySQL as a component in larger systems.

MySQL Server internals

In the MySQL Server I also of course take an interest in the split of LOCK_open and the work on metadata locks. These changes are very significant for MySQL Server scalability. There are also many new features in the performance schema area making it possible to monitor MySQL in numerous new ways not possible in the past. A major effort was also made in the area of on-line alter table changes. Now there is a API in the MySQL Server to make any change on-line if the storage engine supports it and InnoDB specifically supports on-line add index.

InnoDB

Given my background in developing MySQL Cluster, and my general interest in everything around performance, it's not surprising that most everything that happens in the InnoDB storage engine is also things I consider interesting.

InnoDB has developed extensively in MySQL 5.6 both in terms of scalability and functionality.

In the area of scalability InnoDB has seen improvements of:
- RW-locks for buffer pool hash
- Split of kernel mutex
- Multi-threaded purge
- Reduced contention on log mutex
- Increased max size of REDO log files
- Improvements to InnoDB thread concurrency feature
- Read-only transaction optimisations

Furthermore InnoDB also have new major features such as fulltext search, preloading of buffer pool, separate tablespaces for UNDO log, flexible page sizes, memcache support for InnoDB and transportable tablespaces.

Finally InnoDB has improved stability of performance through improved flushing and it has a new low-level monitoring (InnoDB metrics).

As a final thing we have also updated default configurations to make it even easier to manage MySQL Servers.

So the MySQL 5.6 is a release that will be very useful for both new and old MySQL users.

3 comments:

hingo said...

Hi Mikael

The other day I made a claim on a Facebook thread that 5.6 contains more new features than 5.0-5.5 combined. Of course, it's a completely baseless claim on my part. But since you are doing statistics, how many worklogs per release do you think 5.0, 5.1 and 5.5 had?

Mikael Ronstrom said...

MySQL 5.5 I might have some stats available on, but for 5.0 and 5.1 I think it would be a bit hard to find, possibly you can find this information through the Bazaar history since I think most worklog pushes has a reference to the worklog.

That other self said...

Hi Mikael. I have a question regarding partitioning locking which you touched on "we have improved the locking part such that we only lock those partitions that are actually used in the query." I was hoping that this meant we would see partition level optimize in 5.6, but I see in the latest changelogs for 5.6.9 the following. "a storage engine that does not support partition-level OPTIMIZE, such as InnoDB", which suggests the limitation is still in place. Could you clarify? Will we ever see partition-level optimize for innodb?

Cheers

//Steve Jackson