Tuesday, November 10, 2009

Improvement of MySQL partitioning included in MySQL's next milestone release tree

It was quite some time since I last blogged. It's not due to
inactivity. For those of you that have followed my blog might
have seen earlier blog posts about a new partitioning feature.

This new partitioning I first blogged about in July 2006 and
that blog is still the 3rd most popular blog of my blogs, even
when looking at the last months views. The work on this started
out in 2005 and so it's nice to now get it in a state where it's
quality is ready for more heavy testing. For those interested
in partitioning I think this feature will enlarge the number of
cases where partitioning is applicable. It's now possible to
partitioning on many more field types and also on multiple fields
in an efficient manner.

This feature described by WL#3352 has now been pushed
to the mysql-next-mr tree. For those of you new to our new
milestone release model this tree is where we push new features
before clone off. After clone off this tree is merged with
mysql-trunk tree and after about 6 months of bug fixing a Milestone
Release is performed. The current mysql-trunk tree is the tree from
which the current MySQL 5.4 releases are produced. A milestone
release is of beta quality and some milestone releases will be
continued towards a GA release. There will be a new milestone
release with about 3-6 months interval. For more information on
the release model see here.

The WL#3352 was pushed into the Milestone which has the
codename Betony in the article referred to above. The current
MySQL 5.4 release series is called Azalea (actually 5.4.0 and
5.4.1 belonged to Summit and 5.4.2 was the first Azalea release).

The major advantages of this new feature is that it makes it
possible to partition on string fields and also to partition on
multiple fields and still get good partition pruning. Previous
partitioning required a partition function that delivered an
integer result and a couple of functions could deliver good
partition pruning.

Now it is possible to partition on most fields and even a set
of them and always get good partition pruning.

The final result of the new syntax is the following:

CREATE TABLE t1 (a varchar(5) character set ucs2, b int)
PARTITION BY RANGE COLUMNS (a,b)
( PARTITION p0 VALUES LESS THAN (_ucs2 0x2020, 1),
PARTITION p1 VALUES LESS THAN (MAXVALUE, MAXVALUE));

So the keyword COLUMNS indicates that a list of fields is
used to partition on instead of a function. The new
partitioning applies to RANGE and LIST partitioning.

All the management functions for partitioning still applies.
However the major difference comes when you do a query like:

select * from t1 WHERE a > _ucs2 0x2020;

In this case the partition pruning will discover that only
the partition p1 is possible to find records and will thus
prune away partition p0. In MySQL 5.1 it's only possible
to perform pruning on intervals for single fields and
the partition function must also be of a type that is
safe to always increase such as YEAR or TO_DAYS (actually
this new feature also added the function TO_SECONDS to this
list of functions that can be pruned on efficiently).

So partition pruning on this works very much like an
index. Not surprisingly the partition pruning code
reuses the code for the range optimiser which looks
at what indexes can be used for a certain query.

If you wonder why I am using these _ucs2 constants
as examples it's because I had to learn a lot about the
character set code in MySQL to get everything right with
this feature. Actually even found and fixed a few MySQL
character set bugs in the process :)

3 comments:

Sheeri K. Cabral said...

Cool! Though as a logistical issue, I had to click through my RSS feed and read several paragraphs to actually find out what the

(in my opinon, blog posts should be like newspaper articles, and a good summary of what the article is about should appear above the "fold"....otherwise you lose people who don't want to click through because they don't know what the new partitioning feature is, and aren't THAT excited about it....but might be if they knew what it was!)


this could very well be the 2nd paragraph, instead of the 5th one:

"The major advantages of this new feature is that it makes it
possible to partition on string fields and also to partition on
multiple fields and still get good partition pruning. Previous
partitioning required a partition function that delivered an
integer result and a couple of functions could deliver good
partition pruning."

Mark Callaghan said...

I was skeptical about many of the new features in 5.1. Now I can't wait to use them.

I still want to use multiple engines in one partitioned table and that is probably something I will try to add.

Mikael Ronstrom said...

Sounds good that you want to try out multiple storage engines in one table. The reason why it's not supported is mainly the problems it creates in the handler interface. Just one example is the table and index flags. If those flags differ on different partitions you need to decide what to respond to the MySQL Server. There are many similar cases that one would have to consider how to resolve.