Saturday, March 04, 2006

Partitioning at UC 2006

I've been posted as a speaker at the MySQL Users Conference at the end of
I will present the partitioning development that has happened in MySQL for
5.1. There's been a lot of work to accomplish this but it's nice to see the
result now that it's nearing completion.

At first when I started this development my main objective of doing this
was to prepare MySQL for parallel query support (which it actually has
support for since 5.0 in MySQL Cluster).

However while developing support for this I've learned a lot about other
nice uses for partitioning. Other open source DBMS's have focused mainly
on the performance aspects of partitioning. I think this has been solved
very well in this 5.1 release, this particular part needed some intricate
understanding of the optimiser which I did not possess on a sufficiently
detailed level, but with some aid and with a small effort the optimiser
was turned into a partitioning optimiser.

However most of the work for partitioning has actually been spent on the
management part of partitioning. The ability to add, drop and reorganize
partitions in a fast and efficient manner. Now with the implementation it
is possible to momentarily add and drop partitions for range and list
partitioning. It is also possible to add and drop partitions for hash
partitions, normally this means that the entire table needs reorganisation.
However we've added the keyword LINEAR to all hashing variants. This
makes it possible also for HASH/KEY partitions to have partitions added
and dropped with minimal work. So if the table contains 16 partitions with
LINEAR KEY partitioning it's only necessary to reorganise 4 of those 16
partitions to grow the number of partitions to 20. The linear scheme makes
distribution of records to be unbalanced in scenario's where the number
of partitions isn't in the form 2**n. So it's a choice one does when
manageability is important.

Given that MySQL is known for its simplicity, ease of use, we felt that it was
more important to bring a version with the all the manageability and with
sufficent performance advantages. I'm glad to say now that we achieved
both. It's got both manageability and major performance improvements.
Unfortunately we weren't able to bring all the goals I had with partition
management for MySQL Cluster into 5.1. This is especially displeasing
to me being the founder of the cluster technology in MySQL. However we
look forward to bringing out this work in future releases of MySQL, it's
a tremendously interesting problem to reorganise a clustered table while
users are reading, updating, taking backup's and performing a whole lot
of various other actions on the table simultaneously with this change.
This work will also form the foundation of the support for adding and
dropping nodes in MySQL Cluster.

One idea that I'm particularly pleased with is the concept of REORGANIZE
partitions. I was contemplating how to handle MERGE of partitions, JOIN
of partitions and also simply changing the partition boundaries. I was
very pleased to find one stone for all those birds.

So please try out the new partitioning feature such that we can quickly
bring it to a stable state and move on to the next set of cool features for
versions to come.

No comments: