Friday, October 14, 2005

Default Partitioning of MySQL Cluster 5.1

In MySQL Cluster version 4.1 and 5.0 tables are automatically partitioned
using a hash function on the primary key as partition function. If there is
no primary key there is a hidden key introduced and we use hash function
on that hidden key as the partition function.

The number of partitions is also selected automatically and also the placement
of the partitions. The number of partitions is the same as the number of nodes
except when MAX_ROWS has been set really high in which case it can be
n*number of nodes where n <= 8.

Assuming that we have 4 nodes in the cluster with 2 replicas. In this case we
have two node groups with 2 nodes in each.

The first partition will be placed in node group 0 with the first node in the node
group as primary, the second partition will be placed in node group 1 with the
first node in the node group as primary. The third partition will be placed in
node group 0 again and this with the second node in the node group as primary
and finally the fourth partition is placed in node group 1 again with the second
node in the node group.

This automatic partitioning is designed to spread data and load evenly on the
cluster nodes and a homogenous cluster is assumed.

In MySQL 5.1 we are introducing manual partitioning. One of the things one
can do with partitioned tables is that one can add, drop, reorganise partitions.
To be able to do this also on tables defined by ENGINE=NDB without
partitioning we decided that in 5.1 all clustered tables are partitioned and that
their partition can change.

Thus when you define a table as ENGINE=NDB it will automatically be a
partitioned table defined as:
PARTITION BY KEY()
KEY() here refers to either primary key partitioning or hidden key partitioning
as defined above (for partitioning on other engines that don't have partitioning
it is defined on partitioning on the primary key fields).
The partitioning above then applies on the number of partitions and their
placement.

Later on it will be possible to execute (the discussion here applies to
partitioning in general):
ALTER TABLE t1 COALESCE PARTITION 1;
This will merge the last partition into the remaining partitions. Assume the
automatic partitioning at table creation created 4 partitions. After this
command the number of partitions is 3. This is no longer the default number
of partitions, thus the partition definition is changed to:
PARTITION BY KEY()
PARTITIONS 3

Later on we execute:
ALTER TABLE t1 REORGANIZE PARTITION (p0) INTO
(PARTITION x0 NODEGROUP 1);
The name p0 is the default name of the first partition (followed by p1, p2, ...).
In this we have decided to specifically move the first partition to the second
node group. Thus there is no longer any default partitioning and after this
change the partition is completely manual and specified as:
PARTITION BY KEY()
PARTITIONS 3
(PARTITION x0 NODEGROUP 1,
PARTITION p1 NODEGROUP 1,
PARTITION p2 NODEGROUP 0)

So one can either have completely automatic partitioning of the table, one can
have a defined number of partitions but all the partitions have default setting
and finally one can have a completely manual partitioning.

3 comments:

Timo said...

I read through that, and am not quite sure if it solves the one thing I'd really like from partitioning. I want to be able to specify which column to balance the data by. The reason for this, is I want all data related to one user on one server. Take my Messaging system as an example. I run alot of queries like "SELECT * FROM msgs WHERE userid = ?", and have an index on userid. Why not guarantee that all of that data is on one server, to reduce the load on the other servers and to reduce the number of connections. At the moment, since it isn't on the primary key (an auto_increment which is unrelated to the userid), that query would use all servers. The ability to add servers/partitions without taking the database down would also be great for scalability.

Johan Andersson said...

Mikael, this information is great!

-johan

Anonymous said...

hello, Mikael Ronstrom:
I am a graduated student of china.
When i debug mysql source code ,I have many questions about mysql ndb cluster partition and query paralle. I want to communicate with you quickly.Can you give me your email.
my email:fzb0717@sohu.com