I got time to spend on a really old worklog I completed coding
already october 2005. I blogged about it in July 2006 and
interestingly enough it's still the second most read blog
entry on my blog (probably related to search engines in some
way).
I have merged it with the azalea tree (this is an internal code
name for our development tree, name is likely to change). This
tree contains subquery optimisations, Batched join and some more
optimisations.
I have fixed a whole bunch of bugs that always shows up in early
code. The code quality is still alpha but at least you won't find
10 bugs per hour :)
Here you can find the launch pad tree for this code.
There are two important additions made possible by this tree.
1) New function to_seconds that is recognized by range optimiser
to enable partition pruning when partitioning like:
partition by range (to_seconds(time))
2) New partitioning functionality that makes it possible to
perform partition pruning over multiple fields.
Most of the bugs I have fixed had to do with this partition pruning
of multiple fields. The routine to discover which partitions are
needed is called find_used_partitions (in sql/opt_range.cc) and this
function is called recursively over a key tree. A key tree can be
very complex and more or less have AND of key parts using next_key_part
pointer and OR condition using left and right pointers. These left and
right pointers can however show up a little here and there in the tree
so one has to be very careful about how variables are assigned, saved
and restored. I havent' worked so much with recursive functions so this
is an interesting adventure.
Here's my latest addition of a test case to give you an idea of how it
works and also what works right now.
create table t1 (a int, b char(10), c varchar(5), d int)
partition by range column_list(a,b,c)
subpartition by key (c,d)
subpartitions 3
( partition p0 values less than (column_list(1,'abc','abc')),
partition p1 values less than (column_list(2,'abc','abc')),
partition p2 values less than (column_list(3,'abc','abc')),
partition p3 values less than (column_list(4,'abc','abc')));
insert into t1 values (1,'a','b',1),(2,'a','b',2),(3,'a','b',3);
insert into t1 values (1,'b','c',1),(2,'b','c',2),(3,'b','c',3);
insert into t1 values (1,'c','d',1),(2,'c','d',2),(3,'c','d',3);
insert into t1 values (1,'d','e',1),(2,'d','e',2),(3,'d','e',3);
select * from t1 where (a = 1 AND b < 'd' AND (c = 'b' OR (c = 'c' AND d = 1)) OR
(a = 1 AND b >= 'a' AND (c = 'c' OR (c = 'd' AND d = 2))));
So in the above select statement we are performing partition
pruning over 3 fields and subpartition pruning over 2 fields
and there are 5 different ranges in the query.
So please go ahead and try this new tree out and see if it
works for you.
My name is Mikael Ronstrom and I work for Hopsworks AB as Head of Data. I also assist companies working with NDB Cluster as self-employed consultant. I am a member of The Church of Jesus Christ of Latter Day Saints. I can be contacted at mikael dot ronstrom at gmail dot com for NDB consultancy services. The statements and opinions expressed on this blog are my own and do not necessarily represent those of Hopsworks AB.
Showing posts with label Partititon. Show all posts
Showing posts with label Partititon. Show all posts
Tuesday, August 04, 2009
Thursday, October 30, 2008
CACHE INDEX per partition for MyISAM
The newest development in the partitioning code
is WL#4571.
This new feature makes it possible to tie a
partition using MyISAM to a specific cache index.
The syntax for how to do is available in the
above worklog entry.
We found this feature to be useful for enabling
higher performance of parallel ALTER TABLE
(WL#2550). When adding
a primary key to a MyISAM table the key cache in
MyISAM limited scalability of Parallel ALTER TABLE
severely, so adding several key caches, essentially
one per partition we can ensure that the ALTER TABLE
can be fully parallelised (all other ALTER TABLE
on MyISAM already scales perfectly).
We also have some ideas on how to solve the base
problem in how to make the key cache more scalable
by dividing the mutex on the key cache into one
mutex per a range of key cache pages.
is WL#4571.
This new feature makes it possible to tie a
partition using MyISAM to a specific cache index.
The syntax for how to do is available in the
above worklog entry.
We found this feature to be useful for enabling
higher performance of parallel ALTER TABLE
(WL#2550). When adding
a primary key to a MyISAM table the key cache in
MyISAM limited scalability of Parallel ALTER TABLE
severely, so adding several key caches, essentially
one per partition we can ensure that the ALTER TABLE
can be fully parallelised (all other ALTER TABLE
on MyISAM already scales perfectly).
We also have some ideas on how to solve the base
problem in how to make the key cache more scalable
by dividing the mutex on the key cache into one
mutex per a range of key cache pages.
New launchpad tree for PARTITION BY RANGE COLUMN_LIST(a,b)
A colleague of mine at Sun/MySQL showed me how to get
statistics from my blog. This was an interesting read
of all statistics. I noted that there was a great
interest in partitioning related information and that
the new partitioning feature mentioned in my blog
2 years ago still attracts a lot of attention.
So I thought it was a good idea to blog a bit more
about what's going on in the partitioning
development. I decided to check out how easy it is
to externalize my development trees on launchpad.
It turned out to be really easy so I simply
put up the development tree for the new partitioning
feature which I described in my last blog.
Launchpad tree
I also externalized the Worklog entry for this
development, unfortunately not a very long
description but I'll try to work on that.
There is a new test case in the mysql-test/t
directory called partition_column.test which
shows how to use these new features (it might
take some time before this link works).
Worklog description
statistics from my blog. This was an interesting read
of all statistics. I noted that there was a great
interest in partitioning related information and that
the new partitioning feature mentioned in my blog
2 years ago still attracts a lot of attention.
So I thought it was a good idea to blog a bit more
about what's going on in the partitioning
development. I decided to check out how easy it is
to externalize my development trees on launchpad.
It turned out to be really easy so I simply
put up the development tree for the new partitioning
feature which I described in my last blog.
Launchpad tree
I also externalized the Worklog entry for this
development, unfortunately not a very long
description but I'll try to work on that.
There is a new test case in the mysql-test/t
directory called partition_column.test which
shows how to use these new features (it might
take some time before this link works).
Worklog description
Subscribe to:
Posts (Atom)