Monday, April 29, 2013

MySQL Architect at Oracle

I have worked as an architect in the MySQL/NDB world for more than 20 years. I am still working at Oracle, and I like it here. Given all the FUD spread about MySQL, I thought it might be a good idea to spread the word about all the great things we're doing to MySQL at Oracle.

#1 We are working on improving modularity in the MySQL code base
In the early days, MySQL had serious issues with its development model. It was a model designed for a small code base. I used to work at Ericsson, which develops telecom switches with systems of tens of millions of lines of code. Such large systems require modularity. The Ericsson switches have been developed with modularity built into the programming language since the 1970s. Even with this modularity, a second level of modularity was required. The lessons from that reengineering project, which spanned more than a decade, gave me valuable insights that can now be put to use in the development of the MySQL architecture.

When we develop MySQL at Oracle we take a long-term view, and we have taken many steps to make the code more modular. This means more work when developing a feature from a short-term point of view, but it pays back very quickly. As an example, we rearchitected the metadata locking model in MySQL over a period of almost 5 years. Thanks to this effort we were able to split one of the crucial locks in the MySQL Server, LOCK_open, in MySQL 5.6. This was the payback: it improved our top performance by more than 100% in one small incremental step.
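
To give a feel for why splitting a single hot lock helps scalability, here is a minimal, generic sketch of lock splitting in Python. It is not the MySQL Server code and does not reflect how LOCK_open was actually split; the class name `ShardedTable`, the helper `run_workers` and the shard count are hypothetical and chosen only for illustration. The idea is that each key maps to one of several partitions, each guarded by its own lock, so threads working on different partitions no longer queue up behind one global mutex.

```python
import threading


class ShardedTable:
    """Hypothetical counter table whose state is split into partitions,
    each protected by its own lock instead of one global lock."""

    def __init__(self, num_shards=8):
        self._shards = [{} for _ in range(num_shards)]
        self._locks = [threading.Lock() for _ in range(num_shards)]

    def _index(self, key):
        # Map a key to the partition (and lock) that owns it.
        return hash(key) % len(self._shards)

    def increment(self, key):
        i = self._index(key)
        # Only the owning partition is locked; other partitions stay free.
        with self._locks[i]:
            self._shards[i][key] = self._shards[i].get(key, 0) + 1

    def get(self, key):
        i = self._index(key)
        with self._locks[i]:
            return self._shards[i].get(key, 0)


def run_workers(table, keys, per_thread=1000, threads=4):
    """Start several threads that each increment the keys in round-robin order."""

    def work():
        for n in range(per_thread):
            table.increment(keys[n % len(keys)])

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```

With a single shard this degenerates to one global lock, which is the situation a lock split is meant to escape. How much throughput is gained depends on how evenly the keys spread over the partitions.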

The principal idea is that we try to put new areas of development into separate modules that are as independent as possible from the rest of the MySQL Server code. Old code cannot be modularised in one swift move; it has to be improved in small reengineering steps. Eventually those steps lead to more modularity and easier changeability of the old code base as well. We expect to continuously improve the MySQL architecture in this manner.

Does this work benefit the entire MySQL community? Yes, modular code is the very foundation for successful open source development by the MySQL community as well. MySQL has a thriving development community, and we expect it to thrive even more thanks to the improved modularity we add to the MySQL code base.

#2 We have an improved development model
One of the first things I got engaged in when I was appointed Senior MySQL Architect in 2007 was improving the MySQL development model. Previously we had a model where features were developed as projects and then pushed to the next release branch, and new features were often pushed in several steps. This model had issues in both quality and development speed. It often led to prioritisation problems close to GA releases, and code was often pushed very close to a GA release without proper quality. After MySQL was acquired by Sun and then Oracle, we have managed to get this under control using a new development model.

In the new development model we wanted to ensure that we took small steps forward (called milestone releases), always with retained quality. So in the new model we develop in 3-6 month steps. Each step is quality assured, and we use a lot of QA resources to ensure each milestone release is OK from a quality point of view. Some of these milestone releases are then taken out to become new GA releases. The GA releases are then put through our QA grind, where most bugs are squeezed out of the code.

We introduced this new model in 2009. It had a positive effect on the MySQL 5.5 development, and MySQL 5.6 has been developed entirely according to this model. One only needs to look at what we achieved with MySQL 5.6 to understand that the new model works very well. We're continuing to use this model for new steps in the MySQL development.

The model is described in http://dev.mysql.com/doc/mysql-development-cycle/en/index.html

Our MySQL development is divided into a number of parts: we have a team taking care of InnoDB, another team taking care of MySQL Cluster (with the NDB storage engine), one team dealing with partitioning, an optimizer team, a replication team, a runtime team taking care of things such as metadata handling and other things related to query execution, and a general team taking care of support code, networking and client code.

We also have teams handling MySQL Connectors, utilities and other things needed around the MySQL Server. There is also a performance and architecture team focusing on scalability, special projects and all sorts of improvements to ensure that our users get a good experience of using MySQL. Naturally we also have teams focusing entirely on the quality of the code. We also have a separate team working on MySQL Workbench and related tools to support MySQL management and development.

#3 We have great performance experts for the MySQL development
The community has lots of performance experts who understand how to help users get the best out of a MySQL Server. The community also has a number of people who are able to pinpoint performance issues in the MySQL Server. The patches these people develop are very useful in showing what can be improved in the MySQL code base. Here the community is very useful to the development of the MySQL Server, and we see very good cooperation between our developers and the MySQL community developers in this area.

We have also been able to make numerous performance improvements independently, through the combination of MySQL experts and InnoDB experts that we have internally in Oracle. If you need proof of this, just go to the MySQL 5.6 page and see the demonstrated performance improvements we've made in MySQL 5.6 compared to MySQL 5.5. This is an area I have contributed a lot to personally, and I find it interesting how we managed to increase scaling of the MySQL Server from 4 CPU threads to almost 64 CPU threads in just a few years.

#4 We continuously develop MySQL partitioning
We continuously improve the scalability of our partitioning solution and integrate it further into the main storage engines, as evidenced by the new set of partitioning features in MySQL 5.6.

#5 We have the most serious QA team in the MySQL world
Through the use of a large QA team we do our best to ensure that our community users are shielded from bugs in our released code. We also work with the community through milestone releases, lab releases and RC releases to ensure that our GA releases have top-notch quality. We tripled the size of the QA teams as part of the MySQL 5.6 development.

Anyone can of course add new features to these MySQL trees, but it will be the community that mainly becomes the QA team for these additional features. We strive hard to ensure that our community users are shielded from being our QA team.

Although the MySQL code base is continuously growing, the code coverage of our tests has continuously improved going from MySQL 5.1 to MySQL 5.6.

#6 We are continuously developing InnoDB inside Oracle
The major storage engine for MySQL is InnoDB, and the InnoDB developers are also located within Oracle. We've seen many great benefits of being together inside the same company. It means that we can integrate InnoDB even better and improve scalability and functionality more than ever before.

#7 We have a thriving MySQL Cluster development in Oracle
All MySQL Cluster development currently happens inside Oracle. I am still very much involved in this development, and I sit next to a number of the key Cluster developers in my office in Stockholm, with a beautiful view over the Stockholm city centre. We have shown order-of-magnitude improvements in scalability, performance per node and so forth through regular and steady development. MySQL Cluster even has the capability to execute parallel queries as of version 7.2.

#8 We are making great efforts to make MySQL manageable
Many community developments are centered around getting more access to internal statistics about MySQL Server behaviour. For a few years now we have had a focused effort in this area to provide anything that a user could be interested in through the performance schema tables. The server gathers this data internally and provides access to the statistics through a set of performance schema tables. This effort is continuing, and there is still more work to be done in this area.

#9 We have the largest and best optimizer team in the MySQL world
Our team of optimizer developers consists of developers who have loads of experience in developing DBMSs. Many of them have previously developed SQL databases and also have an academic background, which is almost a requirement for understanding the fairly complex optimizer algorithms required in a relational DBMS. The team has made a tremendous effort in preparing a load of new optimizer features for MySQL 5.6.

#10 A thriving team working on MySQL Replication
MySQL's success has always been closely intertwined with the ability to replicate between MySQL servers using the master-slave model. The requirements for better failover capabilities and manageability of large sharded environments have led to an ever-increasing set of new functionality in the replication area. We're continuing this development and working to provide even more functionality for the MySQL community that will be useful in building large MySQL installations.

#11 We have many other teams
The runtime team is doing a lot of work on handling metadata changes online, making the MySQL Server more scalable and making the server more modular.

The general team is ensuring that MySQL becomes more modular, ensuring that our support code is further developed, ensuring that we also have more unit tests on our code.

We have a thriving utilities team that is developing various tools useful for people managing MySQL Servers.

Each of the teams mentioned above has developed a great number of new features in the respective MySQL versions, making it quite clear that MySQL development is thriving as never before.

So if you're looking for a MySQL version that has the best performance, the best stability and the most future-proof code base, then use MySQL 5.6.

The MySQL community continues to be an interesting place to work in. Oracle leads the development of new MySQL versions, a number of large companies are building massive infrastructure components based on MySQL (Google, Facebook, Twitter and so forth), and a number of forks also develop their own versions. There are also many interesting new requirements for MySQL to behave well in highly scalable environments, and this requires even more work on the MySQL replication architecture.

As a MySQL architect I am able to work with many parts of the MySQL engineering organisation and also the MySQL support organisation. There is a lot of interesting work going on that will continue to increase the use cases of MySQL for the MySQL community. So my work continues to be interesting and it's very rewarding to develop new functionality knowing that it will be used by many organisations in the world.

When I tell my kids or their friends about my work, I tell them that I work on things that make their computer games continue to tick, since this is what most kids care about in the IT community. Personally I find the use cases for genealogy and other cool stuff even more interesting.