Wednesday, May 16, 2012

MySQL Cluster 7.2.7 achieves 1BN update transactions per minute

In MySQL Cluster, the receive threads have been a bottleneck that limited update performance to about 0.5M update transactions per second per node group (a node group usually consists of 2 nodes). In MySQL Cluster 7.2.7 we have removed most of this bottleneck and can now achieve 3x as many update transactions, reaching about 1.5M updates per second per node group. On a 30-node configuration we achieved 19.5M update transactions per second, which corresponds to 1.17BN updates per minute. This means update performance scales almost linearly all the way to 30 data nodes.
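For readers who want to check the arithmetic, here is a minimal Python sketch that converts the quoted per-second figure into updates per minute and into a per-node-group rate. The function names are made up for illustration, and the per-group number assumes the usual 2 nodes per node group, which gives 15 node groups for 30 data nodes.

```python
# Figures quoted in the post.
TOTAL_UPDATES_PER_SEC = 19_500_000   # measured on the 30-node configuration
DATA_NODES = 30
NODES_PER_GROUP = 2                  # assumption: the "usual" 2 nodes per node group


def updates_per_minute(updates_per_sec):
    """Convert a per-second transaction rate into a per-minute rate."""
    return updates_per_sec * 60


def updates_per_node_group(updates_per_sec, data_nodes, nodes_per_group):
    """Average per-second rate handled by each node group."""
    node_groups = data_nodes // nodes_per_group
    return updates_per_sec / node_groups


per_minute = updates_per_minute(TOTAL_UPDATES_PER_SEC)
per_group = updates_per_node_group(TOTAL_UPDATES_PER_SEC, DATA_NODES, NODES_PER_GROUP)

print(f"{per_minute / 1e9:.2f}BN updates per minute")
print(f"{per_group / 1e6:.2f}M updates per second per node group")
```

Under that assumption the 30-node run averages 1.3M updates per second per node group, in the same range as the roughly 1.5M quoted above.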

The benchmarks were executed using the dbt2-0.37.50 benchmark scripts available at dev.mysql.com. The benchmark program is flexAsynch, which I have mentioned in some of my earlier blog posts. We used 8 LQH threads per data node.

9 comments:

Robert Hodges said...

Congratulations on these numbers. MySQL Cluster has some very creative adaptations for speed. I enjoy reading the team's blog posts about your work and hope your team can make it to some of the upcoming community conferences to talk about MySQL Cluster innovations.

Anonymous said...

Hi Mikael,

The improvements in NDB 7.2 are impressive.
Can you tell us how to enable more than 4 LQH threads on data nodes? It is not covered in the documentation, and when using ThreadConfig, ndbmtd does not accept more than 4 LQH threads with only 4 REDO log parts.

Thanks for the great work!

Mikael Ronstrom said...

It's quite correct that NDB won't accept more than 4 LQH threads unless there are also more than 4 REDO log parts. Each LQH thread needs at least one REDO log part.

Anonymous said...

Hi Mikael,

Thanks for the additional information. Is it possible to define more than 4 REDO log parts in the current 7.2.5 configuration?

Thanks again for your help

Mikael Ronstrom said...

Yes, support for more than 4 log parts was introduced in MySQL Cluster 7.2.5.

Anonymous said...

Where can I download the 7.2.7 version?

Jay Ward said...

Hey Mikael,

After reading your entry here and, especially, your entry "Challenges in reaching 1BN reads and updates per minute for MySQL Cluster 7.2", I was very excited about testing and implementing this latest version of cluster. I particularly liked the granular control over NDB threads provided by the ThreadConfig option. For testing, I installed it on two Intel E5-1650s and started it up. With only 12 cores, I gave it 4 LQHs, 1 send, 1 recv, 1 rep, and 2 TCs, and had the main and IO threads running together, leaving two cores for the OS. I was amazed at its stability and responsiveness. It absolutely blew away anything I had previously tested.

When I tried to do a backup, however, after writing the schemas to the CTL file, the first LQH thread tried to send a GSN_SCAN_FRAGREQ (353) signal to the second LQH thread. Since they aren't supposed to communicate, no JB exists and so the data nodes receive a SIGSEGV from the OS.

Did I miss some critical bit of info in implementing ThreadConfig?

Mikael Ronstrom said...

Hi,
No, you missed nothing, you hit a bug. We also discovered this very recently and actually discussed the fix today :) A fix is in the works.