Tuesday, September 09, 2008

Linear Scalability of MySQL Cluster using DBT2

Achieving linear scalability of MySQL Cluster on the DBT2
benchmark has been a goal of mine for a long time. Last
week I finally found the last issue that limited scalability.
As usual, once the issue was discovered the fix was trivial (in this
case, inserting three 0's in the NDB handler code).

We can now achieve ~41k TPM on a 2-node cluster, ~81k TPM on a
4-node cluster and ~159k TPM on an 8-node cluster, so each
doubling of the number of nodes improves performance by roughly
97%. So nothing now stops us from going all the way up to
1M TPM except lack of hardware :)
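To make the arithmetic behind the "roughly 97%" figure explicit, here is a small sketch that computes the gain per doubling from the three TPM numbers above and then extrapolates to 1M TPM. The function names are made up for illustration, and the extrapolation assumes the average measured gain keeps holding for every further doubling, which has not been measured.

```python
# TPM figures quoted in the post, keyed by number of nodes in the cluster.
measured_tpm = {2: 41_000, 4: 81_000, 8: 159_000}


def doubling_gains(tpm_by_nodes):
    """Return the fractional throughput gain for each measured doubling of nodes."""
    nodes = sorted(tpm_by_nodes)
    return {
        (a, b): tpm_by_nodes[b] / tpm_by_nodes[a] - 1.0
        for a, b in zip(nodes, nodes[1:])
        if b == 2 * a
    }


def nodes_needed(tpm_by_nodes, target_tpm):
    """Extrapolate the node count needed to reach target_tpm.

    ASSUMPTION: the average measured gain per doubling stays the same
    beyond the largest cluster that was actually tested.
    """
    gains = doubling_gains(tpm_by_nodes)
    factor = 1.0 + sum(gains.values()) / len(gains)
    nodes = max(tpm_by_nodes)
    tpm = tpm_by_nodes[nodes]
    while tpm < target_tpm:
        nodes *= 2
        tpm *= factor
    return nodes, tpm


if __name__ == "__main__":
    for (a, b), gain in doubling_gains(measured_tpm).items():
        print(f"{a} -> {b} nodes: {gain:.1%} more TPM")
    nodes, tpm = nodes_needed(measured_tpm, 1_000_000)
    print(f"about {nodes} nodes for 1M TPM (projected {tpm:,.0f} TPM)")
```

With the numbers above, the two doublings come out at about 97.6% and 96.3%, and the projection passes 1M TPM at 64 nodes, provided the trend really does continue.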

I've learned a lot about what affects the scalability and
performance of MySQL Cluster by performing these
experiments, and I'll continue writing up those experiences on
my blog here. I have also uploaded a new DBT2 version in which I
added a lot of new features, improved the performance
of the benchmark itself and ensured that running many
parallel DBT2 drivers still provides correct results when
the results are added together. It can be downloaded from


Unknown said...

That's awesome Mikael!!

hingo said...

Nice work Mikael, you've been running these DBT2 tests for quite a while now!

Is it the correct conclusion that MySQL Cluster 6.4 (or 6.5) will now be a kick-ass OLTP, DW, younameit database solution? Or is there some other issue, like latency, that we still need to tackle?

- No Windows support. Corrected.
- No complex joins. Corrected adequately by BKA patch.
- Doesn't scale well with OLTP apps. Corrected!
- No foreign keys. Ok so we still have this one for some time :-/

Mark Callaghan said...

@hingo - at last, someone admits that MySQL Cluster will soon be a high-end DW server. This is great news. When do we get aggregation pushdown?

I will guess that using Cluster is much less expensive than buying a SAN to get the storage throughput and availability required for such a server.

hingo said...

@Mark: I'm not admitting, I'm trying to get Mikael to admit it :-)

One addition to the list though: for large data warehouses we would also need disk-based indexes. Today indexes are still kept in main memory even if non-index columns can be disk-based. For OLTP this is already great, but for a DW it could get kinda expensive if you'd need terabytes of RAM!

I hear disk-based indexes are also on the roadmap, but not within the immediate next releases yet.

Matthew Montgomery said...


The FK constraint issue can be reasonably worked around by using triggers. They're a bit more of a PITA but they should get you there.
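
As a rough illustration of the trigger workaround mentioned above, here is a minimal sketch. It uses SQLite through Python's standard `sqlite3` module purely as a stand-in so that it runs anywhere; it is not MySQL Cluster code. The `parent` and `child` tables and the trigger names are hypothetical, and MySQL's trigger syntax and its way of aborting a statement differ from SQLite's `RAISE(ABORT, ...)`.

```python
import sqlite3


def make_db():
    """Create two tables whose relationship is enforced by triggers only."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE parent (id INTEGER PRIMARY KEY);
        CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER NOT NULL);

        -- Reject child rows that point at a missing parent.
        CREATE TRIGGER child_fk_insert BEFORE INSERT ON child
        WHEN NOT EXISTS (SELECT 1 FROM parent WHERE id = NEW.parent_id)
        BEGIN
            SELECT RAISE(ABORT, 'child.parent_id has no matching parent');
        END;

        -- Reject deleting a parent that still has children.
        CREATE TRIGGER parent_fk_delete BEFORE DELETE ON parent
        WHEN EXISTS (SELECT 1 FROM child WHERE parent_id = OLD.id)
        BEGIN
            SELECT RAISE(ABORT, 'parent is still referenced by child');
        END;
    """)
    return conn


def try_exec(conn, sql):
    """Run one statement; return True on success, False if it was rejected."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.Error:
        return False


if __name__ == "__main__":
    conn = make_db()
    print(try_exec(conn, "INSERT INTO parent (id) VALUES (1)"))
    print(try_exec(conn, "INSERT INTO child (id, parent_id) VALUES (10, 1)"))
    print(try_exec(conn, "INSERT INTO child (id, parent_id) VALUES (11, 2)"))
    print(try_exec(conn, "DELETE FROM parent WHERE id = 1"))
```

A full emulation would also need triggers for updates of `child.parent_id` and `parent.id`, which is part of why this approach is more work than a declared foreign key.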