
Monday, May 27, 2024

875X improvement from RonDB 21.04.17 to 22.10.4

At Hopsworks we are working on ensuring that the online feature store will be able to perform complex join operations in real-time. This means that queries that combine data from multiple tables can easily be integrated into machine learning applications.

Today most feature stores use key-value stores like Redis and DynamoDB. These systems have no capability to issue complex join queries. If joins are required, the feature store has to implement them in application code, which is likely to involve multiple roundtrips and thus cause unwanted latency.

The Hopsworks feature store uses RonDB as its online feature store. RonDB can handle any SQL operation that MySQL can handle. RonDB even has support for parallelising join queries and pushing the filtering and joining down to the RonDB data nodes where the data resides.
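
As an illustration, a real-time feature lookup that joins several feature groups can be expressed as a single SQL query, which RonDB can then parallelise and push down to the data nodes. The table and column names below are made up for the example and are not part of any actual Hopsworks schema:

SELECT t.amount, c.avg_spend_30d, m.merchant_risk_score
FROM transactions_fg t
JOIN customer_features_fg c ON c.customer_id = t.customer_id
JOIN merchant_features_fg m ON m.merchant_id = t.merchant_id
WHERE t.transaction_id = 123456;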

This means that users of the Hopsworks feature store can integrate more features from multiple feature groups in online inferencing requests. Applications such as credit fraud detection can thus be made much more intelligent by taking more features into account in the inferencing requests.

This means that the performance of real-time join queries becomes more important in RonDB. To evaluate how RonDB develops in this area I ran a set of tests using TPC-H queries from DBT3 against RonDB 21.04.17 and RonDB 22.10.4 (not released yet). I also ran tests against MySQL 8.0.35 (RonDB 22.10.4 is based on MySQL 8.0.35 with loads of added RonDB features).

The results were interesting; the improvement in Q20 was the highest I have seen in my career. The performance improved from 70 seconds to 80 milliseconds, thus an 875x speedup or 87500% improvement. Q2 had a 360x improvement. So RonDB 22.10.4 is much better equipped for more complex queries compared to RonDB 21.04. MySQL 8.0.35 had similar performance to RonDB 22.10.4, on average around 20% slower; this is mostly due to performance improvements in RonDB, not algorithmic changes.

When executing complex queries the query optimiser tries to find an optimal plan. Sometimes, however, better plans are available, and one can add hints in the SQL query to ensure a better plan is used.
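
As an example, MySQL (and thus RonDB) supports optimizer hints such as JOIN_ORDER, as well as the older STRAIGHT_JOIN keyword, to force a particular join order. The query below is only a sketch using TPC-H-style table names, not one of the actual DBT3 queries:

SELECT /*+ JOIN_ORDER(n, s) */ s.s_name, n.n_name
FROM nation n
JOIN supplier s ON s.s_nationkey = n.n_nationkey
WHERE n.n_name = 'SWEDEN';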

The RonDB team isn't satisfied with this, however. We have realised that evaluating aggregations is also very important when the online feature store stores a time window of certain features. This means that RonDB can compute aggregates dynamically and thus provide more accurate predictions.

Early tests of some simple single-table queries showed an improvement of 4-5x, and we expect to reach 10-20x improvements in quite a few queries of this sort.
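
As a sketch of what such a query could look like, assuming a hypothetical feature table that stores a time window of transactions per customer:

SELECT customer_id,
       COUNT(*) AS txn_count_1h,
       SUM(amount) AS spend_1h
FROM card_transactions_fg
WHERE customer_id = 123456
  AND txn_time >= NOW() - INTERVAL 1 HOUR
GROUP BY customer_id;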

Thursday, July 29, 2021

Improvements of DBT2 benchmark in RonDB 21.10.1

In the development of RonDB 21.10.1 we have had some focus on improving the performance of the DBT2 benchmark for RonDB. NDB Cluster actually already had very good performance for DBT2. However, this performance relies on a thread configuration that uses a lot of LDM threads, which means that tables will have very many partitions.

For an application like DBT2 (an open source variant of TPC-C) this is not an issue since it is a very scalable application. But most real applications are not as scalable as DBT2 when the number of table partitions increases.

In RonDB we have focused on decreasing the number of table partitions. Thus in RonDB the number of partitions is independent of the number of LDM threads. In DBT2 most of the load is generated towards one of the tables, which means that only a subset of the LDM threads are used in executing DBT2. Moreover, most of the load is directed towards the primary replicas.

In RonDB 21.10.1 we improved the placement of the primary replicas such that more LDM threads were active in executing the queries. This improved DBT2 performance by about 20%.

Already in RonDB 21.04 we introduced query threads that can be used for reads using Committed Reads. This makes applications using Committed Reads, such as the Online Feature Store used by Hopsworks, scale very well. However, DBT2 uses a very small number of Committed Reads; most reads lock the rows they read. To handle this we modified RonDB 21.10 to allow locked reads to use query threads as well.
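
In SQL terms the difference looks roughly as follows; the table and columns follow the usual DBT2/TPC-C schema and are only used here for illustration:

-- Committed Read: no row lock is taken, can already use query threads in RonDB 21.04
SELECT d_tax FROM district WHERE d_w_id = 1 AND d_id = 1;

-- Locking read, common in DBT2: the row is locked until commit,
-- and can use query threads from RonDB 21.10 onwards
SELECT d_next_o_id FROM district WHERE d_w_id = 1 AND d_id = 1 FOR UPDATE;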

Query threads already have efficient scheduling of read queries towards LDM threads and query threads, ensuring that all CPUs used for LDM threads and query threads are used efficiently. With the ability to schedule locked read operations towards query threads we automatically make more efficient use of the CPU resources in the DBT2 benchmark. This improvement gives 50% better DBT2 performance for RonDB.

Another feature we made use of in the DBT2 benchmark is the ndb_import tool: the load phase of DBT2 now uses ndb_import, which provides very efficient parallel loading. Both RonDB 21.04 and RonDB 21.10 contain improvements of the ndb_import tool to enable DBT2 to use it as a load tool.
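
As a rough sketch, loading a CSV file for one of the DBT2 tables with ndb_import could look like the line below, plus whatever options describe your CSV format; the exact options depend on your setup, so treat this as an illustration rather than a recipe. ndb_import derives the table name from the file name, so warehouse.csv is loaded into the warehouse table of the dbt2 database:

ndb_import dbt2 warehouse.csv --ndb-connectstring=mgmd_host:1186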

Finally in RonDB 21.10.1 we also removed the index statistics mutex in the NDB storage engine as a bottleneck. This improves Sysbench throughput by about 10% at high load. We haven't measured how much it impacts the DBT2 performance.

New RonDB releases

It is the middle of the summer, but we found some time to prepare a new RonDB release. Today we are proud to release new RonDB versions.

https://www.logicalclocks.com/blog/new-rondb-release-21-10-1

RonDB is a stable distribution of NDB Cluster, a key-value store with SQL capabilities. It is based on a release of MySQL, an SQL database server.

RonDB 21.04.1 is the second release of the stable version of RonDB. It contains 3 new features and 18 bug fixes. We plan to support RonDB 21.04 until 2024.

RonDB 21.10.1 is the first beta version of RonDB 21.10. It contains 4 new features that improve throughput of the DBT2 benchmark by 70% compared to RonDB 21.04.1.

Detailed release notes are available in the RonDB documentation.

The new features in RonDB 21.04.1 are:

  1. Support for primary keys using Longvarchar in ClusterJ, the native Java API for RonDB
  2. Support for autoincrement in the ndb_import tool
  3. Killing ndbmtd now uses a graceful shutdown avoiding unnecessary abort

The new features in RonDB 21.10.1 are:

  1. Improved placement of primary replicas
  2. Removing index statistics mutex as a bottleneck in MySQL Server
  3. More flexibility in thread configuration
  4. Use query threads also for Reads which locks the row
Work on RonDB 21.04.2 is already ongoing and is mainly focused on backporting bug fixes from MySQL 8.0.24 through 8.0.26 that are deemed safe and important enough for a backport. The branch used for development of RonDB 21.04 is called 21.04.

Work on RonDB 21.10.2 has already started, where we have integrated changes from MySQL 8.0.26. The branch used for RonDB 21.10 is called 21.10.1.

There is ongoing work on RonDB to improve use of memory resources. This includes making schema memory use the global memory resources. It also introduces some common malloc functions using global memory resources. This development is a base for many future RonDB improvements that will make it easier to develop new features in RonDB. This work is found in the branch schema_memory_21102 currently.

Here is the GitHub tree for RonDB.

The flexible thread configuration was used for some research on thread pipelines presented in this blog.

Wednesday, May 05, 2021

Sysbench evaluation of RonDB

 

Introduction

Sysbench is a tool for benchmarking open source databases. We have integrated Sysbench into the RonDB installation, which makes it extremely easy to run benchmarks with RonDB. This paper will describe the use of these benchmarks in RonDB. These benchmarks were executed with 1 cluster connection per MySQL Server. This limited the scalability per MySQL Server to about 12 VCPUs. Since we executed those benchmarks we have increased the number of cluster connections per MySQL Server to 4, providing scalability to at least 32 VCPUs per MySQL Server.


As preparation for running those benchmarks we have created a RonDB cluster using the Hopsworks framework that is currently used to create RonDB clusters. In these tests all MySQL Servers use the c5.4xlarge VM instance type in AWS (16 VCPUs with 32 GB memory). We have a RonDB management server using the t3a.medium VM instance type. We have tested using two different RonDB clusters, both with 2 data nodes. The first test uses the r5.4xlarge instance type (16 VCPUs and 128 GB memory) and the second test uses the r5n.8xlarge (32 VCPUs and 256 GB memory). It was necessary to use the r5n class since we needed more than 10 Gbit/second of network bandwidth for the OLTP RW test with 32 VCPUs on the data nodes.


In the RonDB documentation you can find more details on how to set up your own RonDB cluster using either our managed version (currently supporting AWS) or our open source shell scripts (currently supporting GCP and Azure). RonDB is a new distribution of MySQL NDB Cluster. All experiments in this blog were performed using RonDB 21.04.0.


The graph below shows the throughput results from the larger data nodes using 8-12 MySQL Server VMs. Now in order to make sense of these numbers we will explain a bit more about Sysbench and how you can tweak Sysbench to serve your purposes for testing RonDB.





Description of Sysbench OLTP RW

The Sysbench OLTP RW benchmark consists of 20 SQL queries executed within one transaction: the transaction starts with a BEGIN statement and ends with a COMMIT statement. After the BEGIN statement follow 10 SELECT statements that each select one row using the primary key of the table. Next follow 4 SELECT queries that select 100 rows within a range, using plain SELECT, SELECT DISTINCT, SELECT … ORDER BY and SELECT SUM(..). Finally there is one INSERT, one DELETE and 2 UPDATE queries.


In pseudo code:

BEGIN
Repeat 10 times: SELECT col(s) FROM TAB WHERE PK=pk
SELECT col(s) FROM TAB WHERE key >= start AND key < (start + 100)
SELECT DISTINCT col(s) FROM TAB WHERE key >= start AND key < (start + 100)
SELECT col(s) FROM TAB WHERE key >= start AND key < (start + 100) ORDER BY key
SELECT SUM(col) FROM TAB WHERE key >= start AND key < (start + 100)
INSERT INTO TAB VALUES (....)
UPDATE TAB SET col=val WHERE PK=pk
UPDATE TAB SET col=val WHERE key=key
DELETE FROM TAB WHERE PK=pk
COMMIT


This is the standard OLTP RW benchmark.

Benchmark Execution

Now I will describe some changes that the Sysbench installation in RonDB can handle. To understand this we will start by showing the default configuration file for Sysbench.


#
# Software definition
#
MYSQL_BIN_INSTALL_DIR="/srv/hops/mysql"
BENCHMARK_TO_RUN="sysbench"
#
# Storage definition (empty here)
#
#
# MySQL Server definition
#
SERVER_HOST="172.31.23.248;172.31.31.222"
MYSQL_PASSWORD='3*=13*8@20.*0@7$?=45'
#
# NDB node definitions (empty here)
#
#
# Benchmark definition
#
SYSBENCH_TEST="oltp_rw"
SYSBENCH_INSTANCES="2"
THREAD_COUNTS_TO_RUN="1;2;4;8;12;16;24;32;48;64;96;112;128"
MAX_TIME="30"


In this configuration file we provide a pointer to the RonDB binaries, the type of benchmark we want to execute, the password for the MySQL Servers and the number of threads to execute in each step of the benchmark. There is also a list of IP addresses of the MySQL Servers in the cluster, and finally we provide the number of Sysbench instances we want to execute.


This configuration file is created automatically by the managed version of RonDB. It is available in the API nodes you created when you created the cluster. It is also available in the MySQL Server VMs if you want to test running with a single MySQL Server colocated with the application.


The default setup will run the standard Sysbench OLTP RW benchmark with one sysbench instance per MySQL Server. To execute this benchmark the following steps are done:


Step 1: Log in to the API node VM where you want to run the benchmark from. The username is ubuntu (in AWS). Thus log in using e.g. ssh ubuntu@IP_address. The IP address is the external IP address that you will find in AWS where your VM instances are listed.


Step 2: After successfully logging in you need to switch to the mysql user using the command:

sudo su - mysql


Step 3: Move to the right directory

cd benchmarks


Step 4: Execute the benchmark

bench_run.sh --default-directory /home/mysql/benchmarks/sysbench_multi
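
Putting the four steps together, a session could look roughly like this (replace the placeholder with the external IP address of your own API node):

ssh ubuntu@<external-ip>
sudo su - mysql
cd benchmarks
bench_run.sh --default-directory /home/mysql/benchmarks/sysbench_multi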

Benchmark Results

As you will discover there are also sysbench_single, dbt2_single and dbt2_multi directories. These are set up for different benchmarks that we will describe in future papers. sysbench_single is the same as sysbench_multi but with only a single MySQL Server. It also exists on the MySQL Server VMs if you want to benchmark from those. Executing a benchmark from the Sysbench machine increases latency since it represents a 3-tiered setup, whereas executing Sysbench in the MySQL Server VM represents a 2-tiered setup and thus the latency is lower.


If you want to study the benchmark in real-time repeat Step 1, 2 and 3 above and then perform the following commands:

cd sysbench_multi/sysbench_results

tail -f oltp_rw_0_0.res


This will display the output from the first Sysbench instance, providing latency numbers and throughput for that instance.


When the benchmark has completed the total throughput is found in the file:

/home/mysql/benchmarks/sysbench_multi/final_results.txt

Modifying Sysbench benchmark

The configuration for sysbench_multi is found in:

/home/mysql/benchmarks/sysbench_multi/autobench.conf


Thus if you want to modify the benchmark you can edit this file.


So how can we modify this benchmark? First you can decide how many SELECT statements using the primary key should be issued. The default is 10. To change this add the following line to autobench.conf:

SB_POINT_SELECTS="5"

With this setting, 5 primary key SELECTs will instead be issued for each transaction.


Next you can decide that you want those primary key SELECTs to retrieve a batch of primary keys. In this case the SELECT will use IN (key1, key2, ..., keyN) in the WHERE clause. To use this, set the number of keys to retrieve per statement in SB_USE_IN_STATEMENT. Thus to set this to 100 add the following line to autobench.conf:


SB_USE_IN_STATEMENT="100"


This means that if SB_POINT_SELECTS is set to 5 and SB_USE_IN_STATEMENT is set to 100 there will be 500 key lookups performed per Sysbench OLTP transaction.
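
As an example, the corresponding autobench.conf fragment for that combination (the parameter names are the ones described above, the values are just for illustration):

SB_POINT_SELECTS="5"
SB_USE_IN_STATEMENT="100"
# 5 point SELECTs per transaction, each fetching 100 keys => 500 key lookups per transaction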


Next it is possible to set the number of range scan SELECTs to perform per transaction. For example, to disable all range scans we can add the following lines to autobench.conf:


SB_SIMPLE_RANGES="0"
SB_ORDER_RANGES="0"
SB_DISTINCT_RANGES="0"
SB_SUM_RANGES="0"


Now it is also possible to modify the range scans. I mentioned that the range scans retrieve 100 rows. The number 100 is changeable through the configuration parameter SB_RANGE_SIZE.


The default behaviour is to retrieve all 100 rows and send them back to the application, thus no filtering is performed. We also have an option to perform filtering in those range scans. In this case only 1 row will be returned, but we will still scan the number of rows specified in SB_RANGE_SIZE. This feature of Sysbench is activated by adding the following line to autobench.conf:


SB_USE_FILTER="yes"


Finally it is possible to remove the use of INSERT, DELETE and UPDATEs. This is done by changing the configuration parameter SYSBENCH_TEST from oltp_rw to oltp_ro.
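
Combining the settings above, an autobench.conf fragment for a read-only, filtered variant with the default range size could look like this (all parameters are the ones described in this section):

SYSBENCH_TEST="oltp_ro"
SB_RANGE_SIZE="100"
SB_USE_FILTER="yes"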


There are many more ways to change the configuration of how to run Sysbench, but these settings are enough for this paper. For more details see the documentation of dbt2-0.37.50; also see the Sysbench tree.

Benchmark Configurations

In my benchmarking reported in this paper I used 2 different configurations. Later we will report more variants of Sysbench testing as well as other benchmark variants.


The first is the standard Sysbench OLTP RW configuration. The second is the standard benchmark with SB_USE_FILTER="yes" added. This was added since the standard benchmark becomes limited by the network bandwidth when using r5.8xlarge instances for the data nodes. This instance type is limited to 10G Ethernet, and the benchmark needs almost 20 Gb/s of networking capacity with the performance that RonDB delivers. This bandwidth is achievable using the r5n instances.


Each test of Sysbench creates the tables and fills them with data. To have a reasonable execution time of the benchmark, each table is filled with 1M rows and each Sysbench instance uses its own table. It is possible to set the number of rows per table, and it is also possible to use multiple tables per Sysbench instance. Here we have used the default settings.


The test runs are executed for a fairly short time to be able to test a large variety of test cases. This means that the results are expected to be a bit better than in sustained runs. To see how results are affected by running for a long time we also ran a few selected tests where we ran a single benchmark for more than 1 hour. The results are in this case around 10% lower than the numbers from the shorter runs. This is mainly due to variance in the throughput that is introduced by the execution of checkpoints in RonDB. Checkpoints consume around 5-10% of the CPU capacity in the RonDB data nodes.

Benchmark setup

In all tests set up here we have started the RonDB cluster using the Hopsworks infrastructure. In all tests we have used c5.4xlarge as the VM instance type for MySQL Servers. This VM has 16 VCPUs and 32 GB of memory, which means more or less 8 Intel Xeon CPU cores. In all tests there are 2 RonDB data nodes. We have tested with 2 types of VM instances here: the first is the r5.4xlarge which has 16 VCPUs with 128 GB of memory, and the second is the r5n.8xlarge which has 32 VCPUs and 256 GB of memory. In the standard Sysbench OLTP RW test the network became a bottleneck when using r5.8xlarge. These VMs can use up to 10 Gb/sec, but in reality we could see that some instances could not go beyond 7 Gb/sec. When switching to r5n.8xlarge this jumped up to 13 Gb/sec immediately, so clearly this bottleneck was due to the AWS infrastructure.


To ensure that the network bottleneck was removed we switched to using r5n.8xlarge instances instead for those benchmarks. These instances are the same as r5.8xlarge except that they can use up to 25 Gb/sec in network bandwidth instead of 10Gb/sec.

Standard Sysbench OLTP RW

The first test we present here is the standard OLTP RW benchmark. When we run this benchmark most of the CPU consumption happens in the MySQL Servers. Each MySQL Server is capable of processing about 4000 TPS for this benchmark. The MySQL Server can process a bit more if the responsiveness of the data node is better; this is likely because the CPU caches are hotter when the response comes back to the MySQL Server. Two 16 VCPU data nodes can in this case handle the load from 4 MySQL Servers; adding a 5th increases the performance slightly, but not much. We compared this to 2 data nodes using 32 VCPUs, and in principle the load these data nodes could handle was doubled.


The response time was very similar in both cases. At extreme loads the larger data nodes had larger latency increases, most likely because we got much closer to the limits of what the network could handle.



The top number here was 34870 TPS at 64 threads from 10 MySQL Servers. In this case 95% of the transactions had a latency of less than 19.7 ms, which means that the time for each SQL query was below 1 millisecond. This meant that almost 700k SQL queries per second were executed. These queries reported back 14.5M rows per second to the application for the larger data nodes, most of them coming from the 4 range scan queries in the benchmark. Each of those rows is a bit larger than 100 bytes, thus around 2 GByte per second of application data is transferred to the application (about 25% of this is aggregated in the MySQL Server when using the SUM range scan).

Sysbench OLTP RW with filtering of scans

Given that Sysbench OLTP RW is to a great extent a networking test, we also wanted to perform a test that did a bit more processing but reported back a smaller number of rows. We achieved this by setting SB_USE_FILTER="yes" in the benchmark configuration file. This means that instead of each range scan SELECT reporting back 100 rows, it will read 100 rows, filter out 99 of them and report only 1 of the 100 rows. This decreases the number of rows sent back to about 1M rows per second. Thus this test is a better evaluator of the CPU efficiency of RonDB, whereas the standard Sysbench OLTP RW is a good evaluator of RonDB's ability to ship tons of rows between the application and the database engine.




At first we wanted to see the effect the number of MySQL servers had on the throughput in this benchmark. We see the results of this in the image above. We see that there is an increase in throughput going from 8 to 12 MySQL Servers. However the additional effect of each added MySQL Server is diminishing. There is very little to gain going beyond 10 MySQL Servers. The optimal use of computing resources is most likely achieved around 8-9 MySQL Servers.

Adding additional MySQL servers also has an impact on the variability of the latency. So probably the overall best fit here is to use about 2x more CPU resources on the MySQL Servers compared to the CPU resources in the RonDB data nodes. This rule is based on this benchmark and isn’t necessarily true for another use case.


The results with the smaller data nodes (r5.4xlarge) are shown as the red line, which used 5 MySQL Servers in the test.


The rule definitely changes when using the key-value store APIs that RonDB provides. These are at least 100% more efficient compared to using SQL.



A key-value store needs to be a LATS database (low Latency, high Availability, high Throughput, Scalable storage). In this paper we have focused on showing Throughput and Latency. Above is the graph showing how latency is affected by the number of threads in the MySQL Server.


Many applications have strict requirements on the maximum latency of transactions. For example, if the application requires the response time to be smaller than 20 ms, then we can see in the graph that we can use around 60 threads towards each MySQL Server. At this number of threads r5.4xlarge delivers 22500 TPS (450k QPS) and r5n.8xlarge delivers twice that number, 45000 TPS (900k QPS).


The base latency in an unloaded cluster is a bit below 6 milliseconds. This number varies a bit depending on where exactly the VMs that get started for you are located. Most of this latency is network latency. Each network jump in AWS has been reported to be around 40-50 microseconds, and one transaction performs around 100 of those network jumps in sequence. Thus almost two-thirds of the base latency comes from getting messages across the network. At higher loads the queueing while waiting for messages to be executed becomes dominating. Benchmarks where everything executes on a single computer have a base latency of around 2 milliseconds per Sysbench transaction, which confirms the above calculation.


Saturday, January 17, 2009

New DBT2 version uploaded with more documentation of new scripts

I have had a number of requests for help on how to use the DBT2
tree I'm maintaining on www.iclaustron.com. There is an extensive
set of scripts used to make it very easy to run DBT2 runs and
to start and stop cluster nodes and MySQL Servers. I personally
also use it to start MySQL Servers and clusters when not
using DBT2.

However these scripts haven't had an overall description yet,
although each component is very thoroughly documented through
--help on the scripts (I tend to document these things very
heavily since I otherwise forget them myself).

Now I have added a new README file, README-ICLAUSTRON, which
explains which scripts are used and how they relate to each
other, which configuration files to set up, and where to find
example configuration files.

Hopefully this will make it easier to use DBT2,
particularly DBT2 for MySQL Cluster.

Tuesday, September 09, 2008

Linear Scalability of MySQL Cluster using DBT2

Achieving linear scalability of MySQL Cluster using the DBT2
benchmark has been a goal of mine for a long time now. Last
week I finally found the last issue that limited the scalability.
As usual, once you discover the issue it turns out to be trivial
(in this case it was fixed by inserting three 0's in the NDB
handler code).

We can now achieve ~41k TPM on a 2-node cluster, ~81k TPM on a
4-node cluster and ~159k TPM on an 8-node cluster, giving roughly
97% higher performance each time the number of nodes is doubled.
So there is nothing stopping us now from going all the way up to
1M TPM except lack of hardware :)

I've learned a lot about what affects scalability and what
affects performance of MySQL Cluster by performing those
experiments and I'll continue writing up those experiences on
my blog here. I have also uploaded a new DBT2 version where I
added a lot of new features to DBT2, improved the performance
of the benchmark itself and also ensured that running with many
parallel DBT2 drivers still provides correct results when
adding the results together. It can be downloaded from
www.iclaustron.com.

Thursday, July 31, 2008

1: Making MySQL Cluster scale perfectly in the DBT2 benchmark: Initial discussion

Since 2006 H1 I've been working on benchmarking MySQL
Cluster using the DBT2 test suite. Initially this meant
a fair amount of work on the test suite itself and also
a set of scripts to start and stop NDB data nodes, MySQL
Servers and all the other processes of the DBT2 test.
(These scripts and the DBT2 tests I'm using are available
for download at www.iclaustron.com)

Initially I worked with an early version of MySQL Cluster
based on version 5.1 and this meant that I hit a number
of the performance bugs that had appeared there in the
development process. Nowadays the stability is really good,
so for the most part I've spent my time focusing on what is
required in the operating system and the benchmark application
for optimum scalability.

Early on I discovered some basic features that were required
to get optimum performance of MySQL Cluster in those cases.
One of them is to simply use partitioning properly. In the
case of DBT2 most tables (all except the ITEM table) can
be partitioned on the warehouse id. So the new feature I
developed as part of 5.1 came in handy here. It's possible to
use either PARTITION BY KEY (warehouse_id) or PARTITION BY
HASH (warehouse_id). Personally I prefer PARTITION BY HASH
since it spreads the warehouses perfectly amongst the data
nodes. However in 5.1 this isn't fully supported, so one has
to start the MySQL Server using the flag --new to use this
feature with MySQL Cluster.
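
As a sketch, the partitioning clause is simply appended to the
table definition; the table below is an abbreviated version of
the DBT2 DISTRICT table and is only meant to show the syntax:

CREATE TABLE district (
  d_w_id INT NOT NULL,
  d_id INT NOT NULL,
  d_next_o_id INT,
  PRIMARY KEY (d_w_id, d_id)
) ENGINE=NDBCLUSTER
PARTITION BY KEY (d_w_id);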

The second one was the ability to use the transaction
coordinator on the same node as the warehouse the
transaction is handling. This was handled by a new
feature introduced in MySQL Cluster Carrier Grade
Edition 6.3 whereby the transaction coordinator is
started on the node where the first query is targeted.
This works perfectly for DBT2 and for many other
applications and it's fairly easy to change your
application if it doesn't fit immediately.

The next feature was to ensure that sending uses as
big buffers as possible and also to avoid wake-up
costs. Both those features meant changes to the
scheduler in the data nodes of the MySQL Cluster.
These changes work very well in most cases where
there are sufficient CPU resources for the data nodes.
This feature was also introduced in MySQL Cluster CGE
version 6.3.

Another feature which is very important to achieve
optimum scalability is to ensure that the MySQL Server
starts scans only on the data nodes where it will
actually find the data. This is done through the use
of partition pruning as introduced in MySQL version
5.1. Unfortunately a late bug was introduced, which I
recently discovered, that decreased scalability for DBT2
(this is bug#37934, which contains a patch that fixes the
bug; it hasn't been pushed yet to any 6.3 version).

With these features there were still a number of scalability
issues remaining in DBT2. One was the obvious one that the
ITEM table is spread on all data nodes and thus reads of the
ITEM table will use network sockets that isn't so "hot".
There are two solutions to this, one is that MySQL Cluster
implements some tables as fully replicated on all data nodes.
This might arrive some time in the future, the other variant
uses standard MySQL techniques. One places the table in
another storage engine, e.g. InnoDB, and uses replication to
spread the updates to all the MySQL Servers in the cluster.
This technique should be a technique that can be applied to
many web applications where there are tables that need to be
in MySQL Cluster to handle availability issues and that the
data is required to be updated through proper transactions, but
there are also other tables which can be updated in a lazy
manner.
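
A minimal sketch of that second variant, assuming the DBT2
schema has already been created, is simply to move the table
to InnoDB and let ordinary MySQL replication keep it in sync
across the MySQL Servers:

ALTER TABLE item ENGINE=InnoDB;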

Finally there is one more remaining issue, and this is when
the MySQL Server doesn't work on partitioned data. That is, in
the case of DBT2, if all MySQL Servers access data in a certain
node group then the data nodes will have more network sockets to
work with, which increases the cost of networking. This limits
scalability as well.

In the case of DBT2 this can be avoided by using a spread
parameter that ensures that a certain MySQL Server only uses a
certain node group in the MySQL Cluster. In a generic application
this would be handled by an intelligent load balancer that
ensures that MySQL Servers work on different partitions of
the data in the application.

What I will present in future blogs is some data on how much
impact the effects mentioned above have on the scalability of
the DBT2 benchmark for MySQL Cluster.

What is more surprising is that there are also a number of
other issues related to the use of the operating system which
aren't obvious at all. I will present those as well and what
they mean in terms of scalability for MySQL Cluster using DBT2.

Finally, a real application will seldom scale perfectly, so in
any real application it's also important to minimize the impact
of scalability issues. The main technology to use here is
cluster interconnects, and I will show how the use of cluster
interconnects affects scalability issues in MySQL Cluster.

Note that the numbers from these DBT2 runs are merely used here
to compare different configurations of MySQL Cluster.