Friday, November 03, 2023

Presentation of RonDB at Meetup

For those who didn't have a chance to come to Stockholm and listen to the presentation of RonDB, here are the slides from the presentation.

The presentation covers the requirements, architecture and status of RonDB, as well as its use in Hopsworks and other applications.

Thursday, October 26, 2023

Results on comparing new Intel/AMD VMs with older VM types using RonDB

In the Hopsworks cloud offering for GCP one can select from a fairly large variety of VM types. I am currently working on extending this list to also include the latest generation of VM types. This blog focuses on the impact of those new VM types on benchmarks using RonDB.

The newer VM types are the c3d series, which uses 4th generation AMD EPYC CPUs, and the c3 series, which contains VMs using Intel Sapphire Rapids CPUs. AWS has also introduced similar new VM types, but this blog discusses tests performed on VMs in GCP.

The older VM type we compared against for the MySQL Servers was n2-standard-16. This VM uses an Intel Cascade Lake Xeon processor, which represents the second generation of Intel Xeon chips, whereas Sapphire Rapids represents the 4th generation of Intel Xeon.

The RonDB data nodes used e2-highmem-16 as the baseline for comparison. This VM type uses either an Intel Xeon of the second generation or an AMD EPYC of the second generation.

The benchmark used was Sysbench OLTP RW, based on version 0.4.12.19, which is included in the RonDB tarball and is set up on the API nodes automatically by our cloud offering. This makes it extremely easy to replicate the benchmarks. We use Consul as a load balancer, so the benchmark process connects to a single host, onlinefs.mysql.service.consul. In reality this address maps to the set of MySQL Servers in the RonDB cluster. We used 3 MySQL Servers in the tests. The setup used 2 RonDB data nodes in one node group.
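
To give an idea of what this looks like from the application side, here is a minimal Python sketch of connecting through the Consul name, so that any of the 3 MySQL Servers can end up serving the connection. It assumes the mysql-connector-python package; the port and user shown are illustrative assumptions, not the actual benchmark configuration.

import mysql.connector  # assumed client library; any MySQL client works

# Consul resolves this name to one of the MySQL Servers in the cluster
conn = mysql.connector.connect(
    host="onlinefs.mysql.service.consul",
    port=3306,                 # assumed default MySQL port
    user="benchmark_user",     # hypothetical user
    password="...",            # credentials from the cluster setup
    database="sbtest",         # Sysbench creates its tables here
)
cur = conn.cursor()
cur.execute("SELECT COUNT(*) FROM sbtest1")  # a Sysbench OLTP table
print(cur.fetchone())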

Thus in the Hopsworks cloud we get a load balanced RonDB Data Service as part of the infrastructure of the Hopsworks Feature Store.

We first executed the benchmark using the old VM types to get a baseline. The next step was to upgrade the RonDB MySQL Servers to use c3d-highmem-16, thus the same amount of memory and number of CPUs as in n2-standard-16, but upgraded from 2nd generation Intel to 4th generation AMD.

This mainly impacted throughput. The baseline experiment executed 9000 TPS and was limited by the CPUs in the MySQL Servers (they used 1550% of the 1600% available). The c3d-highmem-16 delivered 11400 TPS while using only 1000% of the available 1600%. Thus the throughput per CPU increased by around 100%. In this execution the bottleneck of the benchmark was the RonDB data nodes.

The benchmark API node was consistently an n2-standard-48 VM. This meant that most communication went from an API VM of the old type, to a MySQL Server of the new type, to a RonDB data node VM of the old type. Thus all communication involved an old VM type, and the network latency was the same in this experiment as in the baseline experiment.

The change from one VM type to another used the reconfiguration support RonDB has in its cloud offering. This change is an online operation where the cluster remains operational, and the new MySQL Servers are included in the Consul setup as soon as they have started up. Only when nodes are stopped can temporary errors happen, and these can be handled with simple retry logic, as in the sketch below.
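
A minimal sketch of such retry logic in Python, assuming the mysql-connector-python package; the number of retries and the backoff are arbitrary choices for illustration:

import time
import mysql.connector

def run_with_retry(conn, query, retries=5):
    # During reconfiguration, a stopping node can surface as a temporary
    # error; a short wait and a retry is normally all that is needed.
    for attempt in range(retries):
        try:
            cur = conn.cursor()
            cur.execute(query)
            return cur.fetchall()
        except mysql.connector.Error:
            if attempt == retries - 1:
                raise                        # give up after the last attempt
            time.sleep(0.1 * (attempt + 1))  # small increasing backoff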

Next we also changed the VM type of the RonDB data nodes to c3d-highmem-16, using the same online reconfiguration as for the MySQL Servers.

What we quickly noted in this setup was that the latency per transaction was cut in half, so the time for a single-threaded benchmark run dropped to less than half. Thus it is clear that communication between 2 VMs of the new type improves network latency by more than 100%. The throughput now increased to 17800 TPS, and the bottleneck was now in the MySQL Servers. Thus the throughput improvement is almost 98% and network latency improved by more than 100%.

When reading the announcement of the C3 machine series and the description of the C3D machine series, it is clear that the new IPU (Infrastructure Processing Unit) that takes care of offloading networking is a major reason for this improved network latency.

Analysing the Sysbench transaction in this setup, there are around 100 network messages, most of them sent in serial order. Still, the latency of a transaction execution is no more than 6 milliseconds to execute the 20 SQL queries involved in the OLTP RW transaction. Thus a mean of 60 microseconds per message, and this includes the time to also execute the RonDB data node code and the RonDB MySQL Server code.

The next step was to again change the MySQL Server VMs, this time to c3-highmem-22. Unfortunately the VM type c3-highmem-16 doesn't exist, so the comparison isn't perfect, but at least it gives a good estimate of the improvements in Intel's 4th generation CPUs.

The network latency was the same for the Intel and AMD 4th generation VM types. The throughput increased by around 40%, up to around 24000 TPS. Since the number of CPUs also increased by around 40%, it seems that the c3 series and the c3d series are very similar in handling throughput when used for RonDB MySQL Servers.

To test the throughput of those new VMs as data nodes, we ran the test using the c3-highmem-8 and c3d-highmem-8 VM types as RonDB data node VMs. The performance of those two VM types was almost indistinguishable, to the point where I started wondering if they were the same CPUs. Throughput was half the throughput of the 16 VCPU VMs.

The main conclusion of these tests is that upgrading from 2nd generation x86 CPUs to 4th generation x86 CPUs in the GCP cloud provides a 100% improvement in throughput and a similar improvement in network latency.

The price of those VMs is higher, but the increase is substantially less than 100%. So it makes a lot of sense to start using those new VM types for new applications.

The tests were performed using RonDB version 21.04.15. We are about to release a new LTS version of RonDB, version 22.10.1. There will be a more thorough benchmark report when it is released.

Friday, September 29, 2023

Release of RonDB 21.04.15

We have lately worked hard on ensuring stability and adding the features required by our customers. Thus the RonDB 21.04.15 release has reached a very high quality level and will be able to sustain its users until they desire to upgrade to a newer release of RonDB.

Most of the changes in this release are related to the new REST API server, which makes it possible to read rows using single reads or batched reads with primary key lookups, through a REST protocol or through a gRPC protocol. The REST API server also supports reading directly from the Hopsworks Feature Store, taking the metadata model of the Hopsworks Feature Store into account.
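
As a sketch of what a primary key read could look like through the REST protocol, the code below uses Python with the requests package. The endpoint path, port and JSON field names are illustrative assumptions rather than the exact RonDB REST API protocol (see the RonDB documentation for that).

import requests  # assumed HTTP client package

# Hypothetical single primary key read against a table
resp = requests.post(
    "http://rest-server:4406/0.1.0/mydb/mytable/pk-read",  # assumed host/port/path
    json={
        "filters": [{"column": "id", "value": 42}],  # primary key column(s)
        "readColumns": [{"column": "value"}],        # columns to return
    },
)
print(resp.json())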

Much of the work around RonDB is centered around automated management of RonDB. To this end we have developed the ndb-agent, which makes it possible to create a cluster, stop the cluster, start the cluster again, take a backup, delete a backup, restore from a backup and finally reconfigure the cluster as an online operation.

Reconfiguring the cluster means adding or removing replicas and increasing the size of data node VMs. It also means that MySQL Server VMs can be added, changed and dropped as needed by the application.

All of those operations are already operational and working. We are now working on an improvement that speeds up the change process significantly. Adding a new MySQL Server can now be done in 2-3 minutes, and most of this time is spent on creating the new VM in the chosen cloud (Hopsworks supports AWS, GCP and Azure).

The new ndb-agent works in the same fashion as Kubernetes, by maintaining a desired state. This means that it is fairly straightforward for the ndb-agent to support both our cloud offering and a Kubernetes setup.
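
Conceptually the agent loop looks something like the Python sketch below. This is purely illustrative; the helper functions and state fields are hypothetical, and the real ndb-agent of course tracks much richer state.

def fetch_desired_state():
    # Hypothetical: in reality read from the cloud control plane
    return {"num_mysqld": 3, "datanode_vm_type": "c3d-highmem-16"}

def fetch_observed_state():
    # Hypothetical: in reality probe the running cluster
    return {"num_mysqld": 2, "datanode_vm_type": "e2-highmem-16"}

def reconcile(desired, observed):
    # Compare desired and observed cluster state and derive actions,
    # e.g. add a MySQL Server VM or change the data node VM type.
    actions = []
    if observed["num_mysqld"] < desired["num_mysqld"]:
        actions.append("add_mysqld_vm")
    if observed["datanode_vm_type"] != desired["datanode_vm_type"]:
        actions.append("replace_datanode_vms")
    return actions

# The agent repeats this until the cluster has converged
for action in reconcile(fetch_desired_state(), fetch_observed_state()):
    print("executing:", action)  # the real agent performs the change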

RonDB development is now focused on the new RonDB release 22.10.1. This will introduce 8 new features, the most important of which is support for variable sized disk columns. RonDB 22.10 has been in development and testing for almost 3 years already, so it is already a very stable release. In addition it brings a number of performance improvements.

The release notes for RonDB 21.04.15.

The full set of new features in RonDB 21.04.

The full set of new features in RonDB 22.10.

The new Hopsworks release also makes use of replication between RonDB clusters. A Hopsworks cluster can start with a single small RonDB cluster and grow into an Enterprise setup with several large RonDB clusters replicated between regions far away from each other.

RonDB is used to handle the Online Feature Store, the metadata of the Hopsworks Feature Store and the metadata of HopsFS. HopsFS, a distributed file system that can store many petabytes of data in an efficient manner, is the storage layer of the Offline Feature Store. The Hopsworks Offline Feature Store makes use of DuckDB to perform complex analysis of the data to train AI models and perform batch inferencing.

Thus RonDB is a critically important component in the next generation AI systems developed at Hopsworks. Companies around the world are considering how to build their AI models and supporting systems. Hopsworks provides a platform for those companies, both small and very large ones.

Hopsworks provides a free service where anyone can get a free Hopsworks account at https://app.hopsworks.ai and try out the service themselves.

Tuesday, August 01, 2023

Modernising a Distributed Hash implementation

As part of writing my Ph.D. thesis about 30 years ago, I invented a new distributed hash algorithm called LH^3. The idea is to apply the hashing at 3 levels: the first level uses the hash to find the table partition where the key is stored, the second level uses the hash to find the page where the key is stored, and the final level uses the hash to find the hash bucket where the key is stored.

The algorithm is based on linear hashing and on distributed linear hashing developed by Witold Litwin, with whom I had the privilege of having many interesting discussions at the time. My professor, Tore Risch, had collaborated a lot with Witold Litwin. I also took the idea of storing the hash value in the hash bucket, to avoid having to compare every key, from Mikael Pettersson, another researcher at Linköping University.

The basic idea is described in my Ph.D. thesis. The implementation in MySQL Cluster and in RonDB (a fork of MySQL Cluster) is still very similar to it. This hash table is one of the reasons for the good performance of RonDB; it makes sure that a hash lookup normally incurs only one CPU cache miss during the hash search. A simplified sketch of the three levels follows below.
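
Here is a simplified sketch of the three levels in Python. It is illustrative only: the real implementation uses the linear hashing split function rather than plain modulo, and operates on fixed size pages.

def hash_function(key):
    # Stand-in for the real hash function (see XX3_HASH64 below)
    return hash(key) & 0xFFFFFFFFFFFFFFFF

def lookup(table, key):
    h = hash_function(key)
    # Level 1: the hash selects the table partition
    partition = table["partitions"][h % len(table["partitions"])]
    # Level 2: the hash selects the page within the partition
    page = partition["pages"][(h >> 8) % len(partition["pages"])]
    # Level 3: the hash selects the bucket within the page
    bucket = page["buckets"][(h >> 16) % len(page["buckets"])]
    # The stored hash value is compared first, so the full key is only
    # compared on a hash match; normally only the page access misses
    # the CPU cache.
    for stored_hash, stored_key, row_ref in bucket:
        if stored_hash == h and stored_key == key:
            return row_ref
    return None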

At Hopsworks we are moving the implementation of RonDB forward with a new generation of developers; in this particular work I am collaborating with Axel Svensson. The best method to learn the code is, as usual, to rewrite it. RonDB has access to more memory allocation interfaces than MySQL Cluster, so I thought this could be useful.

Interestingly, going through the requirements on memory allocations with a fresh mind leads to more or less the same conclusions as 30 years ago. So after 30 years of developing the product, one can rediscover the basic ideas underlying it.

The original implementation made it possible to perform scan operations using the hash index. However, this led to a 3x increase in the complexity of the implementation. Luckily, nowadays one can scan using the row storage instead. Thus in RonDB we have removed the possibility to scan using the hash index. This opens up for rewriting the hash index with much less complexity.

A hash implementation thus consists of the following parts: a dynamic array to find the page, a method to handle the memory layout of the page, a method to handle the individual hash buckets, and finally a method to handle overflow buckets.

What we found is that the dynamic array can be implemented much more efficiently using the new memory allocation interfaces. The overflow buckets can potentially be handled with techniques other than plain overflow buckets; one could also handle them using recursive hashing.

What we have also found is that the idea of using pages, with hash buckets inside those pages, is still a very good one for a hash table that must adapt to both increasing and decreasing sizes.

Modern CPUs have new instructions that handle searches in parallel; these can be used to speed up the lookup within the hash buckets, as illustrated below.
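
As an analogy in Python with NumPy (real SIMD code would use CPU intrinsics in C++): all hash values stored in a bucket can be compared against the probed hash in one data-parallel operation, yielding the candidate slots whose keys need a full comparison.

import numpy as np

bucket_hashes = np.array([0x1F2A, 0x9C31, 0x77AB, 0x9C31], dtype=np.uint64)
probe = np.uint64(0x9C31)
candidates = np.nonzero(bucket_hashes == probe)[0]  # slots to key-compare
print(candidates)  # -> [1 3]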

On top of this, the hash function used in RonDB has been MD5; this is being replaced with a new hash function, XX3_HASH64, that is about 30x faster.
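
A quick illustration using the Python binding of the xxHash family (assuming the xxhash package; RonDB naturally calls the C implementation directly):

import hashlib
import xxhash  # assumed binding for the xxHash family

key = b"some_primary_key"
md5_based = int.from_bytes(hashlib.md5(key).digest()[:8], "little")
xx3_based = xxhash.xxh3_64_intdigest(key)  # the XX3_HASH64 function
print(hex(md5_based), hex(xx3_based))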

A new requirement in RonDB compared to MySQL Cluster is that we work with applications that constantly create and drop tables. The number of tables can also be substantial, and thus there can be many very small tables. This means that a small table could make use of an alternative, much simpler implementation to save memory.

This is work in progress, and it serves a number of purposes: it is a nice way for new developers to learn the RonDB code base, it saves memory for hash indexes, it lets us optimise the implementation even further, it simplifies the code and thus makes it easier to support, and it makes use of new modern CPU instructions to substantially speed up the hash index lookups.

Saturday, July 08, 2023

Number theory for birthdays and IQ tests

I have always been interested in numbers and have played with them since I was a small kid. Every time someone has a birthday I am ready to provide an alternative to the normal decimal birthday, e.g. having your 100th birthday when you really have your 49th birthday in decimal numbers.

So here is some number theory for birthdays and IQ tests that you can play around with on your vacation days to prepare for future birthdays and IQ tests. Have fun.

First a short introduction to numbers and number bases. When we use numbers we normally assume we're counting with decimal numbers. Decimal numbers means that we are using 10 as the base. Thus when we say that someone is 25 years old we really mean that he is 2 * 10^1 + 5 * 10^0 = 2 * 10 + 5 years old. If instead someone has his 25th birthday in octal numbers, what we are saying is that he is 2 * 8^1 + 5 * 8^0 = 2 * 8 + 5 = 21 years old in decimal numbers.
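
This is easy to play with in a few lines of Python:

def to_base(n, base):
    # Render the decimal number n in the given base; digits above 9 are
    # written A, B, C, ... as in hexadecimal.
    digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    result = ""
    while n > 0:
        result = digits[n % base] + result
        n //= base
    return result or "0"

print(to_base(21, 8))  # -> 25, the octal 25th birthday at decimal age 21
print(to_base(49, 7))  # -> 100, a 100th birthday at decimal age 49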

So by varying the number base we can change almost any birthday into an even birthday. For example, when can we say that we have our 100th birthday? The smallest base is 2, which means that our first 100th birthday happens already at our 4th birthday using base 2. Later in life we can have a 100th birthday at our 9th, 16th, 25th, 36th, 49th, 64th, 81st and 100th birthdays. It is very unlikely that someone will celebrate their 100th birthday in base 11, which would happen at the age of 121.

Thus celebrating 100 years happens quite a few times, but still not very often.

Other even numbers are more common. We can have our 20th birthday every second year from our 6th birthday. To be 20, the minimum base is 3, since the digit 2 cannot be used in base 2, which only has the digits 0 and 1. Thus 2 * 3 + 0 = 6 is the minimum age to become 20.

However, after 6 years of age you can have your 20th birthday at any birthday with an even number. Thus e.g. at age 38 you will be 20 using base 19: 2 * 19 + 0 = 38.

If you want to search for an appropriate age to celebrate on your next birthday, start by dividing your age into a product of prime numbers. E.g. 38 is the product of 2 and 19, which are both prime numbers. Thus the most even numbers you can get here are 20 in base 19 and 100110 in base 2. If your age is 18 you have more options; it divides into the prime numbers 2, 3 and 3, since 18 = 2 * 3 * 3. So here you can have your 200th birthday in base 3 and your 10010th in base 2. A small search program is shown below.
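
Rather than factoring by hand, one can let Python search: an age ends in a zero in base b exactly when b divides the age. Using the to_base helper from above:

def even_birthdays(age):
    # All bases in which the age is written with a trailing zero
    return [(b, to_base(age, b)) for b in range(2, age + 1) if age % b == 0]

print(even_birthdays(38))  # -> [(2, '100110'), (19, '20'), (38, '10')]
print(even_birthdays(18))  # includes (3, '200') and (2, '10010')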

However, you stumble into issues with the above approach when the age you have reached is itself a prime number. For example, when your 37th birthday approaches, how will you turn it into an even number to celebrate? The only obvious even number to reach here is 10, which can be achieved for any prime age by using the age itself as the base.

Here the age 25 comes to the rescue, which is seen as an even birthday by most people. Actually, we can prove that every birthday with an odd number of years can become 25 in some base, as long as the odd number is at least 17.

Proof: The proof is fairly simple. First of all, an odd number can always be written as 2 * k + 1 where k is some number. Second, the minimum base to use for an age of 25 is 6, since the digit 5 doesn't exist in bases 2, 3, 4 and 5. Thus the first time to have your 25th birthday is on your 2 * 6 + 5 = 17th birthday.

So choose any odd number larger than or equal to 17. This number can always be written as 2 * k + 1 where k is at least 8. But it can also be written as 2 * (k - 2) + (1 + 2 * 2) = 2 * (k - 2) + 5, which reads as 25 in base k - 2. Thus to calculate the number base to use, one calculates:

(Odd - 5) / 2. Thus with 37 you get (37 - 5) / 2 = 16, so on your 37th birthday you have your 25th birthday in base 16.
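
Or in code, using the formula above together with the to_base helper:

def base_for_25(odd_age):
    # For any odd age >= 17, age = 2 * b + 5 where b = (age - 5) / 2
    assert odd_age >= 17 and odd_age % 2 == 1
    return (odd_age - 5) // 2

print(base_for_25(37))   # -> 16
print(to_base(37, 16))   # -> 25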

Isn't it nice to know that, for the rest of your life after reaching 17 years of age, you can always claim to be 20 or 25 years old :)

Have fun on future birthdays figuring out what age you want to have this time.

Actually, base 10 was selected in Arabia; many older cultures used base 12, and some money systems still have the number 12 in them. If you are working with computer programs it is very popular to use hexadecimal numbers, using base 16 with the digits 0-9 and A, B, C, D, E and F.

So on to IQ tests. Most of you have seen tests like the one below:

1, 4, 9, 16, ?

This one is fairly easy: it is the square of the index, thus x^2 is the function behind this number series. The next number in the series is 5 * 5 = 25.

Let's take a bit more complex number series now.

2, 9, 28, 65, ?

This one is a bit more difficult to see directly, so I will give a hint: it is based on the function x^3 + 1. Thus the next number is 5 * 5 * 5 + 1 = 126.

Now let's take another one; this time we use the function x^2 - 2 * x - 2.

-3, -2, 1, 6, ?

This looks difficult at the outset, but since we know the function we can cheat and simply set the answer to 5 * 5 - 2 * 5 - 2 = 13.

So how does one solve this type of IQ test in a quick manner? It is fairly simple using difference techniques, laid out a bit like Pascal's triangle.

So write the difference between the numbers and then the difference of the differences.

In the above calculation we write it up as follows.

-3,  -2,  1,  6,  13

   1,  3,  5,  7

      2,  2,  2

Interestingly, the first difference is simply a linearly increasing function, which is very easy to see, and the second difference is simply constant, so even easier.

We can see that the difference function is 2 * x - 1 and the second difference is the constant 2.

For those familiar with derivatives, this is no coincidence: 2 * x - 2 is the derivative of x^2 - 2 * x - 2 and 2 is its second derivative, and the discrete differences closely mirror the derivatives.
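
The whole technique fits in a few lines of Python. This is a sketch that assumes the series comes from a polynomial, so that the difference rows eventually become constant:

def next_term(series):
    # Build the difference table row by row until a row is constant,
    # then sum the last element of every row to extrapolate.
    rows = [list(series)]
    while len(set(rows[-1])) > 1:
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    return sum(row[-1] for row in rows)

print(next_term([-3, -2, 1, 6]))  # -> 13
print(next_term([1, 4, 9, 16]))   # -> 25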

So now let's try whether this works in practice by hand; here is a number series again:

0, 1, 8, 27, ?

We use the difference technique:

0,  1,  8,  27, => 64

  1,  7,  19, => 37, 

     6,  12, => 18

So we write the answer as 64. Now let's check the answer; the function I used in this case was:

x^3 - 3 * x^2 + 3 * x - 1

Thus using x = 5 we get 5^3 - 3 * 5^2 + 3 * 5 - 1 = 125 - 75 + 15 - 1 = 64.

Thus we found the correct answer to a fairly complex IQ test, and we can claim to be more intelligent than we really are :)

Have fun in showing off your capabilities in IQ tests.

Saturday, April 29, 2023

Status report RonDB development

What is going on with RonDB development? Actually a lot, but most of it happens under the radar at the moment. So this blog will give anyone interested an idea of what is going on.

RonDB core development is the continued development of the fork of MySQL NDB Cluster. For the most part this development is focused on our production version, RonDB 21.04, which is used at numerous companies in production. Development is very much centered around supporting the Hopsworks platform. This means that we have now added 27 new features on top of MySQL NDB Cluster, along with 127 bug fixes. The latest feature is an improvement of node recovery that can bring up to 4-8x shorter restart times. This was seen as an important improvement to ensure that online reconfiguration of RonDB in our cloud setting is speedy.

We now have 3 main versions of RonDB core: RonDB 21.04, which we use in production; RonDB 22.10, which is prepared for use in production and brings the possibility to store 10x more data than RonDB 21.04, important for large customers and large applications; and the next RonDB generation, RonDB 23.04, on which work has started and which is already integrated with MySQL 8.0.33.

Managed RonDB has been delivered in two steps. The first integrated the possibility to start up, back up, stop and restore a RonDB database. The configuration is specified as the number of replicas, the number of MySQL Servers and the VM types for the various node types. One can start the cluster either through a UI or through Terraform.

Now the second step is working as well. This step introduces online RonDB reconfiguration: one can change the number of replicas, change the VM types of the nodes and increase/decrease the number of MySQL Servers. This is currently an experimental feature available to our customers on request. The change is fully online and has been verified in internal hackathons where our developers test various Hopsworks features while the RonDB cluster is reconfiguring.

We are now working on a third step that makes changes more efficient and uses the Kubernetes model with a desired state. The cloud specifies the new desired state and the agent software ensures that the RonDB cluster moves to this new desired state. Anyone can run RonDB in Docker and try out those new changes on their own laptop.

Those steps are also available using Docker through the rondb-docker GitHub tree. We use Docker as a development platform, making it easy to test thousands of state transformations at various levels. Soon there will be videos and blogs describing how to use Docker to test RonDB reconciliation, accessible from the GitHub tree.

It doesn't stop there: a major focus is currently on developing the first version of the RonDB REST API server. This makes it easy to access RonDB through a REST service, in parallel with the MySQL Server and the more efficient NDB API applications. We have already seen great interest in this API even before it is completed.

We are also working on automating replication between clusters in different regions.

As usual there is also a set of interesting product ideas on how to improve the RonDB core: even more flexibility in growing and shrinking, making use of SIMD operations to speed up various parts of RonDB, and some thoughts on long-term development projects as well.

As usual, a benchmark or two is in the works as well. These are further developments of the benchmark described on www.rondb.com, where we show the throughput and latency of YCSB both in normal operation and during recovery.

Thursday, March 23, 2023

Laptop vs Desktop for RonDB development

Most developers today use laptops for all development work. For the last 15 years I have considered desktops and laptops to be very similar in performance and use cases. This is no longer the case, as I will discuss in this blog.

Personally I use a mix of laptops and desktops. For me the most important thing as a developer is the screen resolution and the speed of compilation. But I have now found that desktops can be very useful for the test phase of a development project, in particular the later testing phases.

Many years ago I read that one can increase productivity by 30% by using a screen with higher resolution, thus fitting more things on the screen at the same time. Personally I have so far found 27" screens to be the best size; larger means neck pain and smaller means that productivity suffers. The screen resolution should be as high as your budget allows.

My experience is that modern laptops can be quite efficient at compilation. There is very little difference in compilation time compared to desktops.

However, recently I tested running our new RonDB Docker clusters on laptops and desktops. What I have seen is that the performance of these tests can differ by up to 4x.

I think the reason for this large difference is that desktops can sustain high performance for a long time. Some modern desktops can handle CPUs that use more than 200W, whereas most laptops are limited to about 45W. For a compilation that only runs for about 5 minutes and has some serialisation, the difference becomes very small. The most important part for compilation is how fast the CPU is at single-threaded execution and that it can scale the compilation to a decent number of CPUs.

However, running a development environment for RonDB means running a cluster on a single machine with two data node processes, two MySQL Server processes, a management server process and of course any number of application processes. A laptop can handle this nicely, and the performance of a single-threaded application is the same on a laptop and a desktop. However, when scaling the test to many threads the laptop hits a wall, whereas the desktop simply continues to scale.

The reason is twofold. First, desktop CPUs can have more CPU cores: most high-end laptops today have around 8-10 CPU cores, whereas high-end desktops go up to around 16-24 CPU cores. In addition, the desktop can usually handle more than 4x as much power. The power difference and the core difference together deliver 4x higher throughput in heavy testing.

Thus my conclusion is that laptops are good enough for the development phase, together with an external screen. However, when you enter the testing phase and need to run heavy functional tests and load tests on your software, a desktop or a workstation will be very useful.

In my tests on a high-end desktop I ran a Sysbench OLTP RW benchmark using the RonDB Docker environment and managed to run up to 15000 TPS. This means running 300000 SQL queries per second towards the MySQL Servers and the data nodes. The laptop could handle roughly 25% of this throughput.

Obviously the desktop could be a virtual desktop in the modern development environment. But a physical machine is still a lot more fun.

RonDB is part of the Hopsworks Feature Store platform.


Thursday, March 02, 2023

3 commands to start a RonDB cluster

RonDB is a key-value store with SQL capabilities. We are working on making it really easy to develop applications against RonDB. You can now get a RonDB cluster up and running with 3 commands on your development machine, assuming you have Docker installed there.

Here are the commands:

1. git clone https://github.com/logicalclocks/rondb-docker rondb-docker

2. cd rondb-docker

3. ./run.sh

The prerequisites are that you have git installed, and Docker or Docker Desktop. Using Docker Desktop and a new resource usage extension, one can see the memory and CPU usage of the various containers. Using it on Windows also requires WSL 2 to be installed.

If you are using Windows it is important that you have set Docker Desktop to use WSL 2 as the engine. One might also have to activate WSL 2 integration with the Linux distribution you are using in WSL 2. Both of those can be set from the Docker Desktop settings pages. One needs to start a new Linux terminal after changing those settings before it actually works.

Trying it on Windows 11 has worked like a charm for me. But trying it on Windows 10, I had issues with firewalls preventing the MySQL Server from starting. Feel free to post comments on this blog if you find issues and workarounds for those.

The run.sh command will create the Docker image by pulling it from Docker Hub. It is a download of several hundred MBytes, so the time it takes depends on the speed of your Internet connection. Next it starts a RonDB cluster with 1 management server, 2 MySQL Servers and 2 data nodes.

When it has started you can access the MySQL Servers on ports 15000 and 15001 using a normal MySQL client or the application you are developing.

To access the MySQL Servers you can run the below command using a MySQL client.

mysql --protocol=tcp --user=mysql --host=localhost --port=15000 -p

Enter the password Abc123?e and you are connected to the MySQL Server and can use it as with a normal MySQL client connected to a MySQL Cluster. The mysql user has full access to the ycsb% databases, the sbtest% databases, the sysbench% databases and the dbt% databases.

You can enter the docker containers in the normal manner using

docker exec -it docker_id /bin/bash

You find the docker_id using the docker ps command.

You can use the run.sh script to create the RonDB cluster of your choice. It has 5 predefined profiles (mini, small, medium, large, xlarge). All profiles have the same set of nodes except mini, which only creates 1 MySQL Server and 1 data node.

We have tested this using Docker Desktop on Mac OS X, Docker Desktop on Windows using WSL 2 and using Docker on Linux. So most developers should be able to try it out in their environment of choice. 

Tuesday, January 10, 2023

The flagship feature in the new LTS version RonDB 22.10.0

In RonDB 22.10.0 we added a major new feature to RonDB. This feature means that variable sized disk columns in RonDB are stored in variable sized rows instead of fixed size rows.

The history of disk data in RonDB starts already in 2004, when the NDB team at Ericsson had been acquired by MySQL AB. NDB Cluster was originally designed as an in-memory DBMS, based on the reasoning that a disk-based DBMS couldn't handle the latency requirements of telecom applications.

Thus NDB was developed using a distributed architecture with Network Durability (meaning that a transaction is made durable by writing the transaction into memory on several computers in a network). Long-term durability of data is achieved by a background process ensuring that data is written to disk.

When the NDB team joined MySQL we looked at many other application categories as well, and thus increasing the database sizes NDB could handle was seen as important. Thus we started developing support for disk-based columns. The design decisions behind this were accepted as a paper at VLDB in Trondheim in 2005.

The use of this feature didn't really take off in any significant manner for a few years, since the latency and performance of hard drives made it too different from the performance of in-memory data.

That problem has been solved by the technology development of SSDs, with the introduction of NVMe drives and newer versions of PCI Express (3, 4 and now 5). As an anecdote, I installed a set of NVMe drives on my workstation capable of handling millions of IOPS and delivering 66 GBytes per second of bandwidth. However, while installing I discovered that I had only 1 memory card, which meant that I had 3x more bandwidth to my NVMe drives than to my memory. So in order to make use of those NVMe drives I had to install a number of memory cards to get the memory bandwidth required to keep up with those NVMe drives.

So with the introduction of NVMe drives the feature became useful. Actually, one of the main users of this feature is HopsFS, a distributed file system in the Hopsworks platform which uses RonDB for metadata management. HopsFS can use disk columns in RonDB for storing small files.

The performance of disk columns is really good. This blog presents a benchmark with YCSB using disk-based columns in NDB Cluster. We get a bandwidth of more than 1 GByte per second of application data read and written.

The latency of NVMe drives is 100x lower than that of hard drives. This means that where latency on hard drives used to be far more than 100x higher than in-memory latency for database operations, with modern NVMe drives the latency difference between in-memory columns and disk columns is down to a factor of 2. We analysed performance and latency using the YCSB benchmark and compared it to in-memory columns in this blog.

One problem with the original implementation was that disk columns were always stored in fixed size rows. In HopsFS we found ways to handle this by using multiple tables for different row sizes.

In a traditional application, and in the Feature Store, it is very common to store data in variable sized columns. To ensure that the data fits, the maximum size of a column can be 10x larger than its average size; thus we can easily waste 90% of the disk space. This means that to use disk columns in Feature Store applications we had to add support for variable sized rows on disk.

Thus with the release of the new LTS RonDB version 22.10.0, disk columns are now as useful as in-memory columns. They have excellent performance, the latency is very good (even better than the in-memory latency of some competitors) and the storage efficiency is now high as well.

This means that with RonDB 22.10.0 we can handle nodes with TBytes of in-memory data and many tens of TBytes of disk columns. Thus RonDB can scale all the way up to database sizes at the petabyte level, with read and write latencies of less than a millisecond.

Summary of RonDB 21.04.9 changes

The main use case of RonDB 21.04 is being the base of the data management platform in Hopsworks. As such, every now and then new requirements on RonDB emerge. But obviously the most important focus of RonDB 21.04 development is stability.

Hopsworks provides a free serverless offering to try out the Hopsworks platform. Check it out at https://app.hopsworks.ai. Each user gets their own database in RonDB and can create a number of tables. One can then load data from various sources using OnlineFS (a service using Kafka and ClusterJ to load data from external sources into Feature Groups; a Feature Group is a table in RonDB).

Previously, ClusterJ was limited to using only one database per cluster connection, which led to a lot of unnecessary connects and disconnects to the RonDB cluster. In RonDB 21.04.9 it is now possible for one cluster connection to use any number of databases.

In addition we did a few changes to RonDB to make it easier to manage RonDB in our managed platform.

In preparation for releasing Hopsworks 3.1, which includes RonDB 21.04.9, we extended the tests for the Hopsworks platform, among other things for HopsFS, a distributed file system that uses RonDB to store metadata and small files. We fixed all issues found in these extended tests, as well as any other problems found in the last couple of months.

Monday, January 09, 2023

RonDB News

The RonDB team has been busy with development in 2022. Now is the time to start releasing things. There are 5 things that we are planning to release in Q1 2023.

RonDB 21.04.9: A new version of RonDB with a few new features required by the Hopsworks 3.1 release and a number of bug fixes. This is released today and will be described in a separate blog.

RonDB 22.10.0: This is a new Long-Term Support (LTS) version that will be maintained until at least 2025. It is also released today. It has a number of new features on top of RonDB 21.04, of which the most important is support for variable sized disk columns, which makes it much more interesting to use RonDB with large data sets. More on this feature in a separate blog post.

In addition, RonDB 22.10.0 is based on MySQL 8.0.31, whereas RonDB 21.04 is based on MySQL 8.0.23. I will post a separate blog about the content of RonDB 22.10.0.

The release content is shown in detail in the release notes and new features chapters in the RonDB docs.

Very soon we are going to release a completely revamped version of RonDB Docker using Docker Compose. This is intended to support developing applications on top of RonDB in your local development environment. It is used by RonDB developers to develop new features in RonDB, but it is also very useful for developing any type of application on top of RonDB, using any of the numerous APIs through which you can connect to RonDB.

We are also close to finishing up the first version of our completely new RonDB REST API, which will make it possible to issue REST API requests towards RonDB, as well as the same queries using gRPC calls. The first version will support primary key lookups and batched key lookups. Batched key lookups are very useful in some Feature Store applications where it is necessary to read hundreds of rows in RonDB to rank query results. Our plan is to further develop this REST API service such that it can also be used efficiently in multi-tenant setups, enabling the use of RonDB in serverless applications.

Finally, we have completed the development and test phases of RonDB reconfiguration in the Hopsworks cloud using AWS. The Hopsworks cloud is implemented using Amplify in AWS, so the Hopsworks cloud service is handled by Amplify even if the actual Hopsworks cluster is running in GCP or Azure. RonDB reconfiguration means that you can start by creating a Hopsworks cluster with 2 data node VMs with 8 VCPUs and 64 GB of memory each, and 2 MySQL Server VMs using 16 VCPUs. When you see that this cluster needs to grow, you can simply tell the Hopsworks UI that you want e.g. 3 data node VMs with 16 VCPUs and 128 GB of memory each, and 3 MySQL Server VMs with 32 VCPUs each. The Hopsworks cloud service will then reconfigure the cluster as an online operation. No downtime occurs during the reconfiguration. There might be some queries that get temporary errors, but those can simply be retried.

The Hopsworks cloud applications use virtual service names through Consul. This means that the services using the MySQL service will automatically use the new MySQL Servers as they come online, and will use the MySQL Servers in a round-robin fashion.

It is possible to scale data node VM sizes upwards; we currently don't support scaling sizes downwards. It is possible to scale the number of replicas up and down between 1 and 3. The number of MySQL Servers can be increased by one at a time and decreased, and the size of the MySQL Server VMs can go both upwards and downwards. At the moment we don't allow adding more node groups of data nodes as an online operation; this requires an offline change.

This reconfiguration feature is going to be integrated into Hopsworks cloud in the near future.