Thursday, July 28, 2022

New stable release of RonDB, RonDB 21.04.8

Today we released a new version of RonDB 21.04, the stable release series of RonDB. RonDB 21.04.8 fixes a few critical bugs and adds two new features. See the docs for more details on this release.

Make it possible to use IPv4 sockets between ndbmtd and API nodes

In MySQL NDB Cluster all sockets have been converted to use the IPv6 format, even when IPv4 sockets are used. As a result, MySQL NDB Cluster can no longer interact with device drivers that only work with IPv4 sockets. This is the case for Dolphin SuperSockets.

Dolphin SuperSockets makes it possible to connect the nodes in a cluster using extremely low-latency hardware, which improves latency significantly. With this feature, RonDB 21.04.8 can use interconnect cards from Dolphin through Dolphin SuperSockets. RonDB has been tested and benchmarked using Dolphin SuperSockets, and we will soon release a benchmark report on this.

Two new ndbinfo tables to check memory usage

RonDB is now used by app.hopsworks.ai, a Serverless Feature Store. This means that thousands of users can share RonDB. To ensure that this multi-tenant usage of RonDB works, we have introduced two new ndbinfo tables that make it possible to track exactly how much memory a specific user is using. A user in Hopsworks is mapped to a project, and a project uses its own database in RonDB. Thus these two new tables make it possible to implement quotas both at the user level and at the Feature Group level.

The two new ndbinfo tables are ndb$table_map and ndb$table_memory_usage. The ndb$table_memory_usage table lists four properties for every table fragment replica: in_memory_bytes (the number of bytes used by the table fragment replica in DataMemory), free_in_memory_bytes (the number of those bytes that are free; these free bytes are always in the variable-sized part), disk_memory_bytes (the number of bytes in the disk columns, essentially the number of extents allocated to the table fragment replica times the size of an extent in the tablespace), and free_disk_memory_bytes (the number of bytes free in the disk memory for disk columns).

Since each table fragment replica provides one row, we use a GROUP BY on table id and fragment id together with the MAX of those four columns to ensure we get only one row per table fragment.
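The collapse from one row per replica to one row per fragment can be sketched as follows. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for ndbinfo, the table name replica_usage and the columns node_id, table_id and fragment_id are hypothetical, and the byte counts are made up. Only the four property names come from the description above.

```python
import sqlite3

# In-memory SQLite stands in for ndbinfo. The table name "replica_usage" and the
# columns node_id, table_id and fragment_id are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE replica_usage (
           node_id INTEGER, table_id INTEGER, fragment_id INTEGER,
           in_memory_bytes INTEGER, free_in_memory_bytes INTEGER,
           disk_memory_bytes INTEGER, free_disk_memory_bytes INTEGER)"""
)

# Made-up data: table 10 has two fragments, each with a replica on nodes 1 and 2.
rows = [
    (1, 10, 0, 32768, 1024, 0, 0),
    (2, 10, 0, 32768, 2048, 0, 0),
    (1, 10, 1, 65536, 512, 0, 0),
    (2, 10, 1, 65536, 512, 0, 0),
]
conn.executemany("INSERT INTO replica_usage VALUES (?,?,?,?,?,?,?)", rows)

# GROUP BY table id and fragment id, taking MAX of each property, so that the
# replicas of a fragment collapse into a single row.
per_fragment = conn.execute(
    """SELECT table_id, fragment_id,
              MAX(in_memory_bytes), MAX(free_in_memory_bytes),
              MAX(disk_memory_bytes), MAX(free_disk_memory_bytes)
       FROM replica_usage
       GROUP BY table_id, fragment_id
       ORDER BY table_id, fragment_id"""
).fetchall()

for row in per_fragment:
    print(row)
```

Four replica rows go in and two fragment rows come out, so memory is not counted once per replica when the fragments are later summed.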

We want to report in-memory and disk memory usage per table or per database. However, a table as the user sees it is spread across several internal tables in RonDB. There are four places where a table can use memory. First, the table itself uses memory for rows and for a hash index; when disk columns are used, this table also makes use of disk memory. Second, there are ordered indexes that use memory for the index information. Third, there are unique indexes that use memory for the rows in the unique index (a unique index is simply a table with the unique key as its primary key and the primary key as its columns) and for the hash index of the unique index table. This table is not necessarily colocated with the table itself. Finally, there are also BLOB tables that can contain a hash index, row storage and even disk memory usage.

The user isn't particularly interested in this level of detail, so we want to display memory usage for the tables and databases that the user sees. The tool for gathering this data is the new ndbinfo table ndb$table_map. Given a table id, this table lists the table name and database name. The table id can be that of a table, an ordered index, a unique index or a BLOB table, but the name presented is always that of the actual table defined by the user, not the name of the index table or BLOB table.

Using those two tables we create two ndbinfo views. The first, table_memory_usage, lists the database name, the table name and the four properties above for each table in the cluster. The second, database_memory_usage, lists the database name and the four properties summed over all table fragments in all tables that RonDB created for the user, including those for BLOBs and indexes.
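How the mapping table and the per-fragment rows combine into per-table and per-database totals can be sketched like this. It is a rough illustration, not the actual view definitions: SQLite stands in for ndbinfo, the table names replica_usage and table_map, the columns node_id, table_id, fragment_id, database_name and table_name, the helper memory_usage and all data values are assumptions.

```python
import sqlite3

# SQLite stand-in for ndbinfo; all table and column names other than the four
# memory properties are assumptions made for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE replica_usage (
        node_id INTEGER, table_id INTEGER, fragment_id INTEGER,
        in_memory_bytes INTEGER, free_in_memory_bytes INTEGER,
        disk_memory_bytes INTEGER, free_disk_memory_bytes INTEGER);
    CREATE TABLE table_map (
        table_id INTEGER, database_name TEXT, table_name TEXT);
    """
)

# Ids 11 (a unique index) and 12 (a BLOB table) map back to user table t1.
conn.executemany(
    "INSERT INTO table_map VALUES (?,?,?)",
    [(10, "proj1", "t1"), (11, "proj1", "t1"), (12, "proj1", "t1"),
     (20, "proj1", "t2"), (30, "proj2", "t3")],
)

# (table_id, in_memory, free_in_memory, disk_memory, free_disk_memory), made up.
usage = [
    (10, 1000, 100, 0, 0),
    (11, 200, 20, 0, 0),
    (12, 300, 30, 4096, 1024),
    (20, 500, 50, 0, 0),
    (30, 700, 70, 0, 0),
]
# Every internal table gets one fragment (id 0) with replicas on nodes 1 and 2.
conn.executemany(
    "INSERT INTO replica_usage VALUES (?,?,0,?,?,?,?)",
    [(node, tid, a, b, c, d) for (tid, a, b, c, d) in usage for node in (1, 2)],
)


def memory_usage(group_cols):
    """Sum the four properties over fragments, grouped by the given name columns."""
    sql = f"""
        SELECT {group_cols},
               SUM(f.in_mem), SUM(f.free_in_mem), SUM(f.disk), SUM(f.free_disk)
        FROM (SELECT table_id, fragment_id,
                     MAX(in_memory_bytes) AS in_mem,
                     MAX(free_in_memory_bytes) AS free_in_mem,
                     MAX(disk_memory_bytes) AS disk,
                     MAX(free_disk_memory_bytes) AS free_disk
              FROM replica_usage
              GROUP BY table_id, fragment_id) AS f
        JOIN table_map AS m ON m.table_id = f.table_id
        GROUP BY {group_cols}
        ORDER BY {group_cols}"""
    return conn.execute(sql).fetchall()


per_table = memory_usage("m.database_name, m.table_name")
per_database = memory_usage("m.database_name")

for row in per_table:
    print("table:", row)
for row in per_database:
    print("database:", row)
```

In this sketch the rows for the unique index and the BLOB table are folded into t1, so the user sees a single line per table. A quota check could then compare the per-database totals against a configured limit.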

To make things a bit more efficient, we keep track of all ordered indexes attached to a table internally in RonDB. Thus ndb$table_memory_usage lists the memory usage of a table together with the ordered indexes on that table; there are no separate rows presenting the memory usage of an ordered index.

These two tables make it easy for users to see how much memory they are using in a certain table or database. This is useful when managing a RonDB cluster.
