Thursday, August 21, 2008

Some food for thoughts: How to make use of new SSD devices

The hardware guys are presenting new storage devices called
SSD's based on flash memory. At the moment I think they are
about 3-4 times cheaper than DRAM memory and the gap seems
to be increasing. They're still far from the price of hard
drives but also here the gap seems to be closing.

So as I'm now an employee of Sun that actually puts together
systems with this type of HW in it I get questioned what I
as a DBMS developer can do with those devices.

First some comments on performance. These new devices will be
able to perform reads and writes of a few kilobytes large pages
in about 25-100 microseconds compared to hard drives which
takes about 3-10 milliseconds for the same thing.

An obvious use is obviously to use them to speed up database
logging, particularly in commit situations. However this
doesn't really require any significant changes to the SW
already out there. So I won't spend any more time on this use.

Another use is for MySQL Cluster. MySQL Cluster stores most data
in memory and can store non-indexed data on disk. So how can
SSD devices be used to improve this.

First some facts about performance of MySQL Cluster. In the data
node where the data actually resides it takes about 10
microseconds of processing time to perform a key lookup and a
scan has about 20 microseconds of start-up costs whereafter each
record takes 1-2 microseconds to fetch.

So now for the idea. Let's assume we'll use an SSD device as swap
memory. We would then purposely set the swap to be e.g. 10x
larger than the memory. For this to work we need to be able to
allocate memory from different swap pools, memory used for
transaction state and things like this we don't want swapped out
(working for Sun has an advantage since we can work with the OS
guys directly, but naturally I hope Linux developers also take the
same opportunity).

So during a key lookup we need to get one page from the hash index
and one page with the record in it. Guestimating a 90% hit rate in
the hash index and 80% hit rate on the data page we find that we
will about 0.3 swap misses per key lookup. If we assume 50
microseconds for this it means that mean key lookup will increase
from 10 microseconds to 25 microseconds. This should be
acceptable, given that we can increase data size by a factor of
about 10.

A similar analysis can be made for scans as well, but I'm lazy so
will leave it to you to perform :)

So given todays sizes of memories and SSD's it should be possible
to use systems with 64 GBytes of memory and 640 GB of SSD memory
and clustering 8 of those with replication gives us a main memory
based system for a reasonable price providing 2.5 TByte of user
data in a highly available system with high degrees of parallelism
in the system.


Anonymous said...

It sounds like Windows Vista's ReadyBoost feature.

Anonymous said...

You nailed it.
Essentially you are faking extra RAM using SSD.
Interesting use of swap.

It is not only MySQL databases which will benefit, the speed of memory vs disk will move disk to being not much more than a fast tape.

Anonymous said...
This comment has been removed by a blog administrator.
Unknown said...


This is Chris that used to maintain Challenger.

I'm at FusionIO now, and we've got a very fast SSDs (100K IOPs) that I'd like to discuss how to optimize for MySQL.

My new email address is (i you recall my last name). Drop me a note if you'd like to discuss this.


Term Paper said...

These new devices will be
able to perform reads and writes of a few kilobytes large pages
in about 25-100 microseconds compared to hard drives which
takes about 3-10 milliseconds for the same thing.

Web Design said...

Agree with one of the above comment sounds like Windows Vista's ReadyBoost feature.

Beth said...

Interesting concept, in theory it seems to work well no? It is very similar to Vista's ReadyBoost but with fundamental differences, when talking in milliseconds and microseconds but multiply your processes by 10 fold and you are saving a lot of time!

Ben said...

We are having a lot of run with some Fusion IO drives and MySQL 5.1 in InnoDB only configuration. The speed difference between platter and solid is remarkable.

However, would be interested if you have any ideas of optimising InnoDB? Especially in lowering the disk IO, since this database is using more than 1TB of disk ware every day...