
Dr. Michael Stonebraker, Adjunct Professor of Computer Science at MIT, is co-founder and CTO of VoltDB, Inc. Previously, Dr. Stonebraker was the main architect behind the Ingres and PostgreSQL databases and was a co-founder of Vertica Systems and StreamBase Systems. He is a renowned researcher and prolific publisher on database systems topics.

To Flash or Not to Flash: That is the Question

06.24.2012

I am often asked about the value of flash memory in OLTP database applications.  This blog post discusses flash technology in this context.  First, I discuss the future of flash in general; then I turn to flash (and other future storage technologies) in the context of a main memory DBMS, such as VoltDB.

The Future of Flash

Flash memory is clearly a “moving window,” since its price and performance are changing quickly.  Historically, flash cells could be written only a few thousand times before they would “wear out” and have to be replaced.  This drawback seems to have been eliminated in higher-end flash devices.

Also, early flash technologies came only with a “block” footprint; i.e., a device was written just like a disk and was often called a solid-state disk (SSD).  Now, flash also comes with a main memory interface.  With this latter interface, it looks like cheaper, slower, persistent main memory, and I will call this footprint persistent memory (PM).  In either case, there is software in the device that optimizes physical device writes by spreading them evenly across cells, a technique called “wear leveling.”

There is speculation among my hardware-knowledgeable friends about the longevity of current flash technology.  There are multiple potential future technologies (e.g., phase-change memory) that may replace flash in the medium term (5–10 years).  The expectation is that this future technology will be cheaper and faster than flash, with better wear characteristics.  So the technology “behind the curtain” may well change during the current decade.

As such, over the next few years, speed will increase, cost will decrease, wear characteristics will improve, and the fundamental technology may very well change.  Hence, I now turn to what can be done with this sort of technology.  Specifically, I discuss VoltDB in the context of PMs and SSDs.

VoltDB's Use of PM

First, let me state the obvious.  Flash with a main memory footprint (PM) can be used by VoltDB as a “drop-in”.  PM will allow VoltDB databases to be bigger and cheaper (but slower) than is possible with DRAM.  In fact, it will be possible to mix-and-match DRAM and PM.  In this case, VoltDB should move “colder” data onto PM, retaining DRAM for the hotter data.  As PM becomes popular among VoltDB customers, this will be a desirable optimization.
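The hot/cold split described above can be sketched as a simple two-tier store. This is an illustrative toy, not VoltDB's implementation: it keeps recently accessed records in a fast tier (standing in for DRAM) and demotes the least recently used records to a slower, cheaper tier (standing in for PM); the class and method names are hypothetical.

```python
# Toy two-tier store: hot records live in a DRAM-like tier, cold records
# are demoted to a PM-like tier.  Recency of access decides hotness.
from collections import OrderedDict

class TieredStore:
    def __init__(self, hot_capacity):
        self.hot = OrderedDict()   # fast tier (DRAM), kept in LRU order
        self.cold = {}             # slow, cheap tier (PM)
        self.hot_capacity = hot_capacity

    def put(self, key, value):
        self.cold.pop(key, None)   # a freshly written record is hot
        self.hot[key] = value
        self.hot.move_to_end(key)
        self._demote()

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)        # refresh recency
            return self.hot[key]
        value = self.cold.pop(key)           # promote on access
        self.hot[key] = value
        self._demote()
        return value

    def _demote(self):
        # Evict least recently used records from DRAM into PM.
        while len(self.hot) > self.hot_capacity:
            k, v = self.hot.popitem(last=False)
            self.cold[k] = v
```

A real engine would track access frequency over a window rather than pure recency, but the promote/demote mechanics are the same idea.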

Depending on the price point and speed characteristics of future devices, this hybrid scheme may well be very appealing.  For example, suppose PM is 20X slower than DRAM and 20X cheaper.  If a customer has a skewed access pattern where 90% of the activity goes to 10% of the data, then the above hybrid will be quite attractive.
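The numbers in that example work out as a quick back-of-envelope calculation, using normalized units (DRAM access time and cost per byte both set to 1):

```python
# Hybrid DRAM/PM sketch using the figures from the text:
# PM is 20x slower and 20x cheaper than DRAM; 90% of accesses
# hit the 10% of data we keep in DRAM.
dram_latency = 1.0            # normalized DRAM access time
pm_latency = 20.0             # PM is 20x slower

hot_access_share = 0.9        # fraction of accesses hitting the DRAM tier
avg_latency = (hot_access_share * dram_latency
               + (1 - hot_access_share) * pm_latency)
# 0.9 * 1 + 0.1 * 20 = 2.9, roughly 7x faster than an all-PM store

hot_data_share = 0.1          # fraction of data held in DRAM
dram_cost, pm_cost = 1.0, 1.0 / 20.0   # normalized cost per byte
avg_cost = hot_data_share * dram_cost + (1 - hot_data_share) * pm_cost
# 0.1 * 1 + 0.9 * 0.05 = 0.145, about 15% of the all-DRAM cost
```

So, under these assumed figures, the hybrid runs at roughly 3X the latency of pure DRAM at about one-seventh of the cost, which is exactly why a skewed access pattern makes the scheme attractive.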

VoltDB's Use of SSD

You might immediately ask “Why not store VoltDB data on SSD, as opposed to PM?”  This strategy is not likely to work out well.  We would be forced to gather up main memory records into an SSD block and then write it.  SSD is not main-memory addressable; so VoltDB would be required to manage two kinds of storage, with manual movement of data between them. Effectively, VoltDB would cease being a main memory DBMS, and would turn into something else.  Moreover, blocking of records is expensive and the number of writes that can be performed, while much better than disk, is still not earthshaking.  For these reasons, think of flash PM, not flash SSD, to augment data storage in VoltDB.

So what is SSD useful for?  The answer is it can help with recovery from failures, and I now turn to this topic.  At the present time, VoltDB supports replication of tables to achieve high availability in a single cluster, and we’ve recently extended this capability to multiple clusters over a wide area network (WAN).  Hence, if a single node fails, then VoltDB seamlessly fails over to use replicas of objects on the dead node and continues operation.  When the dead node is restored, VoltDB seamlessly brings the original node up to date and restores the original configuration.  No human involvement is required during this process.  In fact, VoltDB supports K-safety, so K – 1 failures can be masked in this fashion.

Suppose a VoltDB user is running his database on a single cluster.  What happens if the power fails?  Obviously, main memories on all nodes in the cluster will lose their data unless the user has purchased some sort of backup power supply.  Therefore, the replication noted above does not mask a power failure.

To deal with power failures, each VoltDB node periodically writes a checkpoint to local disk.  This checkpoint records the exact state of the local database as of the completion of all transactions up to a specific point in time.  Hence, on disk, there is a transaction-consistent (but somewhat stale) copy of local data.  In addition, VoltDB carefully records on disk a command log of the transactions that have been run.  This log is of the form {(transaction-id, parameters, timestamp)}.  To recover from a power failure, VoltDB restores the most recent checkpoint and then replays the command log to bring the database up to date.  Of course, this produces a transaction-consistent database with the effects of all committed transactions; however, this will only occur once the power is restored.
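The checkpoint-plus-command-log recovery described above can be sketched in a few lines. This is an illustration of the technique, not VoltDB internals; the `recover` and `apply_txn` names are hypothetical, and it assumes (as command logging requires) that re-executing a logged transaction is deterministic.

```python
# Recovery sketch: restore the latest transaction-consistent checkpoint,
# then replay logged transactions newer than the checkpoint, in order.
def recover(checkpoint, command_log, apply_txn):
    """checkpoint: (last_txn_id, state dict);
    command_log: list of (txn_id, parameters, timestamp) in commit order;
    apply_txn: deterministic function that re-executes one transaction."""
    last_txn_id, state = checkpoint
    state = dict(state)                # work on a copy of the snapshot
    for txn_id, params, _ts in command_log:
        if txn_id > last_txn_id:       # skip txns already in the checkpoint
            apply_txn(state, params)   # re-execute the transaction
    return state
```

For example, with a deterministic "add delta to key" transaction, replaying the log entries after the checkpoint reproduces exactly the committed state at the moment of the power failure.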

Flash with a block footprint (SSD) can be used instead of disk to store checkpoints and the command log, noted above.  The return on investment (ROI) for moving in this direction will depend on the cost and performance of future flash, relative to rotating disk.  However, it should be noted that the overhead of current checkpointing and command logging is not dramatic, so the ROI may not be dramatic.

Summary

Flash technologies are evolving to serve a variety of high performance applications.  Although VoltDB was designed to use DRAM as its primary storage structure, its architecture also anticipates continued innovation in adjacent memory alternatives.  Many VoltDB users deploy SSD solutions to store checkpoint and command log data, improving the performance of logging and recovery processes relative to what can be achieved with spinning disks.  VoltDB is also very well positioned to capitalize on PM solutions when they become widely available and price-performance competitive.

Published at DZone with permission of Mike Stonebraker, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)