NoSQL Zone is brought to you in partnership with:

I am the founder and CEO of Data Geekery GmbH, located in Zurich, Switzerland. With our company, we have been selling database products and services around Java and SQL since 2013. Ever since my Master's studies at EPFL in 2006, I have been fascinated by the interaction of Java and SQL. Most of this experience I have obtained in the Swiss E-Banking field through various variants (JDBC, Hibernate, mostly with Oracle). I am happy to share this knowledge at various conferences, JUGs, in-house presentations and on our blog. Lukas is a DZone MVB and is not an employee of DZone and has posted 248 posts at DZone. You can read more from them at their website. View Full User Profile

MIT Prof. Stonebraker: “The Traditional RDBMS Wisdom is All Wrong”

08.28.2013
| 6482 views |
  • submit to reddit
A very interesting talk about the future of DBMS was recently given at EPFL by MIT Professor and VoltDB Co-founder and CTO Michael Stonebraker, who also gave us Ingres and Postgres. In a bit less than one hour, he explains his views with respect to the three main pillars of database management systems:

  • OLAP / Data warehouses
  • OLTP
  • Other types of data stores

As a NewSQL vendor also actively involved with H-Store, he is of course heavily yet refreshingly biased towards traditional RDBMS storage models being obsolete (an interesting fact is that Oracle Labs representative Eric Sedlar also attended the talk. One might think that the talk was a slighly FUD-dy move against a VoltDB competitor). Unlike what has come to be known as the NoSQL movement, NewSQL relies on similar relational theory / set theory as “traditional SQL”, including support for ACID and structured data.

His claims mainly include that:

  • OLAP / data warehouses will migrate to column-based data stores within 10 years. The traditional row-based data storage approach is dead, as row-based storage will never match column-based storage’s performance increase by factor 100x.
  • For OLTP, the race for the best data storage designs has not yet been decided, but there is a clear indication of classic models being “plain wrong” (according to Stonebraker), as only 4% of wall-clock time is spent on useful data processing, while the rest is occupied with buffer pools, locking, latching, recovery.
Image from Stonebraker's presentation depicting the amount of "useful" work performed by any RDBMS

Image from Stonebraker’s presentation depicting the amount of “useful” work performed by any RDBMS

I specifically recommend the OLTP part of his talk, as it shows how various new techniques could heavily increase performance of traditional RDBMS already today:

  • Most OLTP systems can afford to buy the amount of memory needed to keep data off the disk. This will remove the need for a buffer pool.
  • Single-threading would get rid of the latching overhead. H-Store and VoltDB statically divide shared memory among the cores, for instance. This is very important as latching gets worse and worse with the increasing amount of cores we have, today.
  • Dynamic locking is not really implemented in any popular RDBMS, but the market is uncertain, which workaround best implements concurrency control. In his opinion, MVCC is not going to do the trick in the long run.
  • ACIDness is something that even Jeff Dean from Google admits to miss, once it’s gone, as eventual consistency does not really keep its promise.
  • In a cluster, active-active consistency management can increase log throughput by factor 3x, compared to active-passive logging. (active-active = transaction is run on every node, active-passive = transaction is run only on the master node, the log is sent to all slave nodes)
  • And also, very importantly, anti-caching is a good technique when the in-memory format matches the disk format, as traditional RDBMS spend a substantial amount of time converting disk data formats (blocks, sectors) into memory formats (actual data).

The essence of Stonebraker’s talk is that the “elephants” who currently dominate the market are too slow to react to all the NewSQL vendors’ innovations. It is a very exciting time for a database professional (some refer to them as data geeks) to enter the market and publish new findings.

Another interesting thing to note is that SQL (call it NewSQL, OldSQL) will remain a dominant language for querying DBMS, both for column-stores as for row-stores. This is a strong statement for tools like jOOQ, which embrace SQL as a first-class citizen among programming languages.

See the complete talk by Michael Stonebraker here:

See Stonebraker's Talk here: http://slideshot.epfl.ch/play/suri_stonebraker

See Stonebraker’s Talk here: http://slideshot.epfl.ch/play/suri_stonebraker

Further reading:

Published at DZone with permission of Lukas Eder, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)