I specialise in MySQL Server performance as well as in the performance of application stacks using MySQL, especially LAMP. Web sites handling millions of visitors a day, dealing with terabytes of data and hundreds of servers, are the kind of applications I love the most. Peter is a DZone MVB and is not an employee of DZone and has posted 229 posts at DZone. You can read more from them at their website.

MongoDB Approach to Availability

05.04.2010

Another thing I find interesting about MongoDB is its approach to durability, data consistency and availability. It is very relaxed and will not work for some applications, but for others it is usable in its current form. Let me explain some concepts and compare them to technologies in the MySQL space.

First, I think MongoDB is best compared not to MySQL Server but to MySQL Cluster, especially in newer versions which implement “sharding”. Just as a commit to the NDB storage engine does not normally mean a commit to disk but rather a commit to the network, a commit in MongoDB does not mean a commit to disk either. Furthermore, MongoDB uses asynchronous replication, meaning it may take some time before data is on more than one node. You can, however, use getLastError() to ensure data has propagated to the slave. So you can see it as a hybrid between MySQL Cluster and innodb_flush_log_at_trx_commit=2 mode. The second difference, of course, is that MongoDB is not crash safe: similar to MyISAM, the database will need to be repaired if it crashes. Still, I find the behavior somewhat similar: you’re not expected to run MySQL Cluster without replication, and MongoDB is practically the same.
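To make the acknowledgement trade-off concrete, here is a toy in-memory model in Python. It is not MongoDB code and does not use any driver; the class ToyPrimary and its methods are made up purely for illustration, and the "replica" is just a second list that is filled lazily.

```python
from collections import deque


class ToyPrimary:
    """Hypothetical stand-in for a primary with one asynchronous replica."""

    def __init__(self):
        self.data = []          # writes applied on the primary (memory only)
        self.replica = []       # writes already applied on the replica
        self.pending = deque()  # writes not yet shipped to the replica

    def ship_one(self):
        """Replicate the oldest pending write, as a background task would."""
        if self.pending:
            self.replica.append(self.pending.popleft())

    def write(self, doc, wait_for_replica=False):
        """Apply a write; optionally block until the replica has it too.

        Returns how many writes the replica holds when the call returns.
        """
        self.data.append(doc)
        self.pending.append(doc)
        if wait_for_replica:
            # Similar in spirit to asking for replication confirmation
            # before treating the write as done.
            while self.pending:
                self.ship_one()
        return len(self.replica)

    def at_risk(self):
        """Writes that exist only on the primary and would be lost if it died."""
        return list(self.pending)


primary = ToyPrimary()
primary.write("a")                         # acknowledged, replica has nothing yet
print(primary.at_risk())                   # "a" lives on one node only
primary.write("b", wait_for_replica=True)  # returns once both writes are shipped
print(primary.at_risk())
```

The default path is fast but leaves a window in which a write exists on one node only; waiting for the replica closes that window at the cost of latency.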

Second, if we look at Replica Sets, we find them very similar to MySQL Cluster, though designed to work over a wide area network and therefore with asynchronous replication. Voting is required to pick the master node in case of node failure, and at least 3 servers are recommended; some of them can be voting-only servers that cast their votes and hold no data. The other difference is that there is only one master rather than multiple. This is because multi-master with asynchronous replication requires conflict resolution, which can be tricky in the general case, and MongoDB wants simplicity of operation for developers and administrators.
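The reason for recommending at least 3 voters can be shown with a small Python sketch of majority voting. This is a simplification under my own assumptions, not MongoDB's actual election protocol: the function elect_primary, the member fields, and the rule "the most up-to-date data node wins" are all hypothetical.

```python
def elect_primary(members):
    """Return the name of the elected master, or None if no election is possible.

    members: list of dicts with keys
      'name'       - node name
      'up'         - True if the node is reachable and can vote
      'holds_data' - False for a voting-only node
      'optime'     - position in the replication log (higher is newer)
    """
    total_votes = len(members)
    reachable = [m for m in members if m["up"]]
    # A strict majority of ALL configured voters is required, so that two
    # halves of a partitioned network can never both elect a master.
    if len(reachable) * 2 <= total_votes:
        return None
    candidates = [m for m in reachable if m["holds_data"]]
    if not candidates:
        return None
    # Assumption: prefer the node that has replicated the most.
    return max(candidates, key=lambda m: m["optime"])["name"]


two_nodes = [
    {"name": "a", "up": False, "holds_data": True, "optime": 10},
    {"name": "b", "up": True, "holds_data": True, "optime": 9},
]
with_voter = two_nodes + [
    {"name": "voter", "up": True, "holds_data": False, "optime": 0},
]
print(elect_primary(two_nodes))   # one vote out of two is not a majority
print(elect_primary(with_voter))  # two votes out of three is a majority
```

With two servers, the survivor cannot tell a dead peer from a network split, so it must not promote itself; a third, data-less voter is enough to break the tie.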

Third, if we look at how failover happens: as with NDB (native API), it is handled at the driver level. When you connect to a replica set you connect to a set of servers, not to one of them, and if one server fails, the driver fails over to a different master. Things are again tuned to deal with asynchronous replication. Consistency is maintained, but at the expense that certain changes may be thrown away (“rolled back”) in case of failover.
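The "rolled back" part is easiest to see with two replication logs side by side. The Python sketch below is a toy under stated assumptions: logs are plain lists of operation ids, and the function fail_over is a name I made up, not a MongoDB API.

```python
def fail_over(old_master_log, new_master_log):
    """Split the old master's log into what survives and what is rolled back.

    Both arguments are lists of operation ids in commit order. Whatever the
    old master wrote beyond the prefix it shares with the new master never
    reached the new master, so it is discarded when the old master rejoins.
    """
    common = 0
    for old_op, new_op in zip(old_master_log, new_master_log):
        if old_op != new_op:
            break
        common += 1
    kept = old_master_log[:common]
    rolled_back = old_master_log[common:]
    return kept, rolled_back


# The old master acknowledged ops 1..4, but only 1 and 2 had been replicated
# when it failed; the new master then accepted op 5 on its own.
kept, rolled_back = fail_over([1, 2, 3, 4], [1, 2, 5])
print(kept)
print(rolled_back)
```

Both nodes end up agreeing on one history, which is the consistency guarantee; the price is that operations 3 and 4 were acknowledged to the application and are then gone.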

This approach is not as clean as the best possible “no committed data loss with almost instant failover”, but it makes sense for a large number of applications. In fact, when using MySQL Replication for failover we’re operating in a similar situation, just with a lot less automation.

The good question, of course, is how robust these features are in MongoDB: many of them are new and Replica Sets are still in development. It may take time for them to stabilize and, later, for tools to be developed around them. How do you check if 2 MongoDB nodes are indeed in sync? How do you do hot backups with point-in-time recovery? These and many similar questions need to be answered and bugs worked out. One good example of the early stage of MongoDB replication is a bug mentioned during a presentation today: replication breaks if the time on the master server is changed (MongoDB uses timestamps to identify events in the replication log). As I understood it, this was fixed just last month.
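To illustrate why clock changes are dangerous for a timestamp-ordered log, here is a Python sketch of one common defence: pair the wall-clock second with a counter and never let the pair go backwards. I do not know how MongoDB actually fixed the bug; the class MonotonicStamps and the (seconds, counter) layout are assumptions for illustration only.

```python
class MonotonicStamps:
    """Hypothetical generator of strictly increasing (seconds, counter) ids."""

    def __init__(self, clock):
        self.clock = clock  # callable returning the current time in seconds
        self.last = (0, 0)

    def next(self):
        now = int(self.clock())
        secs, counter = self.last
        if now > secs:
            # The clock moved forward: start a new second.
            self.last = (now, 0)
        else:
            # The clock stalled or was set back: stay on the last second
            # and bump the counter so ordering is preserved.
            self.last = (secs, counter + 1)
        return self.last


# Simulated wall clock that is set back by 10 seconds in the middle.
readings = iter([100, 100, 90, 101])
stamps = MonotonicStamps(lambda: next(readings))
log = [stamps.next() for _ in range(4)]
print(log)
print(log == sorted(log))
```

A slave that replays events in id order would see a consistent sequence here even though the raw clock readings went 100, 100, 90, 101.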

At the same time, many things, including replication, are a lot simpler with MongoDB and there is a lot less old baggage, so I hope it will be able to stabilize and mature quickly.

References
Published at DZone with permission of Peter Zaitsev, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Michael Eric replied on Wed, 2012/09/26 - 3:59pm

MySQL sold out to Oracle, indirectly, possibly unknowingly. But it was still a sell out. Oracle will ruin it, not out of meanness or evil (though it is most definitely an evil multinational, also funding the same as Bill Gates, literally: let’s figure out how to get genetically modified mosquitos to vaccinate the unwitting masses) but out of simple association. We know Oracle by its fruits already. So MySQL is on its way out, and this is an inevitability if you ask anyone with a passion for and understanding of technology.

