MongoDB: Replication Lag, Network Partition
Replication seems like the easiest thing when everything is working;
the real challenge comes when there are failures. MongoDB, a database
that has been around for some years, illustrates this very well.
Published at DZone with permission of Rodrigo De Castro, author and DZone MVB.
If you look at MongoDB's replication page, there is a section called "Rollback". It applies when replication lag (your secondary is behind the primary) coincides with a network partition. That may sound rare, but it is not that uncommon a scenario.
What happens then? The old primary is ahead of the secondary, so if it wants to rejoin the cluster, it needs to somehow "resync" to the current state. This means that operations it applied but never replicated (due to the lag) have to be rolled back. First, re-applying those rolled-back operations is a manual process in MongoDB.
But what happens if your secondary was behind by a large amount of data, more precisely more than 300 MB? MongoDB cannot perform the rollback, and manual intervention is required to recover that node.
Read for yourself:
Warning: A mongod instance will not rollback more than 300 megabytes of data. If your system needs to rollback more than 300 MB, you will need to manually intervene to recover this data.
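To make the scenario concrete, here is a toy model in Python of a node rejoining after a partition. It is only a sketch, not MongoDB's actual rollback algorithm: the `Op` record, the `plan_rollback` function, and the reading of "300 megabytes" as 300 * 1024 * 1024 bytes are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumption: the documented 300 megabyte limit, read as binary megabytes.
ROLLBACK_LIMIT_BYTES = 300 * 1024 * 1024


@dataclass(frozen=True)
class Op:
    """One logged write (hypothetical record, not MongoDB's oplog format)."""
    term: int  # which primary accepted the write
    seq: int   # position in that node's log
    size: int  # payload size in bytes


def plan_rollback(old_primary_log, new_primary_log):
    """Return (ops_to_undo, bytes_to_undo, fits_limit) for a rejoining node."""
    # Everything the surviving side of the partition knows about.
    replicated = set(new_primary_log)
    # Writes the old primary accepted but never shipped because of the lag.
    to_undo = [op for op in old_primary_log if op not in replicated]
    total = sum(op.size for op in to_undo)
    return to_undo, total, total <= ROLLBACK_LIMIT_BYTES


mb = 1024 * 1024
# The old primary logged four writes; only the first two were replicated.
old_primary = [Op(1, 1, 10 * mb), Op(1, 2, 10 * mb),
               Op(1, 3, 200 * mb), Op(1, 4, 150 * mb)]
# The new primary kept the replicated writes and accepted one of its own.
new_primary = [Op(1, 1, 10 * mb), Op(1, 2, 10 * mb), Op(2, 3, 5 * mb)]

undo, size, fits_limit = plan_rollback(old_primary, new_primary)
print([op.seq for op in undo], size // mb, fits_limit)  # [3, 4] 350 False
```

In this example the two unreplicated writes add up to 350 MB, which exceeds the limit, so the model reports that the node would need manual recovery.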
Although MongoDB supports asynchronous
replication (and with it eventual consistency), it does not solve these
resynchronization problems automatically. Maybe the case was simply deemed
uncommon and given low priority compared to other features, but can you
imagine the pain if you had to go through this process?
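Since the problem starts with replication lag, one way to see how exposed you are is to measure that lag. The sketch below assumes a status document shaped like the reply of MongoDB's `replSetGetStatus` command, whose `members` entries carry `name`, `stateStr`, and `optimeDate` fields. The function name `replication_lag` and the sample hostnames are made up for illustration, and the sample data is hand-written rather than taken from a real cluster.

```python
from datetime import datetime


def replication_lag(status):
    """Map each SECONDARY's name to its lag behind the PRIMARY (a timedelta)."""
    members = status["members"]
    # Raises StopIteration if no member currently reports itself as PRIMARY.
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: primary["optimeDate"] - m["optimeDate"]
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Hand-written sample in the shape of a replSetGetStatus reply.
sample_status = {
    "members": [
        {"name": "db1:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2024, 1, 1, 12, 0, 30)},
        {"name": "db2:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2024, 1, 1, 12, 0, 0)},
    ]
}

lag = replication_lag(sample_status)
print(lag["db2:27017"].total_seconds())  # 30.0
```

Against a live replica set, the same function could presumably be fed the dictionary returned by a driver call such as pymongo's `client.admin.command("replSetGetStatus")`. The larger and more persistent the lag, the more data is at risk of rollback if a partition hits.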
Source: http://docs.mongodb.org/manual/core/replication/