NoSQL Zone is brought to you in partnership with:

Kristina Chodorow is a core contributor to MongoDB. She has written several O'Reilly books (MongoDB: The Definitive Guide, Scaling MongoDB, and 50 Tips and Tricks for MongoDB Developers) and has given talks at conferences around the world, including OSCON, FOSDEM, Latinoware, TEK·X, and YAPC. Her Twitter handle is @kchodorow. Kristina is a DZone MVB and is not an employee of DZone and has posted 50 posts at DZone. You can read more from them at their website. View Full User Profile

So Many Ways to Start Your Mongo

04.11.2012
| 11085 views |
  • submit to reddit

Starting up a vanilla MongoDB instance is super easy, it just needs a port it can listen on and a directory where it can save your info. By default, Mongo listens on port 27017, which should work fine (it’s not a very commonly used port). We’ll create a new directory for database files:

$ mkdir -p ~/dbs/mydb # -p creates parent directories if they don't exist

And then start up our database:

$ cd 
$ bin/mongod --dbpath ~/dbs/mydb

…and you’ll see a bunch of output:

$ bin/mongod --dbpath ~/dbs/mydb
Fri Apr 23 11:59:07 Mongo DB : starting : pid = 9831 port = 27017 dbpath = /data/db/ master = 0 slave = 0  32-bit 
 
** NOTE: when using MongoDB 32 bit, you are limited to about 2 gigabytes of data
**       see http://blog.mongodb.org/post/137788967/32-bit-limitations for more
 
Fri Apr 23 11:59:07 db version v1.5.1-pre-, pdfile version 4.5
Fri Apr 23 11:59:07 git version: f86d93fd949777d5fbe00bf9784ec0947d6e75b9
Fri Apr 23 11:59:07 sys info: Linux ubuntu 2.6.31-15-generic #50-Ubuntu SMP Tue Nov 10 14:54:29 UTC 2009 i686 BOOST_LIB_VERSION=1_38
Fri Apr 23 11:59:07 waiting for connections on port 27017
Fri Apr 23 11:59:07 web admin interface listening on port 28017

Now, Mongo will “freeze” like this, which confuses some people. Don’t worry, it’s just waiting for requests. You’re all set to go.

Set up a slave, Maeve

As we’re running master and slave on the same machine, they’ll need separate ports. We’ll use port 10000 for the master and 20000 for the slave. We also need separate directories for data, so we’ll create those:

$ mkdir ~/dbs/master ~/dbs/slave

Now we start the master database:

$ bin/mongod --master --port 10000 --dbpath ~/dbs/master

And then the slave, in a different terminal:

$ bin/mongod --slave --port 20000 --dbpath ~/dbs/slave --source localhost:10000

The “source” option specifies where the master is that the slave should replicate data from.

Now, if we want to add another slave, we need to go though the herculean effort of choosing a port and creating a new directory:

$ mkdir ~/dbs/slave2
$ bin/mongod --slave --port 20001 --dbpath ~/dbs/slave2 --source localhost:10000

Tada! Two slaves, one master. For more information on master-slave, see the core docs on it and my previous post.

This example puts the master server and slave server on the same machine, but people generally have a master on one machine and a slave on another. It works fine to put them on a single machine, it just defeats the point of a bit.

Get auto-failover… Rover

Okay, so there aren’t many people named Rover, but you come up with a rhyme for “auto-failover” (I tried “replica”, too).

Replica pairs are cool because it’s like master-slave, but you get automatic failover: if the master becomes unavailable, the slave will become a master. So, it’s basically the same as master-slave, but the servers know about each other and there is, optionally, an arbiter server that doesn’t do anything other than resolve “disputes” over who is master.

When could the arbiter come it in handy? Suppose the master’s network cable is pulled. The server still thinks it’s master, but no one else knows it’s there. The slave becomes master and the rest of the world goes along happily. When the master’s network cable gets plugged back in, now both servers think they’re master! In this case, the arbiter steps in and gently informs the master who’s behind in the times that he is now a slave.

You don’t have to set up an arbiter, but we will since it’s good practice:

$ mkdir ~/dbs/arbiter ~/dbs/replica1 ~/dbs/replica2
$ bin/mongod --port 50000 --dbpath ~/dbs/arbiter

Now, in separate terminals, you start each of the replicas:

$ bin/mongod --port 60000 --dbpath ~/dbs/replica1 --pairwith localhost:60001 --arbiter localhost:50000

And then the other one:

$ bin/mongod --port 60001 --dbpath ~/dbs/replica2 --pairwith localhost:60000 --arbiter localhost:50000

After they’ve been running for a bit, try killing (Ctrl-C) one, then restarting it, then killing the other one, back and forth.

For more information on replica pairs, see the core docs.

What’s this? Replica pairs are evolving! *voop* *voop* *voop*

Replica pairs have evolved into… replica sets! Well, okay, they haven’t yet, but they’re coming soon. Then you’ll be able to have an arbitrary number of servers in the auto-failover ring.

Make a new cluster, Buster

For the grand finale, sharding. Sharding is how you distribute data with Mongo. If you don’t know what sharding is, check out my previous post explaining how it works.

First of all, download the latest 1.5.x nightly build from the website. Sharding is changing rapidly, you want the latest and greatest here, not stable.

We’re going to be creating a three-node cluster. So, same as ever, create your database directories. We want one directory for the cluster configuration and three directories for our shards (nodes):

$ mkdir ~/dbs/config ~/dbs/shard1 ~/dbs/shard2 ~/dbs/shard3

The config server keeps track of what’s where, so we need to start that up first:

$ bin/mongod --configsvr --port 70000 --dbpath ~/dbs/config

The mongos is just a request router that runs on top of the config server. It doesn’t even need a data directory, we just tell it where to look for the configuration:

$ bin/mongos --configdb localhost:70000

Note the “s”: the router is called “mongos”, not “mongod”. We haven’t specified a port for it, so it’ll listen on the default port (27017).

Okay! Now, we need to set up our shards. Start these each up in separate terminals:

$ bin/mongod --shardsvr --port 71000 --dbpath ~/dbs/shard1
$ bin/mongod --shardsvr --port 71001 --dbpath ~/dbs/shard2
$ bin/mongod --shardsvr --port 71002 --dbpath ~/dbs/shard3

mongos doesn’t actually know about the shards yet, you need to tell it to add these servers to the cluster. The easiest way is to fire up a mongo shell:

$ bin/mongo
MongoDB shell version: 1.5.1-pre-
url: test
connecting to: test
type "help" for help
>

Now, we add each shard to the cluster:

> db = connect("localhost:70000/admin");
connecting to: localhost:70000
admin
> db.runCommand({addshard : "localhost:71000", allowLocal : true})
{
    "added" : "localhost:71000",
    "ok" : 1
}
> db.runCommand({addshard : "localhost:71001", allowLocal : true})
{
    "added" : "localhost:71001",
    "ok" : 1
}
> db.runCommand({addshard : "localhost:71002", allowLocal : true})
{
    "added" : "localhost:71002",
    "ok" : 1
}
>

mongos expects shards to be on remote machines and by default won’t allow you to add local shards (i.e., shards with “localhost” in the name). Since we’re just playing around, we specify “allowLocal” to override this behavior. (Note that “addshard” IS NOT camel-case, and allowLocal IS camel-case, because we’re consistent like that.)

Congratulations, you’re running a distributed database!

What do you do now? Well, use it just like a normal database! Connect to “localhost:27017″ and proceed normally (or, as normally as possible… please report any bugs to our bugtracker!). Try the tutorial (since you’ve already got the shell open) or connect through your favorite driver and play around.

Connecting to mongos should be an identical experience to connecting to a normal Mongo server. Behind the scenes, it splits up your requests/data across the shards so you can concentrate on making your application, not scaling it.

P.S. Obviously, this example setup is full of single points of failure, but that’s completely avoidable. I can go over how to set up distributed MongoDB with zero single points of failure in a follow-up post, if people are interested.

 

END

Published at DZone with permission of Kristina Chodorow, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Brad Knowles replied on Fri, 2012/04/20 - 12:23pm

The current version of MongoDB is 2.0.4 (see http://www.mongodb.org/downloads), and it is not yet April 23, as mentioned in some of the quoted lines above.  Given the reference to wanting the "latest version of 1.5.x", I would assume that this article actually dates back to some time in 2010, not 2012.

A more up-to-date article would be appreciated.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.