Passionate about technology and startups. Have worked for the BBC in London, Livedoor.com in Japan, Cloudera in San Francisco and MailChannels in Vancouver, Canada. Currently reside in Vancouver where I'm working on building a PaaS based on Cloud Foundry, called Stackato. I enjoy writing about technology, especially when it relates to interesting startups. Phil is a DZone MVB and is not an employee of DZone and has posted 21 posts at DZone.

How to Quickly Launch a Cassandra Cluster on Amazon EC2

12.20.2012

Curator's Note: This post is from 2011, so comment if you see any details that need updating!

If you have read my previous post, “Map-Reduce With Ruby Using Hadoop”, then you will know that firing up a Hadoop cluster is really simple when you use Whirr. Without even SSHing into the machines in the cloud, you can start up your cluster and interact with it. In this post I’ll show you that it is just as easy to fire up a Cassandra cluster on Amazon EC2.

Install Whirr

I will fly through the setup of Whirr quite quickly. All the commands you need are here, but if you want a more thorough explanation then see my other post, “Map-Reduce With Ruby Using Hadoop”.

I am assuming that you have Homebrew installed.

brew update
brew install maven
mkdir -p ~/src/cloudera
cd ~/src/cloudera
wget http://archive.cloudera.com/cdh/3/whirr-0.1.0+23.tar.gz
tar -xvzf whirr-0.1.0+23.tar.gz
cd whirr-0.1.0+23
mvn clean install
mvn package -Ppackage

Be patient with the above. There is a lot to build, and Maven downloads many dependencies the first time you use it, so it will take some time.
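
Before kicking off the build, it can save time to check that the tools the commands above rely on are actually on your PATH. Here is a minimal sketch. The missing_tools function name is my own, and java is included on the assumption that Maven needs a JDK to run.

```shell
#!/bin/sh
# Print the name of every given command that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools used by the setup commands in this post (java is an assumption).
missing=$(missing_tools java mvn wget tar)
if [ -n "$missing" ]; then
    # Left unquoted on purpose so the names print on one line.
    echo "Missing tools:" $missing
else
    echo "All build prerequisites found."
fi
```

If wget shows up as missing, installing it with Homebrew or swapping in curl should work just as well.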

The good news is that from here on you are set up to easily fire up your Amazon EC2 cluster for Cassandra, or Hadoop if you choose.

Whirr Configuration File

We will need to make a configuration file for Whirr to tell it that we want to launch a Cassandra cluster with 3 nodes. If you are brave, patient and have the cash, then you could just as easily fire up a 100-node cluster (leave a comment if you do; there may be prizes!).

You will need to create a cassandra.properties file with the following contents…

whirr.service-name=cassandra
whirr.cluster-name=mycassandracluster
whirr.instance-templates=3 cassandra
whirr.provider=ec2
whirr.identity=<YOUR_AMAZON_EC2_ACCESS_KEY_ID_GOES_HERE>
whirr.credential=<YOUR_AMAZON_EC2_SECRET_ACCESS_KEY_GOES_HERE>
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa

Replace the two placeholders with your Amazon EC2 Access Key ID and Amazon EC2 Secret Access Key.
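
If you would rather not paste your keys into a file by hand, the same file can be generated from environment variables. This is only a sketch: the write_whirr_config function and the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY variable names are my assumptions, not something Whirr itself requires.

```shell
#!/bin/sh
# Write a Whirr properties file for a Cassandra cluster.
# Usage: write_whirr_config <output-file> <node-count>
# The AWS_* variable names are assumptions made for this sketch.
write_whirr_config() {
    out_file=$1
    node_count=$2
    if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
        echo "Set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY first" >&2
        return 1
    fi
    # The umask makes a newly created file readable by its owner only,
    # since it will contain the secret key.
    (
        umask 077
        cat > "$out_file" <<EOF
whirr.service-name=cassandra
whirr.cluster-name=mycassandracluster
whirr.instance-templates=${node_count} cassandra
whirr.provider=ec2
whirr.identity=${AWS_ACCESS_KEY_ID}
whirr.credential=${AWS_SECRET_ACCESS_KEY}
whirr.private-key-file=\${sys:user.home}/.ssh/id_rsa
EOF
    )
}
```

Calling write_whirr_config cassandra.properties 3 should produce the same file as above. The escaped \${sys:user.home} is written out literally so that Whirr, not the shell, expands it.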

Launch Your Cluster

Now you are ready to fire up your Cassandra cluster. Simply use the following command and then be prepared to wait 5-10 minutes while Amazon builds your machines. This time is variable. Sometimes Amazon is quick, sometimes not so quick.

bin/whirr launch-cluster --config cassandra.properties


Launching mycassandracluster cluster
Configuring template
Starting 3 node(s)
Nodes started: [[id=us-east-1/i-13f25e7f, providerId=i-13f25e7f, tag=mycassandracluster, name=null, location=[id=us-east-1d, scope=ZONE, description=us-east-1d, parent=us-east-1], uri=null, imageId=us-east-1/ami-74f0061d, os=[name=null, family=amzn-linux, version=2010.11.1-beta, arch=paravirtual, is64Bit=true, description=amazon/amzn-ami-2010.11.1-beta.x86_64-ebs], userMetadata={}, state=RUNNING, privateAddresses=[10.204.99.163], publicAddresses=[50.16.155.106], hardware=[id=t1.micro, providerId=t1.micro, name=t1.micro, processors=[[cores=1.0, speed=1.0]], ram=630, volumes=[[id=vol-1657d47e, type=SAN, size=null, device=/dev/sda1, durable=true, isBootDevice=true]], supportsImage=hasRootDeviceType(ebs)]], [id=us-east-1/i-17f25e7b, providerId=i-17f25e7b, tag=mycassandracluster, name=null, location=[id=us-east-1d, scope=ZONE, description=us-east-1d, parent=us-east-1], uri=null, imageId=us-east-1/ami-74f0061d, os=[name=null, family=amzn-linux, version=2010.11.1-beta, arch=paravirtual, is64Bit=true, description=amazon/amzn-ami-2010.11.1-beta.x86_64-ebs], userMetadata={}, state=RUNNING, privateAddresses=[10.117.43.129], publicAddresses=[50.16.85.79], hardware=[id=t1.micro, providerId=t1.micro, name=t1.micro, processors=[[cores=1.0, speed=1.0]], ram=630, volumes=[[id=vol-1457d47c, type=SAN, size=null, device=/dev/sda1, durable=true, isBootDevice=true]], supportsImage=hasRootDeviceType(ebs)]], [id=us-east-1/i-11f25e7d, providerId=i-11f25e7d, tag=mycassandracluster, name=null, location=[id=us-east-1d, scope=ZONE, description=us-east-1d, parent=us-east-1], uri=null, imageId=us-east-1/ami-74f0061d, os=[name=null, family=amzn-linux, version=2010.11.1-beta, arch=paravirtual, is64Bit=true, description=amazon/amzn-ami-2010.11.1-beta.x86_64-ebs], userMetadata={}, state=RUNNING, privateAddresses=[10.117.46.170], publicAddresses=[184.73.100.203], hardware=[id=t1.micro, providerId=t1.micro, name=t1.micro, processors=[[cores=1.0, speed=1.0]], ram=630, volumes=[[id=vol-e857d480, type=SAN, size=null, device=/dev/sda1, durable=true, isBootDevice=true]], supportsImage=hasRootDeviceType(ebs)]]]
Authorizing firewall
Running configuration script
Completed launch of mycassandracluster
Started cluster of 3 instances
Cluster{instances=[Instance{roles=[cassandra], publicAddress=/50.16.85.79, privateAddress=/10.117.43.129}, Instance{roles=[cassandra], publicAddress=/50.16.155.106, privateAddress=/10.204.99.163}, Instance{roles=[cassandra], publicAddress=/184.73.100.203, privateAddress=/10.117.46.170}], configuration={}}
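
The last line of that output is the useful one, because it lists the public address of each node. As a rough sketch, here is one way to pull those addresses out of a saved copy of the output and check whether each node answers on Cassandra's Thrift port. The function names are mine, port 9160 is the default Thrift port for Cassandra releases of this era, and I am assuming your nc supports the -z and -w flags and that the firewall rules allow the connection from your machine.

```shell
#!/bin/sh
# Read Whirr launch output on stdin and print each publicAddress value
# from the "Cluster{...}" summary line, one address per line.
public_addresses() {
    tr ',' '\n' | sed -n 's|.*publicAddress=/\([0-9.]*\).*|\1|p'
}

# Read addresses on stdin and report whether each one accepts TCP
# connections on the Thrift port (9160 is assumed to be unchanged).
check_nodes() {
    while read -r host; do
        if nc -z -w 5 "$host" 9160 </dev/null; then
            echo "$host: reachable on 9160"
        else
            echo "$host: not reachable on 9160"
        fi
    done
}

# Example usage (not run here):
#   bin/whirr launch-cluster --config cassandra.properties | tee launch.log
#   public_addresses < launch.log | check_nodes
```

A node that has only just booted may need a little longer before Cassandra starts listening, so a "not reachable" result straight after launch is not necessarily a problem.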

You now have your very own Cassandra cluster running in the cloud. Not so hard, hey!

Connect From Ruby

I will be following this post with a step-by-step guide on how you can interact with your new cluster from your Ruby on Rails application. I recommend subscribing to the RSS feed to get updates to the blog.

Shut Down The Cluster

Here is how you can shut down your cluster.

bin/whirr destroy-cluster --config cassandra.properties

Destroying mycassandracluster cluster
Cluster mycassandracluster destroyed
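
Running instances cost money, so for short experiments it can help to wrap the launch and destroy commands together so the cluster is always torn down. The following is a sketch under my own assumptions: the with_cluster function and the WHIRR variable are hypothetical helpers, and it relies only on the two Whirr commands shown in this post.

```shell
#!/bin/sh
# Path to the Whirr launcher; override WHIRR if yours lives elsewhere.
WHIRR=${WHIRR:-bin/whirr}

# Launch a cluster, run the given command, then destroy the cluster.
# Usage: with_cluster <config-file> <command> [args...]
# Returns the launch status if the launch failed, otherwise the status
# of the command. destroy-cluster is attempted in both cases.
with_cluster() {
    config=$1
    shift
    status=0
    if "$WHIRR" launch-cluster --config "$config"; then
        "$@" || status=$?
    else
        status=$?
    fi
    "$WHIRR" destroy-cluster --config "$config"
    return "$status"
}

# Example usage (not run here); my_experiment.sh is a placeholder name:
#   with_cluster cassandra.properties ./my_experiment.sh
```

It is still worth checking the AWS console afterwards to confirm that nothing was left running.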

Conclusion

Whirr makes it very easy to start and stop a Cassandra cluster in the cloud without leaving the comfort of your laptop. What you do with that cluster is up to you, but I will give you some ideas of what you could do in future posts.


Published at DZone with permission of Phil Whelan, author and DZone MVB. (source)
