
Baxter Denney is Director of Marketing at Couchbase. Couchbase is the NoSQL leader, with production deployments at AOL, Deutsche Post, NTT Docomo, Salesforce.com, Starbucks, Zynga, and hundreds of other global enterprises. Couchbase Server, our NoSQL database offering, delivers a more scalable, high-performance and cost-effective approach to data management than relational database technology. It is particularly well suited for storing the data behind web applications deployed on modern virtualized or cloud infrastructures. Baxter has posted 4 posts at DZone.

Loading JSON Data in Couchbase

11.18.2012

 Editor's Note: This post was originally written by Don Pinto at the Couchbase blog. 

If you're writing a web application, you're probably already familiar with JSON documents. Couchbase Server supports JSON documents, and sooner or later you will need to import some into it.

But just because you inserted data into Couchbase doesn’t mean that it goes directly to disk. Your data is first written to the in-memory object-managed cache and is then persisted to disk asynchronously in the background, completely decoupled from your action.
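To make the memory-first, disk-later idea concrete, here is a toy Python model of a write-behind store. It is only an illustration of the general pattern, not Couchbase's implementation; the class name, method names, and the two dictionaries standing in for cache and disk are all assumptions made for this sketch.

```python
import queue
import threading


class WriteBehindStore:
    """Toy model: writes land in memory first and are persisted later."""

    def __init__(self):
        self.memory = {}   # stands in for the in-memory cache
        self.disk = {}     # stands in for persistent storage
        self._pending = queue.Queue()
        worker = threading.Thread(target=self._flush_loop, daemon=True)
        worker.start()

    def set(self, key, value):
        # The caller returns as soon as the value is in memory.
        self.memory[key] = value
        self._pending.put(key)

    def _flush_loop(self):
        # Background writer: drains the queue independently of callers.
        while True:
            key = self._pending.get()
            self.disk[key] = self.memory[key]
            self._pending.task_done()

    def wait_until_persisted(self):
        # Block until every queued write has reached "disk".
        self._pending.join()
```

A caller that needs durability would call wait_until_persisted() after set(); a caller that only needs the value to be readable can carry on immediately.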

But what tools does a developer have to get a bunch of JSON data into Couchbase? This post describes the cbdocloader tool in more detail. It saved me a ton of time by allowing me to import the entire Vancouver tree dataset that I was playing with.

Using cbdocloader

The following example shows a cbdocloader invocation and its command line parameters:
/opt/couchbase/bin/tools/cbdocloader -u Administrator -p password -n 10.3.2.54:8091 -b bucket_zip -s 10 output

where
-u is the username
-p is the password
-n is the node IP address and port
-b is the bucket name (if the bucket does not exist, an error will be thrown)
-s is the RAM quota in MB; this parameter is optional (100 MB by default)
The final argument (output in this example) is the directory or .zip file that contains the JSON documents.

The Vancouver Tree Dataset

The City of Vancouver added a new dataset of street trees to the city’s open data catalog. This dataset includes a full address listing of all boulevard trees on the streets of Vancouver, along with the tree type and other characteristics.

Each JSON file in the dataset contains information for all the trees in a particular area. Using a simple Python script, we split each JSON file into multiple files to produce one JSON file per tree. We then loaded the data into Couchbase using the cbdocloader tool.
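The original splitting script is not shown, so the following is only a minimal sketch of what such a script might look like. It assumes each area file holds a top-level JSON array of tree objects, which may not match the real dataset layout; the function name and the output file naming scheme are also choices made for this example.

```python
import json
from pathlib import Path


def split_trees(source_file, output_dir):
    """Write one JSON file per tree record found in source_file.

    Assumes source_file holds a top-level JSON array of tree objects;
    the real dataset layout may differ.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    stem = Path(source_file).stem
    with open(source_file, encoding="utf-8") as handle:
        trees = json.load(handle)
    written = []
    for index, tree in enumerate(trees):
        # File name (area file stem plus position) is an arbitrary choice.
        target = out / f"{stem}_{index}.json"
        target.write_text(json.dumps(tree), encoding="utf-8")
        written.append(target)
    return written
```

Calling split_trees once per area file, with the same output directory each time, would leave a single folder of per-tree documents ready to hand to cbdocloader.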

Loading the individual JSON files into Couchbase

The source documents fed into cbdocloader can be in a particular directory or in .zip format.

To load the JSON documents in a folder:
/opt/couchbase/bin/tools/cbdocloader -u Administrator -p password -n 10.3.2.54:8091 -b bucket -s 1000 output

To load a zipped folder that contains JSON documents:
/opt/couchbase/bin/tools/cbdocloader -u Administrator -p password -n 10.3.2.54:8091 -b bucket_zip -s 1000 output.zip
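If you run these loads from a script, it can help to assemble the command as an argument list rather than a single string. The sketch below uses only the flags shown in this post; the helper names and keyword arguments are hypothetical, and the default tool path is simply the one used in the examples here.

```python
import subprocess


def build_cbdocloader_command(source, bucket, node, username, password,
                              ram_quota_mb=None,
                              tool="/opt/couchbase/bin/tools/cbdocloader"):
    """Return the argument list for a cbdocloader run."""
    command = [tool, "-u", username, "-p", password, "-n", node, "-b", bucket]
    if ram_quota_mb is not None:
        # -s is optional; leaving it out keeps the tool's default quota.
        command += ["-s", str(ram_quota_mb)]
    command.append(source)  # directory or .zip file of JSON documents
    return command


def load_documents(**kwargs):
    # Runs the tool; raises CalledProcessError on a non-zero exit status.
    subprocess.run(build_cbdocloader_command(**kwargs), check=True)
```

For example, build_cbdocloader_command(source="output.zip", bucket="bucket_zip", node="10.3.2.54:8091", username="Administrator", password="password", ram_quota_mb=1000) yields the same arguments as the zipped-folder command in this post.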

Interesting Data Facts

So can you guess how many trees are in the Vancouver Tree dataset?
Clue: it is the item count of the bucket once the data is loaded.

Do you know which Vancouver neighborhood has the tallest tree in the city?

Now that you have loaded the data into Couchbase, try to write a simple view to figure out the answer. We will revisit this question in our view blog series, so stay tuned, folks!
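Before writing the view, you may want to sanity-check the expected answer against the per-tree JSON files on disk. The sketch below is a local Python stand-in for the map and reduce steps, not a Couchbase view. The field names neighbourhood and height are hypothetical placeholders; substitute whatever the dataset actually uses.

```python
import json
from pathlib import Path


def tallest_tree_by_neighbourhood(doc_dir, area_field="neighbourhood",
                                  height_field="height"):
    """Return {area: max height} across the per-tree JSON files in doc_dir."""
    tallest = {}
    for path in Path(doc_dir).glob("*.json"):
        doc = json.loads(path.read_text(encoding="utf-8"))
        area, height = doc.get(area_field), doc.get(height_field)
        if area is None or height is None:
            continue  # skip documents missing either (hypothetical) field
        # "Map" emits (area, height); "reduce" keeps the maximum per area.
        if height > tallest.get(area, float("-inf")):
            tallest[area] = height
    return tallest


def neighbourhood_with_tallest_tree(doc_dir, **fields):
    tallest = tallest_tree_by_neighbourhood(doc_dir, **fields)
    return max(tallest, key=tallest.get) if tallest else None
```

The same shape carries over to a view: the map step emits the neighborhood as the key and the height as the value, and the reduce step keeps the maximum.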

---
Thanks to Abhinav for putting the screenshots together.

Published at DZone with permission of its author, Baxter Denney. (source)
