
Kristina Chodorow is a core contributor to MongoDB. She has written several O'Reilly books (MongoDB: The Definitive Guide, Scaling MongoDB, and 50 Tips and Tricks for MongoDB Developers) and has given talks at conferences around the world, including OSCON, FOSDEM, Latinoware, TEK·X, and YAPC. Her Twitter handle is @kchodorow. Kristina is a DZone MVB, is not an employee of DZone, and has posted 52 posts at DZone.

MapReduce Star Trek Fan Fiction

04.26.2012

 The following is an early post by Kristina Chodorow, one of our most creative MVBs, and it should be interesting to the NoSQL crowd.

MapReduce is really cool, useful, and powerful, but a lot of people find it hard to wrap their heads around. This post is a fairly silly, non-technical explanation using Star Trek.

The Enterprise found a new planet, as it tends to do.

Kirk wanted to beam down immediately and start surveying the planet, but Spock told him to wait a moment. “It usually takes us one hour to survey a planet, correct, Captain? In less than 5 minutes, I can calculate whether the chance of encountering friendly alien females outweighs the risk of attack by brain-eating monsters.”

“Interesting idea, Spock,” said Kirk. “Go ahead.”

The Data

“Logically,” thought Spock, “if we can survey a whole planet in one hour, we can survey 1/16th of a planet in 3.75 minutes.”  Spock divided the planet into 16 equal-size pieces and summoned 16 red shirts.
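Spock's arithmetic can be checked in a couple of lines. This is only a sketch of his reasoning, and it assumes (as he does) that the work splits perfectly evenly, with no time lost to beaming people up and down. The constant names are made up for illustration.

```python
# Figures from the story: one hour to survey a whole planet, 16 equal pieces.
SURVEY_MINUTES = 60
PIECES = 16

# If 16 red shirts work in parallel, each one only surveys 1/16th of the planet.
minutes_per_piece = SURVEY_MINUTES / PIECES
print(minutes_per_piece)  # 3.75
```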

“You’ll be beamed down to the surface of the planet with this special data collection device called an ‘emitter.’ If you see a brain-eating monster, you press the ‘brain-eating monster’ button on your emitter. If you see an attractive female alien, you press the ‘hot alien chick’ button. Press either, neither, or both buttons, as your situation requires.”

The Map Step

The 16 red shirts were beamed down to the 16 parts of the planet. As they found things, they would press the buttons on their emitters.

Back on the Enterprise, Spock started getting lots of data pairs that looked like:

| type                 | location |
|----------------------|----------|
| Brain-eating monster | 2        |
| Hot alien chick      | 7        |
| Brain-eating monster | 14       |
| Brain-eating monster | 7        |
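In code, each red shirt's job might look like the sketch below. The function name `map_sector` and the sightings data are hypothetical, chosen only so that the emits match the four pairs in the table above (the order they arrive in may differ).

```python
def map_sector(location, sightings):
    """Yield one (type, location) pair for every button press in a sector."""
    for kind in sightings:
        yield (kind, location)

# Hypothetical sightings per sector, invented to reproduce the table above.
sightings_by_sector = {
    2: ["Brain-eating monster"],
    7: ["Hot alien chick", "Brain-eating monster"],
    14: ["Brain-eating monster"],
}

# Everything Spock receives back on the Enterprise.
emits = [
    pair
    for location, sightings in sightings_by_sector.items()
    for pair in map_sector(location, sightings)
]

for kind, location in emits:
    print(kind, location)
```

Note that a red shirt never looks at anyone else's sector. That independence is what lets all 16 work at the same time.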

The Reduce Step

“Computer,” Spock said. “Initialize a counter to 0 for each new type you get. Then, for every subsequent data pair with the same type, increment that counter.”

“I dinnae understand,” said Scotty. “What’s that, then?”

“I basically told the computer to initialize two variables, ‘Brain-eating monster’ and ‘Hot alien chick’, setting them both to zero. Every time the computer gets a ‘Brain-eating monster’ emit, it increments the ‘Brain-eating monster’ variable. Every time it gets a ‘Hot alien chick’ emit, it increments the ‘Hot alien chick’ variable.”
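What Spock describes to Scotty could be sketched roughly like this, using the four pairs from the table in the map step as input. This is one reading of his instructions, not the Enterprise computer's actual code.

```python
from collections import defaultdict

# The four data pairs from the table above, as (type, location) tuples.
emits = [
    ("Brain-eating monster", 2),
    ("Hot alien chick", 7),
    ("Brain-eating monster", 14),
    ("Brain-eating monster", 7),
]

counts = defaultdict(int)  # any type seen for the first time starts at 0
for kind, _location in emits:
    counts[kind] += 1  # every emit of a type increments that type's counter

print(dict(counts))
# {'Brain-eating monster': 3, 'Hot alien chick': 1}
```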

“Ah, I see,” said Scotty. “But don’t you lose the location information?”

“Yes,” replied Spock. “But I don’t actually care about location for this readout. If I wanted the location, I could give the computer a slightly more complicated algorithm, but right now I just want the count.”
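Spock never says what the "slightly more complicated algorithm" is. One guess, purely as an illustration, is to collect a list of locations per type instead of a bare count, again using the pairs from the table in the map step:

```python
from collections import defaultdict

# The four data pairs from the table above, as (type, location) tuples.
emits = [
    ("Brain-eating monster", 2),
    ("Hot alien chick", 7),
    ("Brain-eating monster", 14),
    ("Brain-eating monster", 7),
]

locations = defaultdict(list)  # any type seen for the first time starts empty
for kind, location in emits:
    locations[kind].append(location)  # remember where each sighting happened

print(dict(locations))
# {'Brain-eating monster': [2, 14, 7], 'Hot alien chick': [7]}
```

The count is still recoverable as the length of each list, at the cost of holding on to more data.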

The Result

After 3.75 minutes, Spock beamed up the red shirts who were still alive and reported to Kirk: “There are brain-eating monsters on 7/8ths of the planet, Captain. 1/16th of the planet has hot alien chicks.”

“Excellent work, Spock,” said Kirk. “Let’s boldly go somewhere else.”

And so they did.

Captain’s log, stardate 1419.7 (aka a summary of what we did)

  1. Goal – To generate a report on a planet.
  2. Data – 16 pieces of land with various attributes. Each piece of land could be represented by a JSON object such as:
{
    "location" : 5,
    "contains" : ["Brain-eating monsters", "rocks", "poison gas"]
}

  3. Map – Send attributes for each piece of data back to the processor. In JSON, each emit would look something like:

{
    "Brain-eating monsters" : 5
}

  4. Reduce – Sum up the data, grouping by type.
  5. Result – How much of each attribute is on the planet.
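The five steps above can be strung together in a short Python sketch. The land data is hypothetical (only three of the 16 pieces are filled in), the names `map_piece` and `reduce_counts` are invented for this example, and it assumes each attribute is listed at most once per piece. It shows the general shape of a map step followed by a reduce step, not how MongoDB or any other system implements MapReduce.

```python
from collections import Counter
from fractions import Fraction

# Data: hypothetical survey results in the shape of the JSON object above.
TOTAL_PIECES = 16
land = [
    {"location": 5, "contains": ["Brain-eating monsters", "rocks", "poison gas"]},
    {"location": 6, "contains": ["rocks"]},
    {"location": 7, "contains": ["Brain-eating monsters", "Hot alien chicks"]},
]

def map_piece(piece):
    """Map: emit one (attribute, location) pair per attribute of a piece."""
    for attribute in piece["contains"]:
        yield attribute, piece["location"]

def reduce_counts(pairs):
    """Reduce: count the emits for each attribute, ignoring location."""
    return Counter(attribute for attribute, _location in pairs)

pairs = [pair for piece in land for pair in map_piece(piece)]
counts = reduce_counts(pairs)

# Result: the fraction of the planet's pieces that contain each attribute.
report = {attribute: Fraction(n, TOTAL_PIECES) for attribute, n in counts.items()}
for attribute, share in report.items():
    print(f"{attribute}: {share} of the planet")
```

Because `map_piece` only ever looks at one piece of land, the map calls could be handed to 16 different workers, just as Spock handed them to 16 red shirts.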

 

Published at DZone with permission of Kristina Chodorow, author and DZone MVB. (source)
