Mitch Pronschinske is the Lead Research Analyst at DZone. Researching and compiling content for DZone's research guides is his primary job. He likes to make his own ringtones, watches cartoons/anime, enjoys card and board games, and plays the accordion. Mitch is a DZone Zone Leader and has posted 2578 posts at DZone.

Hadoop Study Reveals Usage Stats, Benefits, and Challenges

10.21.2010
A new survey on Hadoop suggests that companies using the Apache project's utilities (which include Hadoop Common, ZooKeeper, HDFS, Hive, MapReduce, etc.) are finding more uses for the open source software and bringing experimental Hadoop projects into production. LaunchPad, which was commissioned by Karmasphere, surveyed 102 Hadoop users in August. The study found that nearly 68% of Hadoop projects started in the experimental phase and that, within a year, 86% had moved to active development or production.

The survey also suggests that organizations find the software more useful the longer they use it. Of the organizations that had used Hadoop for more than a year, 65% listed more than three reasons for using it. Newer users presumably had less exposure to its range of uses.

These were the top three reasons mentioned for using Hadoop:

  1. Mining data for improved business intelligence
  2. Reducing the cost of data analysis
  3. Log analysis

These were some of the main challenges respondents listed in using Hadoop:

  • Steep learning curve
  • Hiring qualified people
  • Low availability of good products and tooling
  • Not enough information on how to get started

The programming-related challenges included:

  1. Debugging Hadoop jobs
  2. Monitoring Hadoop jobs
  3. Insufficient information about Hadoop
  4. Availability of useful algorithms
  5. Writing Hadoop jobs

Other parts of the survey found that Hadoop is usually introduced by an organization's developers rather than by management.

Based on certain survey questions, LaunchPad projects 50-60% growth in the number of Hadoop developers at organizations already using Hadoop. It also expects Java to remain the primary language for Hadoop. Usage of streaming and the Pig sub-project should remain constant, while usage of Hive/SQL and Mahout is expected to jump. Since all of the respondents were current Hadoop users, we can't be sure how many organizations try Hadoop and give up.