Monday, 28 July 2014

Hadoop Ecosystem - Projects

I discussed the Hadoop ecosystem in my last post. Let's have a look at some of the more dynamic projects in the ecosystem.

Pig

Pig is a data flow language that provides a high-level abstraction over the Hadoop framework. A Pig script is a sequence of operations that extract or query the relevant data from HDFS, which makes it a flexible way to handle data. It is a batch query language: the Pig interpreter converts each script into MapReduce programs under the hood, and the generated MapReduce code runs against HDFS to retrieve the desired data. Pig was developed at Yahoo and is an open-source project with a strong community.
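To make the "data flow becomes MapReduce" idea concrete, here is a rough sketch in plain Python. It is not Pig and not what the Pig interpreter actually generates; it only shows how a filter, group and count flow can be split into a map step, a shuffle and a reduce step. The records and the function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical input: (user, url, response_time_ms) records standing in for a file on HDFS.
RECORDS = [
    ("alice", "/home", 120),
    ("bob", "/home", 340),
    ("alice", "/cart", 95),
    ("carol", "/home", 410),
    ("bob", "/cart", 80),
]

def map_phase(records, min_ms):
    """Filter and project: emit (url, 1) for each record at or above min_ms."""
    for _user, url, ms in records:
        if ms >= min_ms:
            yield url, 1

def shuffle(pairs):
    """Group: collect all values emitted for the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Count: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}

def slow_hits_per_url(records, min_ms=100):
    """Chain the three steps the way a filter -> group -> count flow would run."""
    return reduce_phase(shuffle(map_phase(records, min_ms)))

print(slow_hits_per_url(RECORDS))
```

In a real cluster the map and reduce steps run in parallel on many machines, and the shuffle moves data between them over the network; the shape of the computation is the same.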

Hive

Hive offers a SQL-like language that provides a high-level abstraction over HDFS, just like Pig. This language is called HiveQL, and it is heavily inspired by MySQL. Hive was developed so that data analysts who want to use the Hadoop platform but are not well versed in writing MapReduce programs in Java can access it with SQL-style queries. Hive was developed at Facebook and is now an open-source Apache project. Currently, Hive is one of the most dynamic projects in the Hadoop ecosystem and keeps getting richer.

HBase

HBase is a column-oriented database that sits on top of the Hadoop framework and provides near real-time capability to the platform. HBase can work in tandem with Pig and Hive to provide flexible interaction with Hadoop. It is inspired by Google's BigTable and is another dynamic project in the Hadoop ecosystem.
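The BigTable-style data model is easier to picture with a toy example. The sketch below is an in-memory model in plain Python, not the HBase client API: each row key maps to columns named "family:qualifier", and each cell keeps several timestamped versions. The class name, method names and sample rows are assumptions made up for illustration.

```python
from collections import defaultdict

class TinyColumnStore:
    """Toy in-memory model: row key -> "family:qualifier" -> {timestamp: value}."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, column, value, timestamp):
        """Store one version of a cell."""
        self._rows[row_key][column][timestamp] = value

    def get(self, row_key, column):
        """Return the newest version of a cell, or None if it does not exist."""
        versions = self._rows.get(row_key, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]

    def scan(self, start_row, stop_row):
        """Yield (row_key, latest cells) for start_row <= key < stop_row, in key order."""
        for row_key in sorted(self._rows):
            if start_row <= row_key < stop_row:
                cells = self._rows[row_key]
                yield row_key, {col: v[max(v)] for col, v in cells.items()}

store = TinyColumnStore()
store.put("user#001", "info:name", "Alice", timestamp=1)
store.put("user#001", "info:name", "Alice B.", timestamp=2)
store.put("user#002", "info:name", "Bob", timestamp=1)
store.put("user#002", "stats:logins", 7, timestamp=1)

print(store.get("user#001", "info:name"))
print(list(store.scan("user#001", "user#002")))
```

Looking up a single row by key is what gives this kind of store its fast random reads, and keeping rows sorted by key is what makes range scans cheap.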

Sqoop

Sqoop is short for SQL-to-Hadoop. It is an open-source project under the Apache foundation. Sqoop is used to transfer data between HDFS and an RDBMS.
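As a rough picture of what moving a table out of an RDBMS into files looks like, here is a small Python sketch. It is not Sqoop and does not use any Sqoop API: an in-memory SQLite database stands in for the RDBMS, a temporary directory stands in for HDFS, and the table is split into key ranges that are each written to their own delimited part file. The function name, table and file names are assumptions made up for illustration.

```python
import os
import sqlite3
import tempfile

def import_table(conn, table, split_col, num_splits, out_dir):
    """Copy `table` into comma-delimited part files, one per range of `split_col`.

    Assumes a non-empty table and an integer split column.
    """
    lo, hi = conn.execute(
        f"SELECT MIN({split_col}), MAX({split_col}) FROM {table}"
    ).fetchone()
    step = (hi - lo) // num_splits + 1  # width of each key range
    paths = []
    for i in range(num_splits):
        start = lo + i * step
        rows = conn.execute(
            f"SELECT * FROM {table} "
            f"WHERE {split_col} >= ? AND {split_col} < ? ORDER BY {split_col}",
            (start, start + step),
        ).fetchall()
        path = os.path.join(out_dir, f"part-{i:05d}")
        with open(path, "w") as f:
            for row in rows:
                f.write(",".join(str(v) for v in row) + "\n")
        paths.append(path)
    return paths

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, "alice"), (2, "bob"), (3, "carol"), (4, "dave"), (5, "erin")],
)

out_dir = tempfile.mkdtemp()
paths = import_table(conn, "employees", "id", 2, out_dir)
for p in paths:
    with open(p) as f:
        print(os.path.basename(p), f.read().split())
```

Splitting by key range is what lets several workers copy different slices of the same table at the same time.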

ZooKeeper

ZooKeeper is another open-source project under the Apache foundation. It is a distributed coordination service that provides primitives, such as distributed locks, which can be used to build distributed applications.
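To give a feel for how a distributed lock can be built from simple primitives, here is a toy Python sketch. It runs in a single process and is not the ZooKeeper client API; it only mimics the common idea of giving every lock request an increasing sequence number and treating the lowest outstanding number as the lock holder. The class and method names are assumptions made up for illustration.

```python
import itertools

class ToyLockService:
    """Single-process stand-in for a coordinator that hands out sequence numbers."""

    def __init__(self):
        self._counter = itertools.count()
        self._waiting = {}  # sequence number -> client name

    def enqueue(self, client):
        """Register a lock request and return the sequence number assigned to it."""
        seq = next(self._counter)
        self._waiting[seq] = client
        return seq

    def holder(self):
        """The client with the lowest outstanding sequence number holds the lock."""
        if not self._waiting:
            return None
        return self._waiting[min(self._waiting)]

    def release(self, seq):
        """Remove a request: either a release or a client that went away."""
        self._waiting.pop(seq, None)

svc = ToyLockService()
a = svc.enqueue("worker-a")
b = svc.enqueue("worker-b")
c = svc.enqueue("worker-c")

print(svc.holder())   # worker-a asked first
svc.release(b)        # worker-b stops waiting
svc.release(a)        # worker-a is done
print(svc.holder())   # the next lowest request now holds the lock
```

The real service adds what this toy leaves out: the queue is replicated across servers, and a client that crashes has its request removed automatically, so a dead lock holder cannot block everyone else forever.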
