When to use Hadoop, HBase, Hive and Pig?

What are the benefits of using Hadoop, HBase, or Hive?

Here is how I have used Hive, HBase and Pig, based on my hands-on experience in different projects.

Hive is used mostly for:

  • Analytics where you need to analyze historical data

  • Generating business reports based on certain columns

  • Efficiently managing data together with its metadata

  • Joining tables on frequently used columns, using bucketing

  • Efficient storage and querying, using partitioning

  • Not well suited to transactional or row-level operations such as update and delete
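
To make the partitioning and bucketing points above concrete, here is a minimal Python sketch of the idea, not Hive itself. The table contents, the column names (`day`, `user_id`), the bucket count and the CRC32 hash are all assumptions for illustration; Hive uses its own storage layout and hash functions.

```python
import zlib
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical bucket count, chosen only for this sketch


def bucket_of(key, num_buckets=NUM_BUCKETS):
    # Stand-in hash; Hive applies its own hash function per column type.
    return zlib.crc32(str(key).encode("utf-8")) % num_buckets


def partition_by(rows, column):
    """Split rows by the value of one column, like one directory per partition."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[column]].append(row)
    return parts


def bucket_by(rows, column, num_buckets=NUM_BUCKETS):
    """Spread rows over a fixed number of buckets by hashing one column."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[bucket_of(row[column], num_buckets)].append(row)
    return buckets


def bucket_join(left_buckets, right_buckets, column):
    """Join two bucketed tables, comparing rows only within matching buckets."""
    joined = []
    for bucket_id, left_rows in left_buckets.items():
        for left_row in left_rows:
            for right_row in right_buckets.get(bucket_id, []):
                if left_row[column] == right_row[column]:
                    joined.append({**left_row, **right_row})
    return joined


# Made-up sample data.
orders = [
    {"day": "2024-01-01", "user_id": "u1", "amount": 10},
    {"day": "2024-01-01", "user_id": "u2", "amount": 5},
    {"day": "2024-01-02", "user_id": "u1", "amount": 7},
]
users = [
    {"user_id": "u1", "country": "IN"},
    {"user_id": "u2", "country": "US"},
    {"user_id": "u3", "country": "DE"},
]

# A filter on the partition column touches only that partition's rows.
jan1 = partition_by(orders, "day")["2024-01-01"]

# Both sides are bucketed on the join column, so only equal bucket ids meet.
report = bucket_join(bucket_by(jan1, "user_id"), bucket_by(users, "user_id"), "user_id")
```

The point of the sketch is that a query for one day never looks at the other days, and the join never compares rows that landed in different buckets. That is roughly why these two features help with reporting over historical data.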

Pig is mostly used for:

  • Frequent analysis of very large datasets

  • Generating aggregated values and counts over very large datasets

  • Generating enterprise-level key performance indicators (KPIs) very frequently
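
A Pig script for this kind of work is usually a short data flow: load, filter, group, then aggregate each group. The Python sketch below only mimics that shape in memory. The event records, the column names and the helper functions are hypothetical, and none of this is Pig's API.

```python
from collections import defaultdict

# Made-up input records standing in for a large loaded dataset.
events = [
    {"region": "north", "status": "ok", "revenue": 120.0},
    {"region": "north", "status": "ok", "revenue": 80.0},
    {"region": "south", "status": "ok", "revenue": 50.0},
    {"region": "south", "status": "failed", "revenue": 999.0},
]


def filter_rows(rows, predicate):
    """Keep only matching rows (the FILTER step of the flow)."""
    return [row for row in rows if predicate(row)]


def group_by(rows, column):
    """Collect rows under each distinct value of a column (the GROUP step)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)
    return dict(groups)


def summarize(groups, value_column):
    """Produce a count and a total per group (the per-group aggregate step)."""
    return {
        key: {"count": len(rows), "total": sum(row[value_column] for row in rows)}
        for key, rows in groups.items()
    }


completed = filter_rows(events, lambda row: row["status"] == "ok")
kpis = summarize(group_by(completed, "region"), "revenue")
```

Here `kpis` holds one count and one revenue total per region, which is the kind of aggregated value or KPI I mean above. In Pig the same steps would run as distributed jobs over the full data rather than over a Python list.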

HBase is mostly used:

  • For real-time processing of data

  • For efficiently managing complex and nested schemas

  • For real-time querying with fast results

  • For easy scalability in the number of columns

  • For transactional or row-level operations such as update and delete
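
The row-level and column-scalability points are easier to see with a toy model. The sketch below treats a table as a map from row key to `family:qualifier` cells, with column families fixed up front and qualifiers added freely per row. `SketchTable`, its methods and the sample keys are hypothetical names for illustration, not the HBase client API, and cell versions and timestamps are left out.

```python
class SketchTable:
    """Toy in-memory model: row key -> {"family:qualifier": value}."""

    def __init__(self, column_families):
        # Families are declared when the table is created.
        self.column_families = set(column_families)
        self.rows = {}

    def put(self, row_key, column, value):
        """Insert or overwrite a single cell, addressed by row key and column."""
        family, _, _qualifier = column.partition(":")
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        self.rows.setdefault(row_key, {})[column] = value

    def get(self, row_key):
        """Fetch one row directly by its key; a missing row yields an empty dict."""
        return dict(self.rows.get(row_key, {}))

    def delete(self, row_key, column=None):
        """Delete one cell, or the whole row when no column is given."""
        if column is None:
            self.rows.pop(row_key, None)
        else:
            self.rows.get(row_key, {}).pop(column, None)


users = SketchTable(["info", "stats"])
users.put("user#1", "info:name", "Asha")
users.put("user#1", "stats:logins", 3)
users.put("user#2", "info:name", "Ravi")
users.put("user#2", "info:city", "Pune")   # extra column on this row only
users.put("user#1", "stats:logins", 4)     # row-level update of one cell
users.delete("user#2", "info:city")        # row-level delete of one cell
```

Every operation goes straight to a row key, so reads and writes of a single row do not require scanning the table. A row can also carry columns that no other row has, without any schema change.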