Hadoop Developer Self Learning Outline


Learning Hadoop is not hard, but it requires patience.
"I want to learn Hadoop, but where should I start?"
If you are asking that question, here is a draft outline for learning Hadoop.

Hadoop Quiz presents a learning approach for beginners.
Prepare according to the outline below, and nothing will stop you from becoming a HADOOPER.

Understanding Big Data

3V (Volume-Variety-Velocity) characteristics
Structured and Unstructured Data
Application and use cases of Big Data
Limitations of traditional large Scale systems

Hadoop Introduction

Hadoop history and concepts
Ecosystem
Distributions
High level architecture

These topics are covered in two parts; please refer to the posts below.

Hadoop Introduction | Hadoop Developer Self Learning
Hadoop Introduction | Hadoop Developer Self Learning

HDFS

Concepts (distributed storage, horizontal scaling, replication, rack awareness)
Architecture
NameNode (function, storage, file system metadata, and block reports)
Secondary NameNode
DataNode
Configuration files
Single node and multi node installation
Communications / heartbeats
Block manager / balancer
Health check / safemode
Read / write path
Navigating HDFS UI
Command-line interaction with HDFS
File system abstractions
Reading / writing files using Java API
Latest in HDFS
NameNode HA and Federation
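
To make the block and replication concepts above concrete, here is a small arithmetic sketch. It assumes a 128 MB block size and a replication factor of 3, which are the usual defaults in Hadoop 2.x and later (`dfs.blocksize` and `dfs.replication`); your cluster may be configured differently. The function name `hdfs_footprint` is made up for this illustration and is not part of any Hadoop API.

```python
import math

# Assumed defaults; check dfs.blocksize and dfs.replication on your own cluster.
BLOCK_SIZE_MB = 128
REPLICATION_FACTOR = 3


def hdfs_footprint(file_size_mb, block_size_mb=BLOCK_SIZE_MB,
                   replication=REPLICATION_FACTOR):
    """Estimate how one file is laid out in HDFS (illustrative helper)."""
    # A file is split into fixed-size blocks; the last block may be partial.
    blocks = math.ceil(file_size_mb / block_size_mb)
    return {
        "blocks": blocks,
        # Every block is stored `replication` times on different DataNodes.
        "block_replicas": blocks * replication,
        # A partial last block only occupies its real size, so raw usage
        # is simply the file size times the replication factor.
        "raw_storage_mb": file_size_mb * replication,
    }


if __name__ == "__main__":
    print(hdfs_footprint(1024))  # a 1 GB file
```

Under these assumptions a 1 GB file becomes 8 blocks, stored as 24 block replicas that take about 3 GB of raw disk across the cluster.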

MapReduce

MapReduce concepts
Daemons: JobTracker / TaskTracker
Phases: driver, mapper, shuffle/sort, and reducer
First MapReduce job
MapReduce programs (Word Count, Word Co-occurrence, Average Word Length, Inverted Index)
MapReduce UI walk through
Counters
Distributed cache
Combiners
Partitioners
MapReduce configuration
Job config
MR types and formats
Sorting
Optimizing MapReduce
YARN Introduction
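
Before writing a first real job, it can help to see the phases listed above (driver, mapper, shuffle/sort, reducer) in miniature. The sketch below imitates Word Count in plain Python, in a single process and without Hadoop; the function names are chosen for this illustration only and do not correspond to the Hadoop Java API.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_sort(pairs):
    """Shuffle/sort phase: group values by key and order the keys."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reducer(word, counts):
    """Reduce phase: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Driver: wire the three phases together for a list of input lines."""
    mapped = (pair for line in lines for pair in mapper(line))
    return dict(reducer(word, counts) for word, counts in shuffle_sort(mapped))


if __name__ == "__main__":
    sample = ["Hadoop stores data", "Hadoop processes data"]
    print(word_count(sample))
```

In a real cluster the mapper and reducer run as separate tasks on different nodes and the framework performs the shuffle/sort over the network; only the data flow is the same as in this toy version.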

Hive

Hive introduction
Environment and configuration
Hive tables and metadata
HiveQL (DDL & DML operations)
External vs Managed Tables
Partitions & Buckets
User Defined Functions
JSON & Regex SerDe

Pig

Pig Basics, Loading data files
Pig versus MapReduce
Data Types
Pig Latin language constructs (LOAD, STORE, DUMP, SPLIT, etc.)
User Defined Functions

Sqoop

Sqoop Basics
Importing data from and exporting data to an RDBMS
Hands-on exercises: import and export

Flume

Introduction to Flume
Flume sources, channels, sinks, and agents
Flume Examples

Oozie

Introduction to Oozie
Oozie Workflow
Deploy and run a sample Oozie workflow

NoSQL

Introduction to NoSQL
Different types of NoSQL databases (Key Value, Columnar, Document, Graph)
MongoDB and Neo4j introduction

HBase

Introduction to HBase
Architecture
Configuration
HBase versus RDBMS
HBase shell
HBase Java API
Splits and compaction
Read path / write path
Schema design

Hadoop PoC

Web Log Analysis (small PoC)
Twitter Analysis (small PoC)
Hadoop use cases

Get your hands dirty with the Hadoop framework.
A Hadoop blog: https://hadoopquiz.blogspot.in/
FB Page: https://www.facebook.com/hadoopquiz
