Hadoop Developer Self Learning Outline
Learning Hadoop is not tough, but it requires patience.
"I want to learn Hadoop, but where should I start?"
If you are looking for an answer to that question, here is our draft outline for learning Hadoop.
Hadoop Quiz presents a learning approach for beginners.
Prepare according to the outline below and no one will stop you from becoming a HADOOPER.
Understanding Big Data
- 3V (Volume-Variety-Velocity) characteristics
- Structured and Unstructured Data
- Application and use cases of Big Data
- Limitations of traditional large Scale systems
Hadoop Introduction
- Hadoop history and concepts
- Ecosystem
- Distributions
- High level architecture
HDFS
- Concepts (distributed storage, horizontal scaling, replication, rack awareness)
- Architecture
- Namenode (function, storage, file system meta-data, and block reports)
- Secondary namenode
- Data node
- Configuration files
- Single node and multi node installation
- Communications / heart-beats
- Block manager / balancer
- Health check / safemode
- Read / write path
- Navigating HDFS UI
- Command-line interaction with HDFS
- File systems abstractions
- Reading / writing files using Java API
- Latest in HDFS
- Namenode HA and Federation
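
The storage concepts in the HDFS list above (blocks and replication) come down to simple arithmetic. The following is a minimal sketch in plain Java, not Hadoop API code. The class and method names are made up for illustration, and the 128 MB block size and replication factor of 3 are assumed values (they are the usual defaults in Hadoop 2 and later, but your cluster configuration may differ).

```java
// Illustrative arithmetic only; HdfsBlockMath is a hypothetical helper, not a Hadoop class.
class HdfsBlockMath {
    static final long MB = 1024L * 1024L;

    // Number of blocks a file is split into; the last block may be partial.
    static long blockCount(long fileSizeBytes, long blockSizeBytes) {
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes; // ceiling division
    }

    // Raw bytes used across the cluster: every block is stored 'replication' times.
    // A partial last block only occupies its actual size, so this is size * replication.
    static long rawStorageBytes(long fileSizeBytes, int replication) {
        return fileSizeBytes * replication;
    }

    public static void main(String[] args) {
        long fileSize = 1000 * MB;  // example file size
        long blockSize = 128 * MB;  // assumed block size
        int replication = 3;        // assumed replication factor

        System.out.println(blockCount(fileSize, blockSize));             // prints 8
        System.out.println(rawStorageBytes(fileSize, replication) / MB); // prints 3000
    }
}
```

With these assumed values, a 1000 MB file becomes 8 blocks (7 full, 1 partial) and consumes about 3000 MB of raw disk space across the data nodes.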
MapReduce
- MapReduce concepts
- Daemons: jobtracker / tasktracker
- Phases: driver, mapper, shuffle/sort, and reducer
- First MapReduce job
- MapReduce programs (Word Count, Word Co-occurrence, Average Word Length, Inverted Index)
- MapReduce UI walk through
- Counters
- Distributed cache
- Combiners
- Partitioners
- MapReduce configuration
- Job config
- MR types and formats
- Sorting
- Optimizing MapReduce
- YARN Introduction
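
Before writing a first real job, it can help to see the phases listed above (driver, mapper, shuffle/sort, reducer) in miniature. The sketch below simulates Word Count in plain Java on a single machine. It deliberately uses no Hadoop classes, so the names `WordCountSketch`, `map`, `shuffle`, `reduce`, and `wordCount` are illustrative assumptions rather than the Hadoop API, and a real job would extend Hadoop's `Mapper` and `Reducer` classes instead.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Single-process simulation of the MapReduce phases; not Hadoop API code.
class WordCountSketch {

    // Mapper: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle/sort: group all values by key; the TreeMap keeps keys sorted.
    static TreeMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reducer: sum the counts collected for one word.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver: wire the phases together for a small in-memory input.
    static TreeMap<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        TreeMap<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Hadoop stores data", "Hadoop processes data");
        System.out.println(wordCount(lines)); // prints {data=2, hadoop=2, processes=1, stores=1}
    }
}
```

In a real cluster the map calls run in parallel on different nodes and the shuffle moves data over the network. Here everything runs in one process, so only the data flow is the same.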
Hive
- Hive introduction
- Environment and configuration
- Hive tables and metadata
- HiveQL (DDL & DML operations)
- External vs Managed Tables
- Partitions & Buckets
- User Defined Functions
- JSON & Regex SerDe
Pig
- Pig Basics, Loading data files
- Pig versus MapReduce
- Data Types
- Pig Latin language constructs (LOAD, STORE, DUMP, SPLIT, etc.)
- User Defined Functions
Sqoop
- Sqoop Basics
- Importing data from and exporting data to an RDBMS
- Hands On Exercises – Import and Export
Flume
- Introduction to Flume
- Flume sources, channels, sinks, and agents
- Flume Examples
Oozie
- Introduction to Oozie
- Oozie Workflow
- Deploy and Run sample Oozie Workflow
NoSQL
- Introduction to NoSQL
- Different types of NoSQL databases (Key Value, Columnar, Document, Graph)
- MongoDB and Neo4j introduction
HBase
- Introduction to HBase
- Architecture
- Configuration
- HBase versus RDBMS
- HBase shell
- HBase Java API
- Splits and compaction
- Read path / write path
- Schema design
Hadoop PoC
- Web Log Analysis (Small POC)
- Twitter Analysis (Small POC)
- Hadoop use cases
Hadoop blog: https://hadoopquiz.blogspot.in/
FB Page: https://www.facebook.com/hadoopquiz