
What is Hadoop? and Top Hadoop distributions


10:41:00 PM 10:41:22 PM

What is Hadoop?

Apache project for storing and processing large data sets
Open-source implementation of the ideas behind Google's Big Data solutions (GFS and MapReduce)
Components:
- HDFS (Hadoop Distributed File System)
- YARN (Yet Another Resource Negotiator)
- Data processing models (MapReduce, Impala, Tez, etc.)
- Supporting tools (Pig, Hive, Sqoop, HBase, etc.)
Written in Java
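
To make the MapReduce processing model listed above more concrete, here is a minimal sketch of a word count written in plain Python. It does not use Hadoop or any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and only imitate, on a single machine, the three stages a MapReduce job goes through.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key.

    In a real cluster the framework does this between map and reduce;
    here it is simulated with an in-memory dictionary.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["Hadoop stores data in HDFS", "YARN schedules work on Hadoop"]
    print(word_count(sample))
```

On a real cluster the input lines would come from files in HDFS, and YARN would schedule many map and reduce tasks in parallel across machines; the sketch only shows how the data flows.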

Data storage evolution

1956 - HDD (Hard Disk Drive), now up to 6 TB

1983 - SSD (Solid State Drive), now up to 16 TB

1984 - NFS (Network File System), first NAS (Network Attached Storage) implementation

1987 - RAID (Redundant Array of Independent Disks), now up to ~100 disks

1993 - Disk Arrays, now up to ~200 disks

1994 - Fibre-channel, first SAN (Storage Area Network) implementation

2003 - GFS (Google File System), first Big Data implementation

Top Hadoop distributions

Apache Hadoop
CDH (Cloudera Distribution including Apache Hadoop)
HDP (Hortonworks Data Platform)
MapR M3, M5 and M7
Amazon Elastic MapReduce (EMR)
IBM InfoSphere BigInsights Enterprise Edition
Intel Distribution for Apache Hadoop
