The Evolution of Big Data and Hadoop

2003

  • October: Google publishes the Google File System (GFS) paper, introducing a scalable distributed file system designed for large data-intensive applications.

2004

  • December: Google publishes the MapReduce paper, describing a programming model in which a computation is expressed as a map function and a reduce function that the framework runs in parallel across a cluster (a sketch follows below).
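
As a concrete illustration of the model (not part of the original paper), the canonical word-count job written against Hadoop's Java MapReduce API is sketched below; the class name and the command-line input/output paths are purely illustrative.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: emit (word, 1) for every token in the input split.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: sum the counts that the framework has grouped by word.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // optional local pre-aggregation
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // e.g. an HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // must not exist yet
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

The framework splits the input across mapper tasks, shuffles and sorts the (word, count) pairs by key, and hands each key's values to a reducer; data locality and fault tolerance are handled by the framework rather than by the application code.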

2005

  • April: Doug Cutting and Mike Cafarella begin work on what will become Hadoop as part of the Apache Nutch web-search project, inspired by Google's GFS and MapReduce papers.

2006

  • April 1: The first official release of Apache Hadoop (version 0.1.0) is made available.
  • November: Google publishes the Bigtable paper, describing a distributed storage system for managing structured data at scale.

2008

  • February 19: Yahoo! launches the world's largest Hadoop production application, the Yahoo! Search Webmap, running on a Linux cluster with over 10,000 cores.

2010

  • May: Facebook announces that it has the largest Hadoop cluster in the world, storing 21 petabytes of data.
  • June: Google publishes the Dremel paper, introducing an interactive analysis system for large datasets.

2012

  • June: Facebook's Hadoop cluster grows to 100 petabytes, highlighting the scalability of Hadoop.
  • October: Apache Hadoop YARN (Yet Another Resource Negotiator) is introduced, splitting resource management and job scheduling out of the MapReduce engine so that multiple processing frameworks can share a single Hadoop cluster.

2013

  • March: Cloudera releases Impala, a high-performance, low-latency SQL query engine for Hadoop.
  • July: Apache Tez is released, providing a framework for building high-performance batch and interactive data processing applications.

2014

  • February: Apache Spark becomes a top-level Apache project, offering a fast, general-purpose cluster-computing engine that works directly with data stored in HDFS (see the sketch below).
  • March: Cloudera signs cooperation agreements with Teradata and Red Hat to enhance its big data solutions.
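
A minimal sketch of what that compatibility means in practice: a Spark job can read the same HDFS files that MapReduce jobs write, on the same cluster. The HDFS path and class name below are illustrative, not taken from any of the milestones above, and a Spark 2.x+ deployment is assumed.

```java
import org.apache.spark.api.java.function.FilterFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.SparkSession;

public class SparkOnHadoopData {
  public static void main(String[] args) {
    // Spark can obtain its executors from YARN and read directly from HDFS,
    // so it runs side by side with existing MapReduce jobs on the same cluster.
    SparkSession spark = SparkSession.builder()
        .appName("spark-on-hadoop-data")
        .getOrCreate();

    // Illustrative HDFS path, e.g. the output of an earlier MapReduce job.
    Dataset<String> lines = spark.read().textFile("hdfs:///data/clickstream/part-*");

    long errors = lines
        .filter((FilterFunction<String>) line -> line.contains("ERROR"))
        .count();

    System.out.println("Lines containing ERROR: " + errors);
    spark.stop();
  }
}
```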

2016

  • August: Apache Hadoop 2.7.3 is released, a maintenance release in the 2.7 line containing bug fixes and stability improvements.

2017

  • December: Apache Hadoop 3.0.0 is released, introducing erasure coding in HDFS, support for more than two NameNodes, and other major features.

2018

  • April: Apache Hadoop 3.1.0 is released, adding first-class GPU and FPGA scheduling in YARN and a YARN service framework for long-running applications.

2019

  • January: Apache Hadoop 3.2.0 is released, providing new features and improvements for better performance and stability.

2020

  • July: Apache Hadoop 3.3.0 is released, adding support for ARM architectures and the Java 11 runtime, along with many other improvements.

2021

  • August: Apache Hadoop 3.3.1 is released, including various bug fixes and enhancements.

2022

  • July: Apache Hadoop 3.3.3 is released, providing stability improvements and minor feature updates.

2023

  • June: Apache Hadoop 3.3.6 is released, featuring critical security updates and performance enhancements.

2024

  • March: Apache Hadoop 3.4.0 is released, introducing new features and improvements for modern data processing needs.

This timeline highlights significant milestones in the development of Big Data and Hadoop technologies, reflecting their growth and adaptation to an ever-changing data landscape.