What’s in it for you?
Back in the days when there was no internet, there was less data
and it was often structured
This data was easily stored on
central server storage
How Big Data evolved?
But then the internet boomed and
data grew at a very high rate
A lot of semi-structured and
unstructured data was being
generated
Storing such huge volumes of
data on a single server was not
efficient
There was a need for distributed
storage machines where data
could be stored and processed
in parallel
Data can be stored and
processed on multiple machines
Hadoop is a framework that allows
distributed storage and parallel processing
of big data
What’s in it for you?
What is Hadoop?
Components of Hadoop
What is HDFS?
HDFS Architecture
Hadoop MapReduce
Hadoop MapReduce Example
Hadoop YARN
Demo on MapReduce
What is Hadoop?
Hadoop is a framework that allows you to store large
volumes of data on several node machines
It also helps in processing the data in a parallel manner
(Diagram: 3 TB of data split into three 1 TB parts, one per node.)
Components of Hadoop
HDFS: storing data
YARN: cluster resource management
MapReduce: data processing
What is HDFS?
Hadoop Distributed File System (HDFS) is the storage layer of Hadoop
that stores data in multiple data servers
Data is divided into multiple blocks
The blocks are stored across multiple nodes of the cluster
3 core components
Namenode: master node; contains metadata in RAM and on disk
Secondary Namenode: has a copy of the Namenode’s metadata on disk
Slave node (Datanode): contains the actual data in the form of blocks
HDFS Blocks
HDFS divides large data into blocks. Each block holds 128 MB of data by default
Suppose we have a 542 MB file
It is stored as five blocks: 128 MB + 128 MB + 128 MB + 128 MB + 30 MB = 542 MB
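The arithmetic on this slide can be checked with a short Python sketch. The function name `split_into_blocks` is made up for illustration; it only mirrors the block-size arithmetic and is not how HDFS itself is implemented.

```python
def split_into_blocks(file_size_mb, block_size_mb=128):
    """Return the sizes (in MB) of the blocks needed for a file.

    Hypothetical helper: every block is full except possibly the last one,
    which only holds the remaining data.
    """
    if file_size_mb < 0 or block_size_mb <= 0:
        raise ValueError("sizes must be non-negative and block size positive")
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    sizes = [block_size_mb] * full_blocks
    if remainder:
        sizes.append(remainder)  # the last, partially filled block
    return sizes


print(split_into_blocks(542))  # [128, 128, 128, 128, 30]
```

The last block only occupies as much space as the data it holds (30 MB here), not a full 128 MB.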
Data Replication in HDFS
(Diagram: twelve datanodes (DN), DN 1 to DN 4 in Rack 1, DN 5 to DN 8 in Rack 2 and DN 9 to DN 12 in Rack 3, holding replicas of Blocks A, B, C and D.)
Do you understand what’s
happening here?
Each block of data is replicated three times, on different
datanodes spread across different racks
Initial copy of Block A is created in Rack 1
Initial copy of Block B is created in Rack 2
Initial copies of Block C and D are created in Rack 3
Two replicas of the same block cannot be placed on the same datanode
When the cluster is rack aware, the replicas of a block are not all placed on the same
rack
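The two placement rules can be sketched in Python as follows. This is a simplified illustration, not the actual HDFS placement code: the function `place_replicas` and the topology dictionary are assumptions. It follows the commonly described default pattern of one replica on one rack and the other two on two different datanodes of a second rack.

```python
import random


def place_replicas(topology, rng=None):
    """Pick three (rack, datanode) pairs for one block.

    topology: dict mapping rack name -> list of datanode names (hypothetical).
    Rules illustrated: no two replicas on the same datanode, and the
    replicas are never all on the same rack.
    """
    rng = rng or random.Random()
    racks = [rack for rack, nodes in topology.items() if nodes]
    # The second rack must be able to host two replicas on different nodes.
    roomy = [rack for rack in racks if len(topology[rack]) >= 2]
    if len(racks) < 2 or not roomy:
        raise ValueError("need two racks, one of them with two datanodes")
    second_rack = rng.choice(roomy)
    first_rack = rng.choice([rack for rack in racks if rack != second_rack])
    first_node = rng.choice(topology[first_rack])
    node_2, node_3 = rng.sample(topology[second_rack], 2)  # two distinct nodes
    return [(first_rack, first_node), (second_rack, node_2), (second_rack, node_3)]


topology = {
    "Rack 1": ["DN 1", "DN 2", "DN 3", "DN 4"],
    "Rack 2": ["DN 5", "DN 6", "DN 7", "DN 8"],
    "Rack 3": ["DN 9", "DN 10", "DN 11", "DN 12"],
}
for rack, node in place_replicas(topology):
    print(rack, node)
```

With this pattern, losing a whole rack still leaves at least one replica of the block on another rack.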
Suppose datanode 7 crashes
We will still have 2 copies of Block C data, on DN 4 of Rack 1 and
DN 9 of Rack 3
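The crash scenario can be sketched in Python, assuming the Namenode keeps a block-to-datanode map. The helper names are hypothetical. Only Block C's locations (DN 4, DN 7, DN 9) are taken from the slide; HDFS would normally go on to re-replicate any block that falls below its replication factor.

```python
def surviving_replicas(block_locations, failed_node):
    """Drop the failed datanode from every block's replica list."""
    return {
        block: [dn for dn in nodes if dn != failed_node]
        for block, nodes in block_locations.items()
    }


def under_replicated(block_locations, replication_factor=3):
    """List blocks that now have fewer replicas than the target factor."""
    return [
        block
        for block, nodes in block_locations.items()
        if len(nodes) < replication_factor
    ]


# Block C's replica locations as described on the slide.
locations = {"C": ["DN 4", "DN 7", "DN 9"]}
after_crash = surviving_replicas(locations, "DN 7")
print(after_crash)                    # {'C': ['DN 4', 'DN 9']}
print(under_replicated(after_crash))  # ['C']
```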
HDFS Architecture
(Diagram: the Namenode (master), with a Secondary Namenode beside it, and Datanodes 1 to N (slaves).)
Metadata in RAM: metadata (name, replicas, ...), e.g. /home/foo/data, 3, ...
Metadata in disk: edit log and fsimage
Block locations: DN1: B1, B2; DN2: B1, B3; DN3: B2, B3
HDFS - Namenode
(Diagram: File.txt is split into blocks B1, B2 and B3, stored on Datanodes 1 to N. The Namenode keeps the file system metadata (name, replicas, ...), e.g. /home/foo/data, 3, ..., in RAM, and the edit log and fsimage on disk.)
Namenode is the master server. In a non high availability cluster, there can be only one
Namenode. In a high availability cluster, 2 Namenodes are possible
Namenode holds metadata information about the various
Datanodes, their location, the size of each block, etc.
Helps to execute file system namespace operations –
opening, closing, renaming files and directories
Datanodes send heartbeats to the Namenode every 3 seconds by default, and send block reports periodically
HDFS - Datanode
(Diagram: clients send metadata ops to the Namenode, which holds metadata (name, replicas, ...), e.g. /home/foo/data, 3, ...; the Namenode sends block ops to Datanodes 1 to 5, which hold blocks B1 to B4; clients read from and write to the Datanodes, blocks are replicated between Datanodes, and the Namenode responds that the operation was successful.)
Datanode is a multiple instance server. There can be N number of
Datanode servers
Datanode stores and maintains the data blocks
Datanode stores and retrieves the blocks when asked by the Namenode
It serves clients’ read and write requests and performs block creation, deletion and
replication on instruction from the Namenode
HDFS – Secondary Namenode
Secondary Namenode server is responsible for maintaining a copy of
Metadata in disk
(Diagram: the Secondary Namenode, alongside the Namenode and Datanodes 1 to 3, maintains the metadata in disk (edit log and fsimage) and performs checkpointing.)
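Checkpointing merges the edit log into the fsimage, producing a fresh snapshot of the namespace. Below is a minimal Python sketch of that idea. The in-memory formats are assumptions made for illustration (a dict of path to replication factor, and a list of operation tuples); they are not the real on-disk formats.

```python
def checkpoint(fsimage, edit_log):
    """Replay the edit log on top of the fsimage.

    Returns the merged fsimage and an empty edit log. The inputs are left
    untouched, so the old checkpoint stays valid until the new one is saved.
    """
    merged = dict(fsimage)
    for op, *args in edit_log:
        if op == "create":
            path, replicas = args
            merged[path] = replicas
        elif op == "delete":
            (path,) = args
            merged.pop(path, None)
        elif op == "rename":
            old_path, new_path = args
            merged[new_path] = merged.pop(old_path)
        else:
            raise ValueError(f"unknown edit log operation: {op}")
    return merged, []


fsimage = {"/home/foo/data": 3}
edit_log = [
    ("create", "/home/foo/tmp", 3),
    ("rename", "/home/foo/tmp", "/home/foo/logs"),
]
new_fsimage, new_log = checkpoint(fsimage, edit_log)
print(new_fsimage)  # {'/home/foo/data': 3, '/home/foo/logs': 3}
```

Without periodic checkpoints the edit log keeps growing, and replaying it at Namenode startup takes longer.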
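HDFS Architecture
(Diagram: clients send metadata ops to the Namenode, which holds metadata (name, replicas, ...), e.g. /home/foo/data, 3, ...; the Namenode sends block ops to the Datanodes in Rack 1 and Rack 2; clients read from and write to the Datanodes, and blocks are replicated between the racks.)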
Hadoop Cluster – Rack Based Architecture
(Diagram: a Hadoop cluster in which the core switches connect the rack switches of Rack 1, Rack 2, ... Rack N; each rack switch connects Node 1, Node 2, ... Node N of its rack.)
HDFS Read Mechanism
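(Diagram: on the client node, the HDFS client in the client JVM uses DistributedFileSystem and FSDataInputStream to reach the Namenode and the Datanodes.)
1. Open (HDFS client to DistributedFileSystem)
2. Get block locations (from the Namenode)
4. Read (from the Datanodes)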
(Diagram: the HDFS client in the client JVM on the client node, the Namenode, and DN 1 to DN 9 spread over Rack 1, Rack 2 and Rack 3; each rack has a rack switch connected to the core switch. Replicas of Block A and Block B are stored on several datanodes.)
The client sends the Namenode a request to read Block A and B
The Namenode sends the location of the blocks
(DN1 and DN2)
Block A and B are read from DN1 and
DN2 as they are closest to the client and use the
least network bandwidth
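Choosing the "closest" replica can be sketched in Python. The distance values follow the usual description of Hadoop's network topology: 0 for the same node, 2 for the same rack, 4 for a different rack. The client position and the replica locations below are assumptions made up for the example.

```python
def network_distance(a, b):
    """Distance between two (rack, node) locations.

    0 = same node, 2 = same rack, 4 = different racks (single data centre).
    """
    if a == b:
        return 0
    if a[0] == b[0]:
        return 2
    return 4


def closest_replica(client, replicas):
    """Return the replica location with the smallest distance to the client."""
    return min(replicas, key=lambda replica: network_distance(client, replica))


# Hypothetical layout: the client sits in Rack 1; Block A has three replicas.
client = ("Rack 1", "client node")
block_a_replicas = [("Rack 2", "DN 4"), ("Rack 1", "DN 1"), ("Rack 3", "DN 7")]
print(closest_replica(client, block_a_replicas))  # ('Rack 1', 'DN 1')
```

Reading from a replica in the client's own rack keeps the traffic on the rack switch and off the core switch.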
HDFS Write Mechanism
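(Diagram: on the client node, the HDFS client in the client JVM uses DistributedFileSystem and FSDataOutputStream to reach the Namenode and a pipeline of datanodes.)
1. Create (HDFS client to DistributedFileSystem)
2. Create (DistributedFileSystem to Namenode)
4.1, 4.2, 4.3 Write packet (along the pipeline of datanodes)
5.1, 5.2, 5.3 Acknowledgement (ack, back along the pipeline)
7. Complete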
(Diagram: the HDFS client in the client JVM on the client node, the Namenode, and DN 1 to DN 3 in Rack 1, DN 4 to DN 6 in Rack 2 and DN 7 to DN 9 in Rack 3; each rack has a rack switch connected to the core switch.)
The client sends the Namenode a request to write data on Block A
The Namenode sends the location of the Datanodes
(DN1, DN6, DN8)
The data to be written is stored on these Datanodes as Block A, Block A replica 1 and Block A replica 2
Each Datanode that stored Block A (DN 1, DN 6, DN 8) sends an acknowledgement (Ack)
Once the acks for DN 1, DN 6 and DN 8 are received, the write operation is successful
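The write-and-acknowledge flow can be sketched in Python. This is a simplified model, not the real client protocol. It assumes the pipeline order DN 1, DN 6, DN 8 and that acknowledgements travel back in reverse order, which the slide does not spell out. The function name `write_block` is hypothetical.

```python
def write_block(block, pipeline, storage):
    """Store a block on every datanode of the pipeline and collect acks.

    pipeline: datanode names in forwarding order (assumed order).
    storage: dict mapping datanode name -> list of stored blocks.
    Returns the acks in the order they arrive back at the client,
    i.e. from the last datanode of the pipeline to the first.
    """
    for datanode in pipeline:
        # Each datanode stores the data and forwards it to the next one.
        storage.setdefault(datanode, []).append(block)
    return list(reversed(pipeline))


storage = {}
pipeline = ["DN 1", "DN 6", "DN 8"]
acks = write_block("Block A", pipeline, storage)
print(acks)                        # ['DN 8', 'DN 6', 'DN 1']
print(set(acks) == set(pipeline))  # True: write operation successful
```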
Hadoop MapReduce
MapReduce is a framework that performs distributed and parallel processing of large
volumes of data
Map: reads and processes a data block; generates key-value pairs (k1, v1), (k2, v2), (k3, v3)
Shuffle and sort
Reduce: receives key-value pairs from map jobs; aggregates key-value pairs into smaller sets
(Diagram: Input Data, then three map() tasks, then Shuffle and Sort, then two reduce() tasks, then Output Data.)
MapReduce Job Execution
(Diagram: the stages of a job, with several parallel instances of each stage up to the Partitioner.)
Input data stored on HDFS
InputFormat
InputSplit
RecordReader: produces the input key-value pairs
Mapper: produces the intermediate key-value pairs
Combiner: produces the substitute intermediate key-value pairs
Partitioner
Shuffling and sorting
Reducer
OutputFormat
Output data stored on HDFS
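The Partitioner stage decides which Reducer receives each intermediate key. Hadoop's default partitioner takes the key's hash modulo the number of reduce tasks. The Python sketch below imitates that idea, using CRC32 as a stand-in hash because Python's built-in string hash changes between runs. The function name `partition` is hypothetical.

```python
import zlib


def partition(key, num_reducers):
    """Map a key to a reducer index in the range [0, num_reducers).

    CRC32 is used here only as a stable, illustrative hash function.
    """
    if num_reducers <= 0:
        raise ValueError("need at least one reducer")
    return zlib.crc32(key.encode("utf-8")) % num_reducers


for word in ["Big", "data", "comes", "data"]:
    print(word, "-> reducer", partition(word, 2))
```

Because the result depends only on the key, every pair with the key "data" reaches the same Reducer, which is what lets that Reducer compute the complete total for it.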
MapReduce Example
Input: Big data comes in various formats. This data can be stored in multiple data servers
Split 1: Big data comes in various formats
Split 2: This data can be stored in multiple data servers
Map (split 1): Big, 1 | data, 1 | comes, 1 | in, 1 | various, 1 | formats, 1
Map (split 2): This, 1 | data, 1 | can, 1 | be, 1 | stored, 1 | in, 1 | multiple, 1 | data, 1 | servers, 1
Shuffle: be, (1) | Big, (1) | can, (1) | comes, (1) | data, (1,1,1) | formats, (1) | in, (1,1) | multiple, (1) | servers, (1) | stored, (1) | This, (1) | various, (1)
Reduce: be, (1) | Big, (1) | can, (1) | comes, (1) | data, (3) | formats, (1) | in, (2) | multiple, (1) | servers, (1) | stored, (1) | This, (1) | various, (1)
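The same word count can be reproduced with a small in-memory Python sketch. It imitates the map, shuffle and reduce phases in a single process; it is not Hadoop code, and the function names are chosen for illustration.

```python
import re
from collections import defaultdict


def map_phase(split):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word, 1) for word in re.findall(r"[A-Za-z]+", split)]


def shuffle(pairs):
    """Group the values by key and sort the keys alphabetically."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items(), key=lambda item: item[0].lower()))


def reduce_phase(grouped):
    """Sum the list of ones for each word."""
    return {word: sum(values) for word, values in grouped.items()}


splits = [
    "Big data comes in various formats",
    "This data can be stored in multiple data servers",
]
pairs = [pair for split in splits for pair in map_phase(split)]
counts = reduce_phase(shuffle(pairs))
print(counts["data"], counts["in"], counts["Big"])  # 3 2 1
```

In a real job the map calls run in parallel on different nodes, and the shuffle moves data across the network. Here everything happens in one Python process.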
Hadoop YARN
YARN: Yet Another Resource Negotiator
Introduced in Hadoop 2.0
It is the middle layer between HDFS
and MapReduce
Manages cluster resources (memory,
network bandwidth, disk IO, CPU)
YARN Architecture
(Diagram: a client submits a job request to the Resource Manager; three Node Managers host containers and App Masters; arrows show job submission, node status, MapReduce status and resource requests.)
YARN Architecture – Resource Manager
Resource Manager manages the resource
allocation in the cluster
Resource Manager has 2 components:
Scheduler and Applications Manager
YARN Architecture – Scheduler
• Scheduler allocates resources to various
running applications
• Schedules resources based on the
requirements of the applications
• Does not monitor or track the status of the
applications
YARN Architecture – Applications Manager
• Applications Manager accepts job
submissions
• Monitors and restarts application masters
in case of failure
YARN Architecture – Node Manager
• Node Manager is a tracker that tracks the
jobs running
• Monitors each container’s resource
utilization
YARN Architecture – App Master
• Application Master manages resource
needs of individual applications
• Interacts with Scheduler to acquire
required resources
• Interacts with Node Manager to execute
and monitor tasks
YARN Architecture - Container
• Container is a collection of resources
like RAM, CPU, Network Bandwidth
• Provides rights to an application to use
specific amount of resources
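To make the idea of a container concrete, here is a hedged Python sketch of a scheduler granting a container on the first node with enough free memory and CPU. The `Node` class, the field names and the first-fit rule are all assumptions for illustration; YARN's own schedulers are more elaborate.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Free resources reported by one Node Manager (hypothetical model)."""
    name: str
    free_memory_mb: int
    free_vcores: int


def allocate_container(nodes, memory_mb, vcores):
    """Grant a container on the first node that can satisfy the request.

    Returns a dict describing the container, or None if no node has
    enough free memory and virtual cores.
    """
    for node in nodes:
        if node.free_memory_mb >= memory_mb and node.free_vcores >= vcores:
            node.free_memory_mb -= memory_mb  # reserve the resources
            node.free_vcores -= vcores
            return {"node": node.name, "memory_mb": memory_mb, "vcores": vcores}
    return None


nodes = [Node("node-manager-1", 2048, 2), Node("node-manager-2", 8192, 4)]
container = allocate_container(nodes, memory_mb=4096, vcores=2)
print(container["node"])  # node-manager-2
```

The returned dictionary plays the role of the "right to use a specific amount of resources" that a container represents.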
Use case – Word Count
using MapReduce
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS Tutorial | Simplilearn

More Related Content

What's hot

Introduction to HDFS
Introduction to HDFSIntroduction to HDFS
Introduction to HDFS
Bhavesh Padharia
 
Introduction to Pig
Introduction to PigIntroduction to Pig
Introduction to Pig
Prashanth Babu
 
Hadoop Architecture
Hadoop ArchitectureHadoop Architecture
Hadoop Architecture
Dr. C.V. Suresh Babu
 
Hadoop Tutorial For Beginners
Hadoop Tutorial For BeginnersHadoop Tutorial For Beginners
Hadoop Tutorial For Beginners
Dataflair Web Services Pvt Ltd
 
Hadoop File system (HDFS)
Hadoop File system (HDFS)Hadoop File system (HDFS)
Hadoop File system (HDFS)
Prashant Gupta
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
Rahul Jain
 
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
Simplilearn
 
PPT on Hadoop
PPT on HadoopPPT on Hadoop
PPT on Hadoop
Shubham Parmar
 
Big Data and Hadoop
Big Data and HadoopBig Data and Hadoop
Big Data and Hadoop
Flavio Vit
 
Apache Spark overview
Apache Spark overviewApache Spark overview
Apache Spark overview
DataArt
 
Hadoop and Big Data
Hadoop and Big DataHadoop and Big Data
Hadoop and Big Data
Harshdeep Kaur
 
Hadoop Tutorial For Beginners | Apache Hadoop Tutorial For Beginners | Hadoop...
Hadoop Tutorial For Beginners | Apache Hadoop Tutorial For Beginners | Hadoop...Hadoop Tutorial For Beginners | Apache Hadoop Tutorial For Beginners | Hadoop...
Hadoop Tutorial For Beginners | Apache Hadoop Tutorial For Beginners | Hadoop...
Simplilearn
 
Design of Hadoop Distributed File System
Design of Hadoop Distributed File SystemDesign of Hadoop Distributed File System
Design of Hadoop Distributed File System
Dr. C.V. Suresh Babu
 
Introduction to Map Reduce
Introduction to Map ReduceIntroduction to Map Reduce
Introduction to Map Reduce
Apache Apex
 
Big Data & Hadoop Tutorial
Big Data & Hadoop TutorialBig Data & Hadoop Tutorial
Big Data & Hadoop Tutorial
Edureka!
 
Hadoop technology
Hadoop technologyHadoop technology
Hadoop technology
tipanagiriharika
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
rebeccatho
 
Hadoop hive presentation
Hadoop hive presentationHadoop hive presentation
Hadoop hive presentation
Arvind Kumar
 
Hadoop
HadoopHadoop
Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...
Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...
Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...
Simplilearn
 

What's hot (20)

Introduction to HDFS
Introduction to HDFSIntroduction to HDFS
Introduction to HDFS
 
Introduction to Pig
Introduction to PigIntroduction to Pig
Introduction to Pig
 
Hadoop Architecture
Hadoop ArchitectureHadoop Architecture
Hadoop Architecture
 
Hadoop Tutorial For Beginners
Hadoop Tutorial For BeginnersHadoop Tutorial For Beginners
Hadoop Tutorial For Beginners
 
Hadoop File system (HDFS)
Hadoop File system (HDFS)Hadoop File system (HDFS)
Hadoop File system (HDFS)
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
 
PPT on Hadoop
PPT on HadoopPPT on Hadoop
PPT on Hadoop
 
Big Data and Hadoop
Big Data and HadoopBig Data and Hadoop
Big Data and Hadoop
 
Apache Spark overview
Apache Spark overviewApache Spark overview
Apache Spark overview
 
Hadoop and Big Data
Hadoop and Big DataHadoop and Big Data
Hadoop and Big Data
 
Hadoop Tutorial For Beginners | Apache Hadoop Tutorial For Beginners | Hadoop...
Hadoop Tutorial For Beginners | Apache Hadoop Tutorial For Beginners | Hadoop...Hadoop Tutorial For Beginners | Apache Hadoop Tutorial For Beginners | Hadoop...
Hadoop Tutorial For Beginners | Apache Hadoop Tutorial For Beginners | Hadoop...
 
Design of Hadoop Distributed File System
Design of Hadoop Distributed File SystemDesign of Hadoop Distributed File System
Design of Hadoop Distributed File System
 
Introduction to Map Reduce
Introduction to Map ReduceIntroduction to Map Reduce
Introduction to Map Reduce
 
Big Data & Hadoop Tutorial
Big Data & Hadoop TutorialBig Data & Hadoop Tutorial
Big Data & Hadoop Tutorial
 
Hadoop technology
Hadoop technologyHadoop technology
Hadoop technology
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
 
Hadoop hive presentation
Hadoop hive presentationHadoop hive presentation
Hadoop hive presentation
 
Hadoop
HadoopHadoop
Hadoop
 
Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...
Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...
Hadoop Ecosystem | Hadoop Ecosystem Tutorial | Hadoop Tutorial For Beginners ...
 

Similar to Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS Tutorial | Simplilearn

Apache Hadoop India Summit 2011 Keynote talk "HDFS Federation" by Sanjay Radia
Apache Hadoop India Summit 2011 Keynote talk "HDFS Federation" by Sanjay RadiaApache Hadoop India Summit 2011 Keynote talk "HDFS Federation" by Sanjay Radia
Apache Hadoop India Summit 2011 Keynote talk "HDFS Federation" by Sanjay Radia
Yahoo Developer Network
 
Understanding Hadoop
Understanding HadoopUnderstanding Hadoop
Understanding Hadoop
Mahendran Ponnusamy
 
Introduction to HDFS
Introduction to HDFSIntroduction to HDFS
Introduction to HDFS
Siddharth Mathur
 
Understanding Hadoop Clusters and the Network
Understanding Hadoop Clusters and the NetworkUnderstanding Hadoop Clusters and the Network
Understanding Hadoop Clusters and the Network
bradhedlund
 
Hadoop Interview Questions And Answers Part-1 | Big Data Interview Questions ...
Hadoop Interview Questions And Answers Part-1 | Big Data Interview Questions ...Hadoop Interview Questions And Answers Part-1 | Big Data Interview Questions ...
Hadoop Interview Questions And Answers Part-1 | Big Data Interview Questions ...
Simplilearn
 
HDFS.ppt
HDFS.pptHDFS.ppt
HDFS.ppt
ssuserec53e73
 
Lecture 2 part 1
Lecture 2 part 1Lecture 2 part 1
Lecture 2 part 1
Jazan University
 
HDFS Design Principles
HDFS Design PrinciplesHDFS Design Principles
HDFS Design Principles
Konstantin V. Shvachko
 
Hadoop architecture meetup
Hadoop architecture meetupHadoop architecture meetup
Hadoop architecture meetup
vmoorthy
 
HDFS.ppt
HDFS.pptHDFS.ppt
HDFS.ppt
ssuserec53e73
 
Hadoop admin
Hadoop adminHadoop admin
Hadoop admin
Balaji Rajan
 
Hadoop training in bangalore-kellytechnologies
Hadoop training in bangalore-kellytechnologiesHadoop training in bangalore-kellytechnologies
Hadoop training in bangalore-kellytechnologies
appaji intelhunt
 
Hadoop
HadoopHadoop
Hadoop
Ali Bahu
 
Fredrick Ishengoma - HDFS+- Erasure Coding Based Hadoop Distributed File System
Fredrick Ishengoma -  HDFS+- Erasure Coding Based Hadoop Distributed File SystemFredrick Ishengoma -  HDFS+- Erasure Coding Based Hadoop Distributed File System
Fredrick Ishengoma - HDFS+- Erasure Coding Based Hadoop Distributed File System
Fredrick Ishengoma
 
Hadoop and HDFS
Hadoop and HDFSHadoop and HDFS
Hadoop and HDFS
SatyaHadoop
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle Coherence
Ben Stopford
 
Practical Hadoop using Pig
Practical Hadoop using PigPractical Hadoop using Pig
Practical Hadoop using Pig
David Wellman
 
Hadoop training in hyderabad-kellytechnologies
Hadoop training in hyderabad-kellytechnologiesHadoop training in hyderabad-kellytechnologies
Hadoop training in hyderabad-kellytechnologies
Kelly Technologies
 
Introduction_to_HDFS sun.pptx
Introduction_to_HDFS sun.pptxIntroduction_to_HDFS sun.pptx
Introduction_to_HDFS sun.pptx
sunithachphd
 
3 HDFS basicsaaaaaaaaaaaaaaaaaaaaaaaa.ppt
3 HDFS basicsaaaaaaaaaaaaaaaaaaaaaaaa.ppt3 HDFS basicsaaaaaaaaaaaaaaaaaaaaaaaa.ppt
3 HDFS basicsaaaaaaaaaaaaaaaaaaaaaaaa.ppt
gamer129
 

Similar to Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS Tutorial | Simplilearn (20)

Apache Hadoop India Summit 2011 Keynote talk "HDFS Federation" by Sanjay Radia
Apache Hadoop India Summit 2011 Keynote talk "HDFS Federation" by Sanjay RadiaApache Hadoop India Summit 2011 Keynote talk "HDFS Federation" by Sanjay Radia
Apache Hadoop India Summit 2011 Keynote talk "HDFS Federation" by Sanjay Radia
 
Understanding Hadoop
Understanding HadoopUnderstanding Hadoop
Understanding Hadoop
 
Introduction to HDFS
Introduction to HDFSIntroduction to HDFS
Introduction to HDFS
 
Understanding Hadoop Clusters and the Network
Understanding Hadoop Clusters and the NetworkUnderstanding Hadoop Clusters and the Network
Understanding Hadoop Clusters and the Network
 
Hadoop Interview Questions And Answers Part-1 | Big Data Interview Questions ...
Hadoop Interview Questions And Answers Part-1 | Big Data Interview Questions ...Hadoop Interview Questions And Answers Part-1 | Big Data Interview Questions ...
Hadoop Interview Questions And Answers Part-1 | Big Data Interview Questions ...
 
HDFS.ppt
HDFS.pptHDFS.ppt
HDFS.ppt
 
Lecture 2 part 1
Lecture 2 part 1Lecture 2 part 1
Lecture 2 part 1
 
HDFS Design Principles
HDFS Design PrinciplesHDFS Design Principles
HDFS Design Principles
 
Hadoop architecture meetup
Hadoop architecture meetupHadoop architecture meetup
Hadoop architecture meetup
 
HDFS.ppt
HDFS.pptHDFS.ppt
HDFS.ppt
 
Hadoop admin
Hadoop adminHadoop admin
Hadoop admin
 
Hadoop training in bangalore-kellytechnologies
Hadoop training in bangalore-kellytechnologiesHadoop training in bangalore-kellytechnologies
Hadoop training in bangalore-kellytechnologies
 
Hadoop
HadoopHadoop
Hadoop
 
Fredrick Ishengoma - HDFS+- Erasure Coding Based Hadoop Distributed File System
Fredrick Ishengoma -  HDFS+- Erasure Coding Based Hadoop Distributed File SystemFredrick Ishengoma -  HDFS+- Erasure Coding Based Hadoop Distributed File System
Fredrick Ishengoma - HDFS+- Erasure Coding Based Hadoop Distributed File System
 
Hadoop and HDFS
Hadoop and HDFSHadoop and HDFS
Hadoop and HDFS
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle Coherence
 
Practical Hadoop using Pig
Practical Hadoop using PigPractical Hadoop using Pig
Practical Hadoop using Pig
 
Hadoop training in hyderabad-kellytechnologies
Hadoop training in hyderabad-kellytechnologiesHadoop training in hyderabad-kellytechnologies
Hadoop training in hyderabad-kellytechnologies
 
Introduction_to_HDFS sun.pptx
Introduction_to_HDFS sun.pptxIntroduction_to_HDFS sun.pptx
Introduction_to_HDFS sun.pptx
 
3 HDFS basicsaaaaaaaaaaaaaaaaaaaaaaaa.ppt
3 HDFS basicsaaaaaaaaaaaaaaaaaaaaaaaa.ppt3 HDFS basicsaaaaaaaaaaaaaaaaaaaaaaaa.ppt
3 HDFS basicsaaaaaaaaaaaaaaaaaaaaaaaa.ppt
 

More from Simplilearn

Block Cipher Modes Of Operation | Computer Networking and Security | Simplilearn
Block Cipher Modes Of Operation | Computer Networking and Security | SimplilearnBlock Cipher Modes Of Operation | Computer Networking and Security | Simplilearn
Block Cipher Modes Of Operation | Computer Networking and Security | Simplilearn
Simplilearn
 
What Is Default Gateway? | Default Gateway Explained In 9 Minutes | #Cybersec...
What Is Default Gateway? | Default Gateway Explained In 9 Minutes | #Cybersec...What Is Default Gateway? | Default Gateway Explained In 9 Minutes | #Cybersec...
What Is Default Gateway? | Default Gateway Explained In 9 Minutes | #Cybersec...
Simplilearn
 
How Much Do Ethical Hackers Make | Ethical Hacking Career|Simplilearn
How Much Do Ethical Hackers Make | Ethical Hacking Career|SimplilearnHow Much Do Ethical Hackers Make | Ethical Hacking Career|Simplilearn
How Much Do Ethical Hackers Make | Ethical Hacking Career|Simplilearn
Simplilearn
 
Is DevOps The Right Career Option To Choose In 2024? | Career Growth In DevOp...
Is DevOps The Right Career Option To Choose In 2024? | Career Growth In DevOp...Is DevOps The Right Career Option To Choose In 2024? | Career Growth In DevOp...
Is DevOps The Right Career Option To Choose In 2024? | Career Growth In DevOp...
Simplilearn
 
How To Pass PMP Exam | Everything About PMP Exam | PMP Certification | Simpli...
How To Pass PMP Exam | Everything About PMP Exam | PMP Certification | Simpli...How To Pass PMP Exam | Everything About PMP Exam | PMP Certification | Simpli...
How To Pass PMP Exam | Everything About PMP Exam | PMP Certification | Simpli...
Simplilearn
 
What Is Cloud Security? | Cloud Security Fundamentals | Cloud Computing Tutor...
What Is Cloud Security? | Cloud Security Fundamentals | Cloud Computing Tutor...What Is Cloud Security? | Cloud Security Fundamentals | Cloud Computing Tutor...
What Is Cloud Security? | Cloud Security Fundamentals | Cloud Computing Tutor...
Simplilearn
 
Java Spring Boot Roadmap | How To Master Spring Boot In 2024 | Spring Boot 20...
Java Spring Boot Roadmap | How To Master Spring Boot In 2024 | Spring Boot 20...Java Spring Boot Roadmap | How To Master Spring Boot In 2024 | Spring Boot 20...
Java Spring Boot Roadmap | How To Master Spring Boot In 2024 | Spring Boot 20...
Simplilearn
 
The WORST Beginner Cyber Security Mistakes Everyone Makes.pptx
The WORST Beginner Cyber Security Mistakes Everyone Makes.pptxThe WORST Beginner Cyber Security Mistakes Everyone Makes.pptx
The WORST Beginner Cyber Security Mistakes Everyone Makes.pptx
Simplilearn
 
Machine Learning Interview Questions 2024 | ML Interview Questions And Answer...
Machine Learning Interview Questions 2024 | ML Interview Questions And Answer...Machine Learning Interview Questions 2024 | ML Interview Questions And Answer...
Machine Learning Interview Questions 2024 | ML Interview Questions And Answer...
Simplilearn
 
Scrum Explained Under 20 Mins | What Is Scrum? | Scrum Master Training Tutori...
Scrum Explained Under 20 Mins | What Is Scrum? | Scrum Master Training Tutori...Scrum Explained Under 20 Mins | What Is Scrum? | Scrum Master Training Tutori...
Scrum Explained Under 20 Mins | What Is Scrum? | Scrum Master Training Tutori...
Simplilearn
 
7 Best Data Science Jobs 2024 | Data Science Jobs and Salary | Data Science C...
7 Best Data Science Jobs 2024 | Data Science Jobs and Salary | Data Science C...7 Best Data Science Jobs 2024 | Data Science Jobs and Salary | Data Science C...
7 Best Data Science Jobs 2024 | Data Science Jobs and Salary | Data Science C...
Simplilearn
 
How To Start Dropshipping In 2024 | What Is Dropshipping | Dropshipping For B...
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS Tutorial | Simplilearn

  • 1. What’s in it for you?
  • 2. How Big Data evolved? Before the internet, data volumes were small and data was often structured. This data was easily stored on a central server.
  • 3. How Big Data evolved? Then the internet boomed and data grew at a very high rate. A lot of semi-structured and unstructured data was being generated.
  • 4. How Big Data evolved? Then the internet boomed and data grew at a very high rate. Storing such huge volumes of data on a single server was not efficient.
  • 5. How Big Data evolved? Then the internet boomed and data grew at a very high rate. There was a need for distributed storage machines where data could be stored and processed in parallel.
  • 6. How Big Data evolved? Then the internet boomed and data grew at a very high rate. Data can be stored and processed on multiple machines.
  • 7. How Big Data evolved? Then the internet boomed and data grew at a very high rate. Solution (Big Data technologies): Hadoop is a framework that allows distributed storage and parallel processing of big data.
  • 8. What’s in it for you? What is Hadoop?, Components of Hadoop, What is HDFS?, HDFS Architecture, Hadoop MapReduce, Hadoop MapReduce Example, Hadoop YARN, Demo on MapReduce.
  • 9. What is Hadoop?
  • 10. What is Hadoop? Hadoop is a framework that lets you store large volumes of data across several node machines. It also helps process the data in parallel. (Diagram: 3 TB of data split into three 1 TB parts.)
  • 12. Components of Hadoop: storing data (HDFS), cluster resource management (YARN), data processing (MapReduce).
  • 14. What is HDFS? Hadoop Distributed File System (HDFS) is the storage layer of Hadoop; it stores data across multiple data servers. Data is divided into multiple blocks, and the blocks are stored over multiple nodes of the cluster.
  • 15. What is HDFS? Hadoop Distributed File System (HDFS) is the storage layer of Hadoop; it stores data across multiple data servers. It has 3 core components. Namenode: the master node, which holds metadata in RAM and on disk. Secondary Namenode: keeps a copy of the Namenode’s metadata on disk. Slave node (Datanode): contains the actual data in the form of blocks.
  • 17. HDFS Blocks: HDFS divides large data into blocks. Each block holds 128 MB of data by default. Suppose we have a 542 MB file: it is stored as four 128 MB blocks and one 30 MB block.
  • 18. Data Replication
  • 19. Data Replication in HDFS (diagram: Blocks A, B, C and D spread over datanodes DN 1 to DN 4 in Rack 1, DN 5 to DN 8 in Rack 2 and DN 9 to DN 12 in Rack 3). Do you understand what’s happening here? Each block of data is replicated three times on different datanodes in different racks.
  • 20. Data Replication in HDFS (same diagram). The initial copy of Block A is created in Rack 1, the initial copy of Block B in Rack 2, and the initial copies of Blocks C and D in Rack 3. Two identical blocks cannot be placed on the same datanode.
  • 21. Data Replication in HDFS (same diagram). The initial copy of Block A is created in Rack 1, the initial copy of Block B in Rack 2, and the initial copies of Blocks C and D in Rack 3. When the cluster is rack aware, the replicas of a block are never all placed on the same rack.
  • 22. Data Replication in HDFS (same diagram). Suppose datanode 7 crashes.
  • 23. Data Replication in HDFS (same diagram). Suppose datanode 7 crashes. We will still have 2 copies of Block C: one on DN 4 of Rack 1 and one on DN 9 of Rack 3.
  • 24. HDFS Architecture
  • 25. HDFS Architecture (diagram: the Namenode is the master, with a Secondary Namenode beside it; Datanodes 1 to N are the slaves). Metadata in RAM: Metadata (Name, replicas, …): /home/foo/data, 3, … with block locations DN1: B1, B2; DN2: B1, B3; DN3: B2, B3. Metadata on disk: edit log and fsimage.
  • 26. HDFS - Namenode: The Namenode is the master server. In a non-high-availability cluster there can be only one Namenode; in a high-availability cluster, 2 Namenodes are possible. (Diagram: File.txt stored as blocks B1 to B3 on Datanodes 1 to N; file system metadata kept in RAM and on disk as edit log and fsimage.)
  • 27. HDFS - Namenode: The Namenode holds metadata about the various Datanodes, their locations, the size of each block, etc.
  • 28. HDFS - Namenode: The Namenode executes file system namespace operations such as opening, closing and renaming files and directories.
  • 29. HDFS - Namenode: Datanodes report to the Namenode periodically: by default a heartbeat every 3 seconds and a full block report every 6 hours (both intervals are configurable).
  • 30. HDFS - Datanode: The Datanode is a multiple-instance server; there can be N Datanode servers. (Diagram: clients send metadata ops to the Namenode; blocks B1 to B4 are spread over Datanodes 1 to 5.)
  • 31. HDFS - Datanode: The Datanode stores and maintains the data blocks.
  • 32. HDFS - Datanode: The Datanode stores and retrieves blocks when asked by the Namenode (block ops).
  • 33. HDFS - Datanode: The Datanode serves clients’ read and write requests and performs block creation, deletion and replication on instruction from the Namenode. The Namenode then responds that the operation was successful.
  • 34. HDFS – Secondary Namenode: The Secondary Namenode server maintains a copy of the metadata (edit log and fsimage) on disk and performs checkpointing.
  • 35. HDFS Architecture (diagram: clients send metadata ops to the Namenode, which holds Metadata (Name, replicas, …): /home/foo/data, 3, …; clients read from and write to Datanodes in Rack 1 and Rack 2; the Namenode issues block ops; blocks are replicated between racks).
  • 36. Hadoop Cluster – Rack Based Architecture (diagram: core switches connect to one rack switch per rack; each rack, Rack 1 to Rack N, holds Node 1 to Node N).
  • 37. HDFS Read Mechanism
  • 38. HDFS Read Mechanism (diagram: the HDFS client runs in the client JVM on the client node). Labelled steps: 1. Open (DistributedFileSystem); 2. Get block locations (Namenode); 4. Read (Datanodes, through FSDataInputStream).
  • 39. HDFS Read Mechanism (diagram: DN 1 to DN 9 in Racks 1 to 3, each rack behind a rack switch, all joined by a core switch; Blocks A and B each have three replicas). The client sends the Namenode a request to read Blocks A and B. The Namenode sends back the locations of the blocks (DN1 and DN2).
  • 40. HDFS Read Mechanism (same diagram). The client sends the Namenode a request to read Blocks A and B. The Namenode sends back the locations of the blocks (DN1 and DN2). Blocks A and B are read from DN1 and DN2 because these datanodes are the closest and reading from them uses the least network bandwidth.
  • 41. HDFS Write Mechanism
  • 42. HDFS Write Mechanism (diagram: the HDFS client runs in the client JVM on the client node). Labelled steps: 1. Create (client to DistributedFileSystem); 2. Create (DistributedFileSystem to Namenode); 4.1 to 4.3 Write packet (from FSDataOutputStream down the pipeline of datanodes); 5.1 to 5.3 Acknowledgement (back up the pipeline); 7. Complete.
  • 43. HDFS Write Mechanism (diagram: DN 1 to DN 9 in Racks 1 to 3, each rack behind a rack switch, all joined by a core switch). The client sends the Namenode a request to write Block A. The Namenode sends back the locations of the Datanodes to write to (DN1, DN6, DN8).
  • 44. HDFS Write Mechanism (same diagram). The client sends the Namenode a request to write Block A. The Namenode sends back the locations of the Datanodes to write to (DN1, DN6, DN8). Block A is written, followed by Block A replica 1 and Block A replica 2.
  • 45. HDFS Write Mechanism (same diagram). Each datanode acknowledges the write (Ack DN 8, Ack DN 6, Ack DN 1), the combined acknowledgement for DN 1, DN 6 and DN 8 is returned, and the write operation is reported as successful.
  • 47. Hadoop MapReduce: MapReduce is a framework that performs distributed and parallel processing of large volumes of data. Map: reads and processes a data block and generates key-value pairs (k1, v1), (k2, v2), (k3, v3). Shuffle and sort. Reduce: receives key-value pairs from the map jobs and aggregates them into smaller sets.
  • 48. Hadoop MapReduce: MapReduce is a framework that performs distributed and parallel processing of large volumes of data. (Diagram: input data -> three map() tasks -> shuffle and sort -> two reduce() tasks -> output data.)
  • 49. MapReduce Job Execution: input data stored on HDFS -> Input Format -> InputSplits -> RecordReaders (input key-value pairs) -> Mappers (intermediate key-value pairs) -> Combiners (substitute intermediate key-value pairs) -> Partitioners -> shuffling and sorting -> Reducers -> Output Format -> output data stored on HDFS.
  • 50. MapReduce Example. Input: "Big data comes in various formats. This data can be stored in multiple data servers". Split: "Big data comes in various formats" and "This data can be stored in multiple data servers". Map: Big, 1; data, 1; comes, 1; in, 1; various, 1; formats, 1; This, 1; data, 1; can, 1; be, 1; stored, 1; in, 1; multiple, 1; data, 1; servers, 1. Shuffle: be, (1); Big, (1); can, (1); comes, (1); data, (1, 1, 1); formats, (1); in, (1, 1); multiple, (1); servers, (1); stored, (1); This, (1); various, (1). Reduce: be, (1); Big, (1); can, (1); comes, (1); data, (3); formats, (1); in, (2); multiple, (1); servers, (1); stored, (1); This, (1); various, (1).
  • 52. Hadoop YARN: YARN stands for Yet Another Resource Negotiator. It was introduced in Hadoop 2.0. It is the middle layer between HDFS and MapReduce. It manages cluster resources (memory, network bandwidth, disk I/O, CPU).
  • 53. YARN Architecture (diagram: a client submits a job request to the Resource Manager; Node Managers host containers and App Masters; arrows show job submission, node status, MapReduce status and resource requests).
  • 54. YARN Architecture – Resource Manager: The Resource Manager manages resource allocation in the cluster.
  • 55. YARN Architecture – Resource Manager: The Resource Manager has 2 components: the Scheduler and the Applications Manager.
  • 56. YARN Architecture – Scheduler: The Scheduler allocates resources to the various running applications. It schedules resources based on the requirements of the applications. It does not monitor or track the status of the applications.
  • 57. YARN Architecture – Applications Manager: The Applications Manager accepts job submissions. It monitors application masters and restarts them in case of failure.
  • 58. YARN Architecture – Node Manager: The Node Manager tracks the jobs running on its node. It monitors each container’s resource utilization.
  • 59. YARN Architecture – App Master: The Application Master manages the resource needs of an individual application. It interacts with the Scheduler to acquire the required resources. It interacts with the Node Manager to execute and monitor tasks.
  • 60. YARN Architecture – Container: A container is a collection of resources such as RAM, CPU and network bandwidth. It gives an application the right to use a specific amount of resources.
  • 61. Use case – Word Count using MapReduce
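
The word count use case can be sketched in a few lines of plain Python. This is only an in-memory illustration of the split, map, shuffle and reduce steps from the MapReduce example slide, not Hadoop code: the function names (split_input, map_phase, shuffle, reduce_phase, word_count) are made up for this sketch, and everything runs in one process instead of on a cluster.

```python
from collections import defaultdict

# Input sentence pair taken from the MapReduce example slide.
TEXT = ("Big data comes in various formats. "
        "This data can be stored in multiple data servers")


def split_input(text):
    """Input split: one record per sentence, as on the example slide."""
    return [part.strip() for part in text.split(".") if part.strip()]


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in the record."""
    return [(word, 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle and sort: group values by key, keys ordered case-insensitively."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items(), key=lambda item: item[0].lower()))


def reduce_phase(grouped):
    """Reduce: sum the list of ones collected for each word."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(text):
    """Run split -> map -> shuffle -> reduce sequentially in one process."""
    pairs = []
    for record in split_input(text):
        pairs.extend(map_phase(record))
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    for word, count in word_count(TEXT).items():
        print(word, count)
```

In a real cluster the map calls would run in parallel on the nodes holding the input blocks, and the shuffle would move data between nodes; here both are ordinary function calls.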

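Looking back at the HDFS slides, the block arithmetic (slide 17) and the replica placement rules (slides 19 to 23) can also be checked with a small sketch. It assumes the layout drawn on the slides, with four datanodes per rack (DN 1 to DN 4 in Rack 1, DN 5 to DN 8 in Rack 2, DN 9 to DN 12 in Rack 3), and that Block C sits on DN 4, DN 7 and DN 9. All function names are hypothetical helpers for illustration and none of them are part of the Hadoop API.

```python
BLOCK_SIZE_MB = 128  # default block size quoted on slide 17


def split_into_blocks(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Return the sizes (in MB) of the blocks a file of this size occupies."""
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    return [block_size_mb] * full_blocks + ([remainder] if remainder else [])


def rack_of(datanode, nodes_per_rack=4):
    """Assumed layout from the slides: DN 1-4 -> Rack 1, DN 5-8 -> Rack 2, ..."""
    return (datanode - 1) // nodes_per_rack + 1


def placement_is_valid(replica_nodes):
    """Check the two rules stated on the slides for one block's replicas.

    1. Two copies of the same block never share a datanode.
    2. In a rack-aware cluster, the replicas are not all on one rack.
    """
    distinct_nodes = len(set(replica_nodes)) == len(replica_nodes)
    racks_used = {rack_of(node) for node in replica_nodes}
    return distinct_nodes and len(racks_used) > 1


def surviving_replicas(replica_nodes, failed_node):
    """Replicas that remain readable after one datanode crashes."""
    return [node for node in replica_nodes if node != failed_node]


if __name__ == "__main__":
    print(split_into_blocks(542))               # block sizes for the 542 MB file
    block_c = [4, 7, 9]                         # assumed placement of Block C
    print(placement_is_valid(block_c))
    print(surviving_replicas(block_c, failed_node=7))
```

The last call mirrors slide 23: after datanode 7 crashes, the copies on DN 4 and DN 9 are still available.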