
Hadoop.

Created: Aug 1, 2022 09:42:38 | Latest reply: Aug 1, 2022 11:26:33
  HiCoins as reward: 0 (problem unresolved)

What are the core components of Hadoop?


Featured Answers
sachandio
Created Aug 1, 2022 09:45:28

Hi Dear Friend

Apache Hadoop is an open-source software framework for distributed storage and distributed processing of extremely large data sets.

There are three core components of Hadoop:

1. MapReduce, for computational processing: MapReduce is the data processing layer of Hadoop. It is a software framework for writing applications that process the vast amounts of structured and unstructured data stored in the Hadoop Distributed File System (HDFS). It processes huge amounts of data in parallel by dividing the submitted job into a set of independent sub-tasks.
In Hadoop, MapReduce breaks the processing into two phases: Map and Reduce. Map is the first phase, where we put the complex logic, business rules, and costly code. Reduce is the second phase, where we do light-weight processing such as aggregation and summation.
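
As a minimal, hedged sketch of the two phases, consider the classic word count written against the Hadoop MapReduce Java API (org.apache.hadoop.mapreduce); the class names are illustrative, not from this answer, and in practice each class would live in its own file:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    // Map phase: read one line of input and emit a (word, 1) pair per word.
    class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            for (String token : line.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);
                }
            }
        }
    }

    // Reduce phase: light-weight aggregation, summing the counts for each word.
    class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text word, Iterable<IntWritable> counts, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable c : counts) {
                sum += c.get();
            }
            context.write(word, new IntWritable(sum));
        }
    }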

2. HDFS, for storage: HDFS is the acronym of Hadoop Distributed File System, whose basic purpose is storage. It also works on a master-slave pattern: the NameNode acts as the master and stores the metadata of the DataNodes, while the DataNodes act as slaves and store the actual data on their local disks in parallel.
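
To see the NameNode/DataNode split from a client's point of view, here is a hedged sketch using the HDFS Java API (org.apache.hadoop.fs.FileSystem); the NameNode URI and file path are placeholders, not values from this thread:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsReadWrite {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Placeholder address; point this at your own NameNode.
            FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:9000"), conf);
            Path path = new Path("/tmp/hello.txt");

            // The client asks the NameNode (master) where the blocks go;
            // the bytes themselves are streamed to the DataNodes (slaves).
            try (FSDataOutputStream out = fs.create(path, true)) {
                out.writeUTF("hello hdfs");
            }

            // Reads work the same way: metadata from the NameNode,
            // data from whichever DataNodes hold the blocks.
            try (FSDataInputStream in = fs.open(path)) {
                System.out.println(in.readUTF());
            }
            fs.close();
        }
    }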

3. YARN, for resource allocation: YARN is the processing framework in Hadoop. It provides resource management and allows multiple data processing engines, such as real-time streaming, data science, and batch processing, to handle data stored on a single platform.
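
For a concrete feel of what resource allocation means, below is an illustrative yarn-site.xml fragment that caps the memory a NodeManager offers and the largest container the scheduler may grant; the values are examples only, not recommendations:

    <configuration>
      <!-- Total memory (in MB) this NodeManager makes available to containers. -->
      <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>8192</value>
      </property>
      <!-- Largest single container (in MB) the scheduler may allocate. -->
      <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>4096</value>
      </property>
    </configuration>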

These are the three main components of Hadoop.
All Answers
faysalji (Moderator, Author)
Created Aug 1, 2022 11:26:33

There are three components of Hadoop:

1) Hadoop HDFS - the Hadoop Distributed File System (HDFS) is the storage unit.
HDFS is Hadoop's key storage system, and the bulk of the data is stored on it. It is mainly designed for storing massive datasets on commodity hardware.

2) Hadoop MapReduce - Hadoop MapReduce is the processing unit.
MapReduce is the layer of Hadoop responsible for data processing. It processes the structured and unstructured data already stored in HDFS, and it handles high volumes of data in parallel by splitting the work into independent tasks. There are two stages of processing: Map and Reduce. In simple terms, Map is the stage where data blocks are read and made available to the executors (computers/nodes/containers) for processing, and Reduce is the stage where all the processed data is collected and collated. A driver sketch that wires the two stages into a runnable job follows at the end of this answer.

3) Hadoop YARN - Yet Another Resource Negotiator (YARN) is the resource management unit.
YARN is the framework used for processing in Hadoop. It handles resource management and supports multiple data processing engines, such as real-time streaming, data science, and batch processing, on top of the same stored data.
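
Putting the stages together, here is a hedged driver sketch that wires a mapper and reducer into a job and submits it (to YARN, on a cluster); it reuses the illustrative WordCountMapper and WordCountReducer classes from the featured answer above, and the input/output paths are supplied on the command line:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCountDriver {
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCountDriver.class);
            job.setMapperClass(WordCountMapper.class);
            job.setReducerClass(WordCountReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            // HDFS input and output paths, taken from the command line.
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            // Submits the job (scheduled by YARN on a cluster) and waits for it.
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }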
