Planning HDFS Capacity

Latest reply: Aug 27, 2019 07:00:32

In HDFS, DataNodes store user files as blocks, and the NameNode generates file objects that map each file, directory, and block stored on the DataNodes.

File objects on the NameNode consume memory, and memory consumption increases linearly with the number of file objects. As more files and directories are stored on the DataNodes, the number of file objects on the NameNode grows and consumes more memory. Eventually the existing hardware can no longer meet service requirements, and the cluster becomes difficult to expand.

Planning the capacity of an HDFS that stores a large number of files means planning the capacity specifications of the NameNode and DataNodes, and setting parameters according to that capacity plan.

Capacity Specifications

•   NameNode capacity specifications

Each file object on the NameNode corresponds to a file, directory, or block on the DataNode.

A file uses at least one block. The default block size is 134217728 bytes (128 MB) and can be changed with the dfs.blocksize parameter. By default, a file no larger than 128 MB uses only one block; a file larger than 128 MB uses a number of blocks equal to the file size divided by 128 MB, rounded up. Directories do not occupy blocks.

Based on dfs.blocksize, the number of file objects on the NameNode is calculated as follows:

Table 1-1 Number of NameNode file objects

Size of a File                             | Number of File Objects
Less than 128 MB                           | 1 (File) + 1 (Block) = 2
Greater than 128 MB (for example, 128 GB)  | 1 (File) + 1024 (128 GB ÷ 128 MB = 1024 Blocks) = 1025
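The arithmetic in Table 1-1 can be sketched in Python as follows. This is only an illustrative helper, not part of HDFS: the function name file_objects is hypothetical, the default block size is the dfs.blocksize value quoted in this post, and the block count is rounded up so that a partial last block counts as one block.

```python
DEFAULT_BLOCK_SIZE = 134217728  # dfs.blocksize default quoted in the post (128 MB)


def file_objects(file_size_bytes: int, block_size: int = DEFAULT_BLOCK_SIZE) -> int:
    """Return the NameNode file objects for one file: 1 (file) + number of blocks."""
    # Integer ceiling division: a partial last block still counts as a block.
    blocks = (file_size_bytes + block_size - 1) // block_size
    # Following the post's statement that a file uses at least one block.
    blocks = max(1, blocks)
    return 1 + blocks


small = file_objects(10 * 1024**2)   # hypothetical 10 MB file
large = file_objects(128 * 1024**3)  # the 128 GB example from Table 1-1
print(small, large)  # prints: 2 1025
```

Passing a different block_size shows how changing dfs.blocksize shifts the object count for large files.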

 

The maximum number of file objects supported by the active and standby NameNodes is 300,000,000 (equivalent to 150,000,000 small files, each of which consists of one file and one block). The dfs.namenode.max.objects parameter specifies the maximum number of file objects that can be generated in the system. The default value is 0, which indicates that the number is not limited.
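A rough way to compare a planned workload with this limit is sketched below. The function names are hypothetical, and the 300,000,000 ceiling is simply the figure quoted in this post, so treat it as an assumption to adjust for your own cluster.

```python
MAX_FILE_OBJECTS = 300_000_000  # NameNode limit quoted in the post


def total_file_objects(num_files: int, num_dirs: int, num_blocks: int) -> int:
    """Each file, directory, and block maps to one file object on the NameNode."""
    return num_files + num_dirs + num_blocks


def remaining_file_objects(num_files: int, num_dirs: int, num_blocks: int,
                           limit: int = MAX_FILE_OBJECTS) -> int:
    """Return how many more file objects fit under the limit (negative if exceeded)."""
    return limit - total_file_objects(num_files, num_dirs, num_blocks)


# 150,000,000 small files, one block each, directories ignored for simplicity.
headroom = remaining_file_objects(150_000_000, 0, 150_000_000)
print(headroom)  # prints: 0
```

A negative result means the plan exceeds the stated limit and needs fewer or larger files.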

•   DataNode capacity specifications

In HDFS, blocks are stored on DataNodes as replicas. The default number of replicas is 3, which can be changed with the dfs.replication parameter.

With the default replication factor, the total number of block replicas stored on all DataNode role instances in the cluster is: Number of HDFS blocks × 3. The average number of block replicas stored on each DataNode role instance is: Number of HDFS blocks × 3 ÷ Number of DataNodes.
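The two formulas can be written out as a short Python sketch. The function names and the example cluster size are hypothetical; the default of 3 is the dfs.replication value quoted above.

```python
def total_block_replicas(num_blocks: int, replication: int = 3) -> int:
    """Total block replicas in the cluster: HDFS blocks x replication factor."""
    return num_blocks * replication


def average_replicas_per_datanode(num_blocks: int, num_datanodes: int,
                                  replication: int = 3) -> float:
    """Average block replicas held by each DataNode instance."""
    return total_block_replicas(num_blocks, replication) / num_datanodes


# Hypothetical cluster: 100,000,000 HDFS blocks spread over 100 DataNodes.
total = total_block_replicas(100_000_000)
average = average_replicas_per_datanode(100_000_000, 100)
print(total, average)  # prints: 300000000 3000000.0
```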

Table 1-2 DataNode specifications

Item                                                                                                                     | Specifications
Maximum number of block replicas supported by a DataNode instance                                                        | 5,000,000
Maximum number of block replicas supported by a disk of a DataNode instance                                              | 500,000
Minimum number of disks required when the number of block replicas supported by a DataNode instance reaches the maximum  | 10

 

Table 1-3 Number of DataNode nodes

Item                                                                                                                     | Specifications
Maximum number of block replicas supported by a DataNode instance                                                        | 5,000,000
Maximum number of block replicas supported by a disk of a DataNode instance                                              | 500,000
Minimum number of disks required when the number of block replicas supported by a DataNode instance reaches the maximum  | 10
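As a sizing sketch based on the DataNode figures above, the minimum number of DataNodes and disks can be estimated with ceiling division. The function names are hypothetical, and the calculation assumes replicas are spread evenly, which a real cluster only approximates.

```python
MAX_REPLICAS_PER_DATANODE = 5_000_000  # per DataNode instance, from the post
MAX_REPLICAS_PER_DISK = 500_000        # per disk of a DataNode instance, from the post


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def min_datanodes(total_replicas: int) -> int:
    """Fewest DataNode instances that keep each one within its replica limit."""
    return ceil_div(total_replicas, MAX_REPLICAS_PER_DATANODE)


def min_disks(replicas_on_datanode: int) -> int:
    """Fewest disks a DataNode needs to keep each disk within its replica limit."""
    return ceil_div(replicas_on_datanode, MAX_REPLICAS_PER_DISK)


print(min_datanodes(300_000_000))  # prints: 60
print(min_disks(5_000_000))        # prints: 10
```

The second result reproduces the 10-disk minimum listed for a DataNode at its maximum replica count.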

 

Viewing the HDFS Capacity Status

•   NameNode information

Log in to FusionInsight Manager and choose Services > HDFS > NameNode(Active). Click Overview and check the number of file objects, files, directories, or blocks in HDFS in Summary.

•   DataNode information

Log in to FusionInsight Manager and choose Services > HDFS > NameNode(Active). Click each DataNode and check the number of blocks of all DataNodes with alarms.
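Outside FusionInsight Manager, the same counts are usually exposed by the FSNamesystem bean on the standard Apache Hadoop NameNode /jmx endpoint. The sketch below is an assumption-laden alternative rather than the procedure described in this post: the function names, the address argument, and the sample payload values are hypothetical, and a secured cluster may require HTTPS or Kerberos authentication that this code does not handle.

```python
import json
from urllib.request import urlopen

FS_BEAN = "Hadoop:service=NameNode,name=FSNamesystem"


def summarize_fsnamesystem(jmx_payload: dict) -> dict:
    """Extract file object counts from a parsed NameNode /jmx response."""
    bean = next(b for b in jmx_payload["beans"] if b.get("name") == FS_BEAN)
    files_and_dirs = bean["FilesTotal"]  # files plus directories
    blocks = bean["BlocksTotal"]         # allocated blocks
    return {
        "files_and_dirs": files_and_dirs,
        "blocks": blocks,
        "file_objects": files_and_dirs + blocks,
    }


def fetch_summary(namenode_http_address: str) -> dict:
    """Query a NameNode over plain HTTP; the host:port argument is an assumption."""
    url = f"http://{namenode_http_address}/jmx?qry={FS_BEAN}"
    with urlopen(url, timeout=10) as resp:
        return summarize_fsnamesystem(json.load(resp))


# Hypothetical payload with made-up numbers, shaped like a /jmx response.
sample = {"beans": [{"name": FS_BEAN, "FilesTotal": 1200, "BlocksTotal": 900}]}
print(summarize_fsnamesystem(sample)["file_objects"])  # prints: 2100
```

fetch_summary is not called here because it needs a reachable NameNode; the parsing step is kept separate so it can be checked offline.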

 


This post was last edited by chz at 2018-06-21 03:40.

II
Created May 16, 2018 02:24:41

Useful information

sze_van
Created May 16, 2018 09:22:47

Thanks for sharing :)

jino94
Created May 17, 2018 05:58:38

Great info

user_2921761
Created Jun 27, 2018 01:40:44

good

little_fish
Admin Created Aug 27, 2019 07:00:32
