HDFS Balancer

Latest reply: Jan 19, 2022 10:52:21

Hello, everyone!

This post introduces the HDFS balancer: how to run it, when it exits, and how it decides which blocks to move.

1. Usage

hdfs balancer
    [-policy <policy>]
    [-threshold <threshold>]
    [-exclude [-f <hosts-file> | <comma-separated list of hosts>]]
    [-include [-f <hosts-file> | <comma-separated list of hosts>]]
    [-source [-f <hosts-file> | <comma-separated list of hosts>]]
    [-blockpools <comma-separated list of blockpool ids>]
    [-idleiterations <idleiterations>]


2. Parameter dfs.balance.bandwidthPerSec

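This property sets the maximum bandwidth that each DataNode can use to move blocks to other DataNodes during balancing. The default value is 1 MB per second. A higher bandwidth lets the cluster reach the balanced state faster, but the balancer then competes more heavily with application processes. If an administrator changes the value of this property in the configuration file, the change takes effect the next time HDFS is restarted. In recent Hadoop releases the property is named dfs.datanode.balance.bandwidthPerSec (dfs.balance.bandwidthPerSec is the deprecated name), and the limit can also be changed at run time with hdfs dfsadmin -setBalancerBandwidth.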

3. The Balancer Automatically Exits When Any of the Following Five Conditions Is Satisfied

  • The cluster is balanced;

  • No block can be moved;

  • No block has been moved for a specified number of consecutive iterations (5 by default);

  • An IOException occurs while communicating with the namenode;

  • Another balancer is running.

4. Execution Process of Balancer

(1)  In each iteration, run the balancer for every NameService.

(2)  The NameServices are processed serially, and each NameService has its own Balancer object.


for (NameNodeConnector nnc : connectors) {
    // Each NameService gets its own Balancer object.
    final Balancer b = new Balancer(nnc, p, conf);
    // The block placement policy with node tags must be initialized after the balancer.
    final Result r = b.runOneIteration();
    r.print(iteration, System.out);
    // ... (rest of the loop body is omitted in this excerpt)
}

(3)  Calculate the amount of data that must be moved to balance the cluster.

(a) Calculate the average space utilization of the cluster using the following formula:

AvgUtilization = Total used space / Total capacity

(b) For each storage type on each DataNode (DN), calculate the following values:

  • Utilization: disk utilization

  • Capacity: disk capacity

  • Difference between the utilization and the average utilization: utilizationDiff = Utilization - AvgUtilization

  • Amount by which the absolute value of that difference exceeds the threshold:

    thresholdDiff = Math.abs(utilizationDiff) - threshold;

  • Maximum data volume to be migrated for each storage type of each DN:

    utilizationDiff ≥ 0: maxSize2Move = min(utilizationDiff x Capacity, MAX_SIZE_TO_MOVE);

    utilizationDiff < 0: maxSize2Move = min((-utilizationDiff) x Capacity, Remaining space capacity, MAX_SIZE_TO_MOVE);
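The formulas above can be sketched in a few lines of Java. This is only an illustration and not the Hadoop source: the class name BalancerMathSketch is made up, utilization and threshold are written as fractions (0.125 means 12.5%) to keep the arithmetic simple, and the 10 GB value for MAX_SIZE_TO_MOVE is an assumption.

```java
final class BalancerMathSketch {
    // Per-iteration cap for one storage group; 10 GB is an assumed value for illustration.
    static final long MAX_SIZE_TO_MOVE = 10L * 1024 * 1024 * 1024;

    // Signed gap between a storage group's utilization and the cluster average.
    static double utilizationDiff(double utilization, double avgUtilization) {
        return utilization - avgUtilization;
    }

    // Positive only when the gap is larger than the threshold.
    static double thresholdDiff(double utilizationDiff, double threshold) {
        return Math.abs(utilizationDiff) - threshold;
    }

    // Upper bound on the bytes this storage group may give up (diff >= 0) or accept (diff < 0).
    static long maxSize2Move(double utilizationDiff, long capacity, long remaining) {
        long bytes = (long) (Math.abs(utilizationDiff) * capacity);
        if (utilizationDiff < 0) {
            // A destination cannot accept more than its free space.
            bytes = Math.min(bytes, remaining);
        }
        return Math.min(bytes, MAX_SIZE_TO_MOVE);
    }

    public static void main(String[] args) {
        double diff = utilizationDiff(0.75, 0.5);
        System.out.println("thresholdDiff = " + thresholdDiff(diff, 0.125));
        System.out.println("maxSize2Move  = " + maxSize2Move(diff, 1000L, 250L));
    }
}
```

Note that the remaining-space limit applies only to storage groups below the average, because only those receive data.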

Add each DN (per storage type) to one of the following four sets, based on how the utilization of that storage type compares with the average utilization and the threshold. Here the upper limit is AvgUtilization + threshold and the lower limit is AvgUtilization - threshold.

  • overUtilized (higher than the upper limit). For each member: overLoadedBytes = overLoadedBytes + thresholdDiff x Capacity

  • aboveAvgUtilized (higher than the average value, but lower than the upper limit)

  • belowAvgUtilized (lower than the average value, but higher than the lower limit)

  • underUtilized (lower than the lower limit). For each member: underLoadedBytes = underLoadedBytes + thresholdDiff x Capacity

Each set records both the DN and the storage type. A DN with several storage types is therefore added once per storage type, and may appear in more than one set.

(c) Return the amount of data that must be moved to balance the cluster: max(overLoadedBytes, underLoadedBytes).
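As a rough sketch of the classification and of the returned value, assuming utilization and threshold are fractions and that a storage group exactly on a limit counts as inside the band (the post does not say how boundaries are handled). UtilizationClassifierSketch and its method names are hypothetical.

```java
final class UtilizationClassifierSketch {
    enum Category { OVER_UTILIZED, ABOVE_AVG_UTILIZED, BELOW_AVG_UTILIZED, UNDER_UTILIZED }

    // Picks the set for one (DN, storage type) pair.
    static Category classify(double utilization, double avgUtilization, double threshold) {
        double utilizationDiff = utilization - avgUtilization;
        double thresholdDiff = Math.abs(utilizationDiff) - threshold;
        if (utilizationDiff >= 0) {
            return thresholdDiff > 0 ? Category.OVER_UTILIZED : Category.ABOVE_AVG_UTILIZED;
        }
        return thresholdDiff > 0 ? Category.UNDER_UTILIZED : Category.BELOW_AVG_UTILIZED;
    }

    // Returns max(overLoadedBytes, underLoadedBytes) over all storage groups.
    static long bytesLeftToMove(double[] utilizations, long[] capacities,
                                double avgUtilization, double threshold) {
        long overLoadedBytes = 0;
        long underLoadedBytes = 0;
        for (int i = 0; i < utilizations.length; i++) {
            double utilizationDiff = utilizations[i] - avgUtilization;
            double thresholdDiff = Math.abs(utilizationDiff) - threshold;
            if (thresholdDiff <= 0) {
                continue; // inside the band: contributes nothing
            }
            long bytes = (long) (thresholdDiff * capacities[i]);
            if (utilizationDiff >= 0) {
                overLoadedBytes += bytes;
            } else {
                underLoadedBytes += bytes;
            }
        }
        return Math.max(overLoadedBytes, underLoadedBytes);
    }

    public static void main(String[] args) {
        double[] utilizations = {0.875, 0.5625, 0.4375, 0.25};
        long[] capacities = {1000L, 1000L, 1000L, 1000L};
        for (double u : utilizations) {
            System.out.println(u + " -> " + classify(u, 0.5, 0.125));
        }
        System.out.println("bytes left to move: "
                + bytesLeftToMove(utilizations, capacities, 0.5, 0.125));
    }
}
```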

(4)  Calculate the total data volume to be migrated in this iteration based on the capacity information of the DNs.

(a) Pair source DNs and destination DNs by rack. Pairs within the same rack are preferred; the remaining pairs are selected in no particular order.

(b) Pair source DNs and destination DNs based on their storage utilization, in the following order:

overUtilized -> underUtilized: from a DN above the upper limit to a DN below the lower limit.

overUtilized -> belowAvgUtilized: from a DN above the upper limit to a DN between the lower limit and the average value.

aboveAvgUtilized -> underUtilized: from a DN between the average value and the upper limit to a DN below the lower limit.

(c) Each storage type corresponds to a storage group, which records the maximum amount of data that can be migrated in one iteration (maxSize2Move) and the amount of data actually scheduled for migration (scheduledSize).

Data is scheduled between the source DNs and the destination DNs as follows. A source and a destination can be paired only if their storage types are the same:

for (each source DN src)
    for (each destination DN des)
        // Only pair src and des if their storage types are the same.
        size = min(src.maxSize2Move - src.scheduledSize, des.maxSize2Move - des.scheduledSize);
        src.scheduledSize += size;
        des.scheduledSize += size;
        dispatcher.add(src, des); // Add the pair to the dispatcher and prepare for migration.
        if (des.maxSize2Move - des.scheduledSize == 0)
            remove des from the destination DN set; // des is full.
        if (src.maxSize2Move - src.scheduledSize == 0)
            remove src from the source DN set; // src has nothing left to give.
            break; // Continue with the next src.

(d) Sum the scheduledSize of all source DNs in the dispatcher to obtain the total data volume to be migrated in this iteration.
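A runnable sketch of steps (c) and (d), under stated assumptions: StorageGroup here is a simplified stand-in with only the fields the post mentions, the dispatcher is reduced to a list of strings, and the total is accumulated pair by pair, which gives the same number as summing scheduledSize over the sources. None of these names come from the Hadoop code base.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class PairingSketch {
    // Simplified stand-in for one storage type on one DN.
    static final class StorageGroup {
        final String name;
        final String storageType;
        final long maxSize2Move;
        long scheduledSize;

        StorageGroup(String name, String storageType, long maxSize2Move) {
            this.name = name;
            this.storageType = storageType;
            this.maxSize2Move = maxSize2Move;
        }

        long available() {
            return maxSize2Move - scheduledSize;
        }
    }

    // Pairs sources with destinations and returns the bytes scheduled in this iteration.
    // Both lists must support Iterator.remove(), for example ArrayList.
    static long schedule(List<StorageGroup> sources, List<StorageGroup> destinations,
                         List<String> dispatcher) {
        long total = 0;
        for (Iterator<StorageGroup> s = sources.iterator(); s.hasNext();) {
            StorageGroup src = s.next();
            for (Iterator<StorageGroup> d = destinations.iterator(); d.hasNext();) {
                StorageGroup des = d.next();
                if (!src.storageType.equals(des.storageType)) {
                    continue; // never move data across storage types
                }
                long size = Math.min(src.available(), des.available());
                src.scheduledSize += size;
                des.scheduledSize += size;
                total += size;
                dispatcher.add(src.name + "->" + des.name + ":" + size);
                if (des.available() == 0) {
                    d.remove(); // destination is full
                }
                if (src.available() == 0) {
                    s.remove(); // source has nothing left to give
                    break;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<StorageGroup> sources = new ArrayList<>();
        sources.add(new StorageGroup("A", "DISK", 100));
        sources.add(new StorageGroup("B", "DISK", 50));
        List<StorageGroup> destinations = new ArrayList<>();
        destinations.add(new StorageGroup("X", "DISK", 120));
        destinations.add(new StorageGroup("Y", "SSD", 500));
        destinations.add(new StorageGroup("Z", "DISK", 200));
        List<String> dispatcher = new ArrayList<>();
        System.out.println(schedule(sources, destinations, dispatcher) + " bytes: " + dispatcher);
    }
}
```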

(5)  For each pair of source and destination DNs, a thread is started to migrate data blocks between them. If no data is moved in five consecutive iterations, the balancer exits.

(a) At most twenty threads are started per second; this rate is limited by BALANCER_NUM_RPC_PER_SEC.

(b) Select the transit DN (the proxy source from which the block is copied) in the following order:

* If the source and destination DNs are the same node, no transit DN is required.

* If node groups are supported, select a DN in the same node group.

* If a DN in the same rack exists, select that DN.

* Otherwise, select a DN that is not too busy.
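The selection order can be illustrated with the following sketch. It makes several assumptions that the post does not state: "same node group" and "same rack" are measured against the destination DN, the busy check is applied at every tier, and the Replica type and chooseProxy method are hypothetical names rather than Hadoop classes.

```java
import java.util.Arrays;
import java.util.List;

final class ProxyChooserSketch {
    // Hypothetical description of a DN: its node group, its rack, and whether it is busy.
    static final class Replica {
        final String dn;
        final String nodeGroup;
        final String rack;
        final boolean busy;

        Replica(String dn, String nodeGroup, String rack, boolean busy) {
            this.dn = dn;
            this.nodeGroup = nodeGroup;
            this.rack = rack;
            this.busy = busy;
        }
    }

    // Returns the DN to copy the block from, or null if every replica holder is busy.
    static String chooseProxy(Replica source, Replica target, List<Replica> replicas,
                              boolean nodeGroupAware) {
        if (source.dn.equals(target.dn)) {
            return source.dn; // same node: no transit DN needed
        }
        if (nodeGroupAware) {
            for (Replica r : replicas) {
                if (!r.busy && r.nodeGroup.equals(target.nodeGroup)) {
                    return r.dn;
                }
            }
        }
        for (Replica r : replicas) {
            if (!r.busy && r.rack.equals(target.rack)) {
                return r.dn;
            }
        }
        for (Replica r : replicas) {
            if (!r.busy) {
                return r.dn;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Replica a = new Replica("a", "ng1", "r1", false);
        Replica b = new Replica("b", "ng2", "r2", false);
        Replica target = new Replica("t", "ng9", "r2", false);
        System.out.println(chooseProxy(a, target, Arrays.asList(a, b), false));
    }
}
```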

(c) Check whether the block can be migrated (org.apache.hadoop.hdfs.server.balancer.Dispatcher#isGoodBlockCandidate).

* The block is not being migrated or is not in the migration queue.

* No copy of the block exists on the destination DN.

* Migrating the block does not reduce the number of racks where the block copy is located.
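The three checks can be written out as a small predicate. This is a sketch of the rules as the post lists them, not the body of Dispatcher#isGoodBlockCandidate; the parameter names and the DN-to-rack map are assumptions, and the source DN is assumed to hold a replica of the block.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class BlockCandidateSketch {
    // pendingBlocks: blocks already being moved or queued for moving.
    // replicaNodes:  DNs that currently hold a replica of the block (includes sourceDn).
    // rackOf:        hypothetical lookup from DN name to rack name.
    static boolean isGoodCandidate(String blockId, Set<String> pendingBlocks,
                                   Set<String> replicaNodes, Map<String, String> rackOf,
                                   String sourceDn, String targetDn) {
        if (pendingBlocks.contains(blockId)) {
            return false; // already being migrated or queued
        }
        if (replicaNodes.contains(targetDn)) {
            return false; // the destination already holds a replica
        }
        Set<String> racksBefore = new HashSet<>();
        Set<String> racksAfter = new HashSet<>();
        for (String dn : replicaNodes) {
            racksBefore.add(rackOf.get(dn));
            if (!dn.equals(sourceDn)) {
                racksAfter.add(rackOf.get(dn));
            }
        }
        racksAfter.add(rackOf.get(targetDn));
        // The move must not reduce the number of racks that hold the block.
        return racksAfter.size() >= racksBefore.size();
    }

    public static void main(String[] args) {
        Map<String, String> rackOf = new HashMap<>();
        rackOf.put("dn1", "r1");
        rackOf.put("dn2", "r2");
        rackOf.put("dn3", "r1");
        Set<String> replicaNodes = new HashSet<>();
        replicaNodes.add("dn1");
        replicaNodes.add("dn2");
        System.out.println(isGoodCandidate("blk_1", new HashSet<>(), replicaNodes,
                rackOf, "dn2", "dn3"));
    }
}
```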

(d) Select the block to be migrated and the transit DN, then deduct the block size from the amounts still scheduled:

src.scheduledSize -= blockSize

task.size -= blockSize

(e) Start block migration (org.apache.hadoop.hdfs.server.balancer.Dispatcher.PendingMove#dispatch).

* Create a socket connection to the destination DN, with an output stream (out) for sending the request and an input stream (in) for receiving messages.

* Use out to send the block transfer request (including the proxy source).

* Use in to receive the transfer result.

Note: The balancer follows these rules when migrating data:

1. Data cannot be migrated between different storage types.

2. The number of replicas of a block cannot be reduced by the migration.

3. The number of racks where the replicas of a block reside cannot be reduced by the migration.

That's all, thanks!

Admin Created Dec 18, 2019 07:40:06

HDFS Balancer
user_4396693 Created Oct 16, 2021 05:38:40 (0) (0)
Admin Created Dec 20, 2019 07:27:39

Execution Process of Balancer
user_4237671 Created Oct 16, 2021 04:53:58 (0) (0)
Created Oct 16, 2021 04:53:51

Admin Created Jan 14, 2022 01:34:08

Very detailed.
Created Jan 19, 2022 10:52:21

Good article! Thank you