
Common Hadoop configuration parameter parsing – dfs.client.block.write.replace-datanode-on-failure.policy

Latest reply: Dec 23, 2021 09:45:22

Hello, everyone!

This post explains a common Hadoop configuration parameter: dfs.client.block.write.replace-datanode-on-failure.policy.

1. Official description


dfs.client.block.write.replace-datanode-on-failure.enable

Default Value: true

If there is a data node/network failure in the write pipeline, DFSClient will try to remove the failed data node from the pipeline and then continue writing with the remaining data nodes. As a result, the number of data nodes in the pipeline is decreased. The feature is to add new data nodes to the pipeline. This is a site-wide property to enable/disable the feature. When the cluster size is extremely small, e.g. 3 nodes or less, cluster administrators may want to set the policy to NEVER in the default configuration file or disable this feature. Otherwise, users may experience an unusually high rate of pipeline failures since it is impossible to find new data nodes for replacement. See also dfs.client.block.write.replace-datanode-on-failure.policy

dfs.client.block.write.replace-datanode-on-failure.policy

Default Value: DEFAULT

This property is used only if the value of dfs.client.block.write.replace-datanode-on-failure.enable is true.

    ALWAYS: Always add a data node when an existing data node is removed.

    NEVER: Never add a data node.

    DEFAULT: Let r be the replication number and n be the number of existing data nodes. Add a data node only if r is greater than or equal to 3 and either (1) floor(r/2) is greater than or equal to n; or (2) r is greater than n and the block is hflushed/appended.

2. Abnormal scenarios on the live network


As we all know, the HDFS write model involves a pipeline because HDFS writes three replicas by default. When the three replicas are written, a pipeline is established. That is, if data nodes A, B, and C are selected to store the three replicas, the client writes the data to A, A forwards it to B, and B forwards it to C.

The two parameters in the table are related to the pipeline writing process. When a client is writing through the pipeline and one of the three selected nodes fails, dfs.client.block.write.replace-datanode-on-failure.policy determines how the failure is handled.

The parameter dfs.client.block.write.replace-datanode-on-failure.policy takes effect only when dfs.client.block.write.replace-datanode-on-failure.enable is set to true. The dfs.client.block.write.replace-datanode-on-failure.policy parameter has three values:

    ALWAYS: when a data node that encounters an exception during pipeline writing is removed, a new data node is always added.

    NEVER: new data nodes are never added.

    DEFAULT: let r be the number of replicas and n be the number of data nodes remaining in the pipeline. A new data node is added only when r >= 3 and at least one of the following conditions is met: (1) floor(r/2) >= n; or (2) r > n and the block has been hflushed/appended. floor(x) indicates that x is rounded down.
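The three policy values can be sketched as a small decision function. This is a minimal Python illustration of the rule as stated in the official description, not the Hadoop source code; the function name should_add_datanode and its arguments are hypothetical.

```python
def should_add_datanode(policy, r, n, hflushed_or_appended=False):
    """Decide whether a replacement data node should be added to the pipeline.

    policy: "ALWAYS", "NEVER", or "DEFAULT"
    r: replication number of the file
    n: number of data nodes still in the pipeline
    hflushed_or_appended: True if the block has been hflushed or appended
    """
    if policy == "NEVER":
        return False
    if policy == "ALWAYS":
        return True
    if policy == "DEFAULT":
        # r >= 3 and either (1) floor(r/2) >= n or (2) r > n with hflush/append
        return r >= 3 and (r // 2 >= n or (r > n and hflushed_or_appended))
    raise ValueError(f"unknown policy: {policy}")


# Three replicas, one node failed, two nodes left in the pipeline.
print(should_add_datanode("DEFAULT", r=3, n=2))                             # False
print(should_add_datanode("DEFAULT", r=3, n=2, hflushed_or_appended=True))  # True
# Three replicas, two nodes failed, one node left: floor(3/2) = 1 >= 1.
print(should_add_datanode("DEFAULT", r=3, n=1))                             # True
```

With three replicas and a single failure, DEFAULT therefore asks for a replacement only when the block has been hflushed or appended.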

3. Usage instructions

If the cluster is small (for example, it contains three nodes or fewer), the cluster administrator may want to set the policy to NEVER or even disable the feature. Otherwise, pipeline write errors occur frequently because no new data node can be found to replace the failed one.
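Why a small cluster runs out of replacement nodes can be shown with a simplified Python sketch. It assumes that a replacement must be a node that is neither the failed node nor already in the pipeline; the function replacement_candidates and the node names are hypothetical and are not part of HDFS.

```python
def replacement_candidates(cluster_nodes, pipeline_nodes, failed_node):
    """Return the nodes that could join the pipeline as a replacement.

    A candidate is any cluster node that is not already in the pipeline
    and is not the node that just failed (simplifying assumption).
    """
    excluded = set(pipeline_nodes) | {failed_node}
    return sorted(set(cluster_nodes) - excluded)


# Three-node cluster, three replicas: every node is already in the pipeline.
small = replacement_candidates(["A", "B", "C"], ["A", "B", "C"], "B")
# Five-node cluster: D and E are free to join the pipeline.
large = replacement_candidates(["A", "B", "C", "D", "E"], ["A", "B", "C"], "B")

print(small)  # []
print(large)  # ['D', 'E']
```

In the three-node case the candidate list is empty, so a policy that insists on a replacement cannot be satisfied.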

2. Common Hadoop Configuration Parameter Parsing – dfs.blockreport.intervalMsec

Background information

This parameter came up during the verification of an open source trouble ticket. In the verification procedure, the following operation is performed:

Run the hdfs dfs -setrep 1 /path command on the client and check whether the number of copies of the files in the path decreases to 1 and whether the command hangs.

However, after the preceding command was executed on the client, the command output was displayed only after about one hour, and according to the test engineer, she had been waiting for six hours. The engineer then checked whether any parameters control the waiting time. Details about the two related parameters are as follows:


Details about another related parameter are as follows:


The explanation is rough. The two parameters are found neither in the configuration description file hdfs-default.xml on the official website for version 3.0 nor in the source code.


These parameters are mainly used in the following HDFS scenarios: restoration of damaged blocks and deletion of redundant copies.

dfs.blockreport.intervalMsec specifies the interval at which a data node sends a full block report to the NameNode. The default value is 6 hours. After the value of this parameter was changed to 1 minute, the redundant copies were deleted.

dfs.datanode.directoryscan.interval specifies the interval at which a data node compares the block information in memory with the blocks on disk and updates the in-memory information that is inconsistent with the disk. The default value is 6 hours.

After a data block is damaged, the data node does not detect the damage until it performs the disk check (the interval is determined by dfs.datanode.directoryscan.interval). The data block is not recovered until its information is reported to the NameNode (the interval is determined by dfs.blockreport.intervalMsec). Recovery measures are taken only when the NameNode receives the block information and finds that the block is damaged. As the background information shows, an actual cluster environment is more complex and the waiting time is uncertain. However, setting this parameter to a smaller value does shorten data block recovery time.
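As a rough illustration of the two waits described above, the following Python sketch adds them up. It assumes that the disk check and the block report happen one after the other, so the sum is only an upper-bound estimate and not a documented HDFS guarantee; the function name is hypothetical. In Hadoop, dfs.datanode.directoryscan.interval is given in seconds and dfs.blockreport.intervalMsec in milliseconds.

```python
def worst_case_detection_hours(directoryscan_interval_s, blockreport_interval_ms):
    """Estimate the longest wait before the NameNode learns of a damaged block.

    Assumption: the directory scan must finish first and the next full
    block report follows, so the two intervals are simply added.
    """
    scan_hours = directoryscan_interval_s / 3600
    report_hours = blockreport_interval_ms / 3_600_000
    return scan_hours + report_hours


# Both parameters at their 6-hour defaults.
print(worst_case_detection_hours(21_600, 21_600_000))  # 12.0
# Block report interval reduced to 1 minute, directory scan unchanged.
print(round(worst_case_detection_hours(21_600, 60_000), 2))  # 6.02
```

Under this assumption, lowering only the block report interval still leaves the directory scan interval as the dominant wait.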

3. Common Hadoop Configuration Parameter Parsing – dfs.heartbeat.interval

HDFS uses the master/slave architecture. The NameNode is the master, and the data nodes are the slaves. As the central hub, the NameNode manages all data nodes. Data nodes periodically report information about their data block storage to the NameNode. If a data node does not report within a specified period, the NameNode determines that the data node is down and no longer assigns read and write requests to it. The formula for calculating the period is as follows:

Timeout interval = 2 x value of dfs.namenode.heartbeat.recheck-interval + 10 x value of dfs.heartbeat.interval

In the FusionInsight cluster, the default value of dfs.namenode.heartbeat.recheck-interval is 300,000 ms, and the default value of dfs.heartbeat.interval is 10s.

That is, the timeout interval for determining that a data node is offline is 2 x 5 min + 10 x 10s = 10 min + 100s = 11 minutes and 40s.

If dfs.heartbeat.interval is not configured, the open source default value of 3s is used, and the timeout interval is 2 x 5 min + 10 x 3s. That is, the default timeout duration of a data node is 10 minutes and 30s.
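The timeout formula can be checked with a few lines of Python. This is only a worked calculation of the formula given above; the function names are hypothetical. dfs.namenode.heartbeat.recheck-interval is taken in milliseconds and dfs.heartbeat.interval in seconds.

```python
def datanode_timeout_seconds(recheck_interval_ms, heartbeat_interval_s):
    """Timeout = 2 x recheck interval + 10 x heartbeat interval, in seconds."""
    return 2 * recheck_interval_ms / 1000 + 10 * heartbeat_interval_s


def format_min_sec(seconds):
    """Render a number of seconds as minutes and seconds."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes} min {secs} s"


# FusionInsight defaults: 300,000 ms recheck interval, 10 s heartbeat interval.
print(format_min_sec(datanode_timeout_seconds(300_000, 10)))  # 11 min 40 s
# Open source default heartbeat interval of 3 s.
print(format_min_sec(datanode_timeout_seconds(300_000, 3)))   # 10 min 30 s
```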

That's all, thanks!


Admin Created Dec 23, 2021 09:45:22

Thanks for your sharing!