Automatic CloudA Upgrade Fails Due to a Node Communication Failure Before the Upgrade

Problem Symptom

Before an automatic CloudA upgrade, communication with one or more nodes is interrupted. As a result, the pre-upgrade Agent status check fails and the upgrade fails with the following messages:

[ERROR] agents status are not normal!

clouda upgrade result is :1

CloudA automatic upgrade log path: /var/log/deploy/scripts/package.log
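
To locate these messages without reading the whole log, a search along the following lines may help. This is a minimal sketch: the function name find_upgrade_errors is hypothetical, and it assumes the messages appear in the log with the same wording as quoted above.

```shell
# Hypothetical helper: print matching log lines with their line numbers.
# $1 is the log file; it defaults to the path given in this post.
find_upgrade_errors() {
    grep -n -F \
        -e 'agents status are not normal' \
        -e 'clouda upgrade result is' \
        "${1:-/var/log/deploy/scripts/package.log}"
}
```

Running find_upgrade_errors with no argument searches the default log path; grep returns a non-zero status if neither message is present.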


Problem Diagnosis

1. Log in to any FSM node and switch to user root.

2. Run the following commands to export the node information:

# touch /tmp/sql_result.txt

# exec_sql="/opt/fusionstorage/deploymanager/gaussdb/app/bin/gsql -p 7018 -d cmdb -W Huawei12#$ -c \\\"select * from HOST_INFO;\\\""

# eval "su - dmdbadmin -c \"${exec_sql}\"" >> /tmp/sql_result.txt

3. Run the following command to view the exported node information and obtain information about the node where the CloudA Agent is abnormal (the value of AGENT_STATUS is 3):

# vim /tmp/sql_result.txt

4. Check whether network communication between the faulty node and the FSM nodes is normal.
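
Steps 3 and 4 can also be scripted. The sketch below is an illustration under assumptions, not part of the product. It assumes gsql wrote its default aligned table (a header row with columns separated by |) to /tmp/sql_result.txt. The function names list_abnormal_agents and probe_nodes are hypothetical. Note also that ping only confirms basic IP reachability, which may not cover the ports the Agent actually uses.

```shell
# Hypothetical helper: print the header row plus every row whose AGENT_STATUS
# is not 1 (abnormal Agents show 3 according to step 3).
# Assumption: columns are separated by "|" and the header names the column.
list_abnormal_agents() {
    awk -F'|' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        col == 0 {
            # Still looking for the header row: find the AGENT_STATUS column.
            for (i = 1; i <= NF; i++)
                if (toupper(trim($i)) == "AGENT_STATUS") col = i
            if (col) print
            next
        }
        # Data rows only: the status field must be numeric and different from 1.
        NF >= col && trim($col) ~ /^[0-9]+$/ && trim($col) != "1" { print }
    ' "$1"
}

# Hypothetical helper: read one address per line on stdin and report whether
# each one answers. PROBE_CMD can be overridden; the default sends three
# ICMP echo requests.
probe_nodes() {
    while read -r addr; do
        [ -n "$addr" ] || continue
        if ${PROBE_CMD:-ping -c 3} "$addr" >/dev/null 2>&1 </dev/null; then
            echo "$addr reachable"
        else
            echo "$addr UNREACHABLE"
        fi
    done
}
```

For example, list_abnormal_agents /tmp/sql_result.txt prints only the rows that need attention, and echo 192.0.2.10 | probe_nodes (a placeholder address) reports whether that node answers from the FSM node where the command is run.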


Causes

Because communication with a node is interrupted, the Agent status of that node is abnormal before the automatic CloudA upgrade starts. As a result, CloudA on the node cannot be upgraded automatically.


Solution

  1. Rectify the fault on the node, for example by restoring its network communication with the FSM nodes.

  2. Log in to the active FSM node and run the following commands to check whether the Agent status of the faulty node is normal (the value of AGENT_STATUS is 1):

    # rm -rf /tmp/sql_result.txt

    # touch /tmp/sql_result.txt

    # exec_sql="/opt/fusionstorage/deploymanager/gaussdb/app/bin/gsql -p 7018 -d cmdb -W Huawei12#$ -c \\\"select * from HOST_INFO;\\\""

    # eval "su - dmdbadmin -c \"${exec_sql}\"" >> /tmp/sql_result.txt

    # vim /tmp/sql_result.txt

  3. After the node is restored, go to the DeployManager automatic upgrade directory and run sh upgrade.sh to upgrade CloudA again.
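
Steps 2 and 3 can be combined so that the upgrade is retried only once every Agent reports a normal status. This is a minimal sketch, assuming gsql writes its default aligned table (a header row with columns separated by |) and that /tmp/sql_result.txt was freshly exported as in step 2. The function name all_agents_normal is hypothetical.

```shell
# Hypothetical gate: succeed (exit status 0) only if the exported table has at
# least one data row and every row has AGENT_STATUS equal to 1.
all_agents_normal() {
    awk -F'|' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        col == 0 {
            # Find the AGENT_STATUS column in the header row.
            for (i = 1; i <= NF; i++)
                if (toupper(trim($i)) == "AGENT_STATUS") col = i
            next
        }
        NF >= col && trim($col) ~ /^[0-9]+$/ {
            rows++
            if (trim($col) != "1") bad++
        }
        # awk exits 0 when the condition holds, 1 otherwise.
        END { exit !(rows > 0 && bad == 0) }
    ' "$1"
}

# Usage, from the DeployManager automatic upgrade directory:
#   all_agents_normal /tmp/sql_result.txt && sh upgrade.sh
```

Because the export commands append to the file with >>, the rm and touch commands in step 2 matter here: stale rows from an earlier export would otherwise keep the gate closed.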


Check After Recovery

Log in to any FSM node and run the following commands to check the AGENT_STATUS of the previously faulty node. If the node has recovered, its AGENT_STATUS is 1.

# rm -rf /tmp/sql_result.txt

# touch /tmp/sql_result.txt

# exec_sql="/opt/fusionstorage/deploymanager/gaussdb/app/bin/gsql -p 7018 -d cmdb -W Huawei12#$ -c \\\"select * from HOST_INFO;\\\""

# eval "su - dmdbadmin -c \"${exec_sql}\"" >> /tmp/sql_result.txt

# vim /tmp/sql_result.txt


Suggestion and Summary

N/A


Applicable Versions

All

