Problem Symptom
During an upgrade of the FSM nodes, the upgrade progress stops advancing.
Problem Diagnosis
1. Check whether the problem is caused by abnormal communication in the DeployManager module: verify that the nodes can ping each other and that modules such as console and cloudm are running properly.
2. Check the console log /var/log/deploy/console. The log shows that the pipeline stopped at 18:32 on March 1, while the UpgradePerformServiceImpl : 498] [UPGRADE]AtomicTaskLog entries show that the upgrade has been executing for a long time.
3. Check the system log of the FSM node, /var/log/message. The log shows that time hopping (a sudden jump of the system clock) occurred on the node.
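The timestamp check in step 3 can be partly automated. The following is a minimal sketch, not a DeployManager tool: the function name find_time_jumps and the one-hour default threshold are assumptions made for illustration. It also assumes the log lines start with the traditional syslog prefix "Mon DD HH:MM:SS" and that the log does not span a leap day or a year boundary. A quiet period longer than the threshold is reported as a forward gap too, so every hit still needs manual review.

```shell
#!/bin/sh
# Sketch: report lines where consecutive syslog timestamps jump.
# Assumes "Mon DD HH:MM:SS" prefixes and a non-leap year.
find_time_jumps() {
    # $1: log file, $2: forward-gap threshold in seconds (default 3600)
    awk -v limit="${2:-3600}" '
    BEGIN {
        split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        split("0 31 59 90 120 151 181 212 243 273 304 334", offs, " ")
        for (i = 1; i <= 12; i++) days_before[names[i]] = offs[i]
    }
    {
        # Skip lines that do not start with a month name.
        if (!($1 in days_before)) next
        split($3, t, ":")
        # Seconds since the start of the year for this line.
        now = ((days_before[$1] + $2) * 24 + t[1]) * 3600 + t[2] * 60 + t[3]
        if (seen && now < prev)
            printf "backward jump at line %d: %s\n", NR, $0
        else if (seen && now - prev > limit)
            printf "forward gap at line %d: %s\n", NR, $0
        prev = now
        seen = 1
    }' "$1"
}
```

A call such as find_time_jumps /var/log/message 3600 prints one line per suspicious transition. Comparing the reported timestamps with the time at which the pipeline stopped in the console log helps confirm whether the two events are related.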
Causes
The time hopping on the FSM node disrupts the upgrade process, so the upgrade pipeline stops advancing.
Solution
Scheme 1:
Change the system time back to what it was before the time hopping occurred.
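As a rough illustration of Scheme 1, the sketch below wraps the standard GNU date -s and util-linux hwclock --systohc commands. The function name restore_system_time and the APPLY switch are hypothetical, and the target time must come from a trusted clock; it is not given in this case. By default the function only prints the commands so that they can be reviewed first. If a time synchronization service is running on the node, it may override a manual change.

```shell
#!/bin/sh
# Sketch: set the system clock to a known-good time (requires root to apply).
# Without APPLY=1 the commands are only printed, not executed.
restore_system_time() {
    target="$1"    # for example "YYYY-MM-DD HH:MM:SS", taken from a trusted clock
    if [ -z "$target" ]; then
        echo "usage: restore_system_time 'YYYY-MM-DD HH:MM:SS'" >&2
        return 1
    fi
    if [ "${APPLY:-0}" = "1" ]; then
        # Set the system clock, then copy it to the hardware clock.
        date -s "$target" && hwclock --systohc
    else
        echo "date -s \"$target\""
        echo "hwclock --systohc"
    fi
}
```

Running the function once without APPLY=1 shows exactly what would be executed, which is a cautious default on a node that is in the middle of an upgrade.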
Scheme 2:
Delete the celerybeat-schedule file: rm /var/log/servicetool/CloudAutoDeploy/sdk/celerybeat-schedule
Restart DeployManager: sh /opt/admin/servicetool/bin/servicetool.sh restart
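The two Scheme 2 steps above could be wrapped in a small helper, sketched here under stated assumptions. The function name reset_deploy_scheduler is hypothetical, and the two paths are the ones given above, used as defaults. As a deliberate deviation from the plain rm, the sketch moves the schedule file to a timestamped backup instead of deleting it, so that it can be inspected later; the effect on DeployManager is assumed to be the same because the original file name no longer exists.

```shell
#!/bin/sh
# Sketch: move the celerybeat-schedule file aside, then restart DeployManager.
reset_deploy_scheduler() {
    # Defaults are the paths from this case; override them for testing.
    schedule_file="${1:-/var/log/servicetool/CloudAutoDeploy/sdk/celerybeat-schedule}"
    restart_script="${2:-/opt/admin/servicetool/bin/servicetool.sh}"
    if [ -f "$schedule_file" ]; then
        # Assumption: a renamed backup is kept instead of deleting the file.
        backup="$schedule_file.bak.$(date +%Y%m%d%H%M%S)"
        mv "$schedule_file" "$backup" || return 1
        echo "moved $schedule_file to $backup"
    else
        echo "no schedule file at $schedule_file, nothing to remove"
    fi
    # Restart DeployManager with the command given in Scheme 2.
    sh "$restart_script" restart
}
```

Called without arguments, the helper uses the paths from this case. The backup file can be deleted once the upgrade has completed.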
Check After Recovery
After the time is reset or DeployManager is restarted, continue the upgrade.
Suggestion and Summary
When locating upgrade problems, consider system faults as well as code errors. Time hopping is rare, but it must not be overlooked.
Applicable Versions
All