In this post I will demonstrate what to do when a VCN raises the "CMU offline" alarm.
【Problem Description】:CMU offline alarms.
【Problem Analysis】:
1. According to the alarm details, 4 VCN servers whose IP addresses end in 37, 38, 42 and 43 repeatedly generated these alarms.
2. CMU is the module that manages clusters. Checking the cluster status on the OMU portal showed that the cluster had expired and the cluster member status was unknown.
3. Analysis of the CMU logs showed a zookeeper disconnection error.
【Root Cause】:A zookeeper disconnection error occurred, which caused the cluster exception.
【Solution Description】:
1. Checked that the time on the 4 VCN servers was synchronized.
2. Logged in to the 4 VCN servers and ran the command df -h to check disk usage: no partition was full.
3. Ran the command /home/ivstool/bin/service.sh restart cmu to restart the CMU process on all 4 VCN servers.
4. Ran the command ps -ef | grep zookeeper to find the PID of the zookeeper process on each of the 4 VCN servers.
5. Ran the command kill <PID>, using the PID found in step 4, to kill and restart the zookeeper process on each of the 4 VCN servers.
6. After the above operations, the cluster status shown on the OMU portal changed to normal and the alarms on eSight were cleared.
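
For anyone who has to repeat steps 2 to 5 on several servers, here is a rough shell sketch of the same sequence for one node. It is only an illustration under a few assumptions: the function names full_partitions, zk_pids and recover_node are made up for this example, df -P is used instead of df -h because its column layout is easier to parse, and the service.sh path is the one quoted in step 3. The time check from step 1 is not covered, and the sketch only kills the zookeeper process; it does not start it again itself.

```shell
#!/bin/sh
# Sketch of steps 2-5 for a single VCN node. All function names are hypothetical.

# Print mount points whose usage is at or above the given percentage.
# Reads `df -P` output on stdin (column 5 = capacity, column 6 = mount point).
full_partitions() {
    awk -v limit="$1" 'NR > 1 { use = $5; sub(/%/, "", use); if (use + 0 >= limit) print $6 }'
}

# Print the PIDs of zookeeper processes. Reads `ps -ef` output on stdin.
# The [z] bracket and the grep filter keep helper processes out of the result.
zk_pids() {
    awk '/[z]ookeeper/ && !/grep/ { print $2 }'
}

# Run the recovery sequence on the local node (not called automatically).
recover_node() {
    full=$(df -P | full_partitions 100)
    if [ -n "$full" ]; then
        echo "Full partition(s), free up space first: $full" >&2
        return 1
    fi
    # Step 3: restart the CMU process (path as given in the post).
    /home/ivstool/bin/service.sh restart cmu
    # Steps 4-5: find and kill the zookeeper process.
    for pid in $(ps -ef | zk_pids); do
        kill "$pid"
    done
}
```

To use it, source the file on each affected server and call recover_node, then check the cluster status on the OMU portal as in step 6. Keeping the parsing in small functions that read from stdin makes it possible to try them on saved df and ps output before touching a live node.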