Hello everyone,
In this post I share a case that shows how to handle an abnormal GaussDB process on the standby OMS when the build operation keeps failing.
[Applicable Versions]
6.5.x
[Symptom]
On FusionInsight Manager, ALM-12002 HA Resource Abnormal is reported.
Checking the OMS status shows that the GaussDB process on the standby OMS is abnormal. Specifically, ResStatus stays at Repairing, which indicates that a build operation is in progress.

Log in to the standby OMS node as user ommdba and run the gs_ctl querybuild command. The command output shows that the build operation restarts from 0% before its progress reaches 100%.
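To make the restart pattern easier to spot, the progress can be sampled several times and checked for drops. The following is a minimal sketch and not part of the original procedure: count_build_restarts is a hypothetical helper that expects one integer percentage per line on standard input. Extracting that number from the gs_ctl querybuild output is deliberately left out, because the output format is not shown in this post.

```shell
# Hypothetical helper: read one progress percentage per line from stdin and
# print how many times the value dropped below the previous sample.
# A drop means the build started over before reaching 100%.
count_build_restarts() {
    awk '
        NF == 0 { next }                              # skip empty lines
        seen && ($1 + 0) < prev { restarts++ }        # progress went backwards
        { prev = $1 + 0; seen = 1 }
        END { print restarts + 0 }
    '
}

# Example (values typed in by hand after repeated querybuild runs):
#   printf '%s\n' 10 40 75 0 30 | count_build_restarts
```

A result greater than 0 matches the symptom described above: the build never finishes because it keeps starting over.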

[Impact and Severity]
There is no adverse impact on services.
[Estimated Restoration Time]
60 min
[Prerequisites]
Prepare the following items before the restoration.
| Item | Operation |
|---|---|
| Cluster account information | Apply for the password of cluster user admin. |
| Node account information | Apply for the passwords of users omm and ommdba of cluster nodes. |
| Secure Shell (SSH) remote login tool | Prepare tools such as PuTTY or SecureCRT. |
Except for the GaussDB process on the standby OMS, all other processes on the OMS nodes are normal.
[Fault Handling]
The build operation synchronizes data from the active GaussDB to the standby GaussDB and then loads the data. Because the synchronization times out, HA treats it as failed and starts it again, so the build never completes. To resolve this problem, synchronize the data manually.
[Solution]
Stop the standby OMS.
Log in to the standby OMS node as user omm and run the following command:
sh /opt/huawei/Bigdata/om-0.0.1/sbin/stop-oms.sh
Stop the active OMS.
Log in to the active OMS node as user omm and run the following command:
sh /opt/huawei/Bigdata/om-0.0.1/sbin/stop-oms.sh
Package data on the active OMS.
Log in to the active OMS node as user ommdba and run the following command in the /srv/BigData/dbdata_om directory:
tar -zcf db.tar.gz db/
This generates the db.tar.gz file.
Copy the db.tar.gz file to the /srv/BigData/dbdata_om directory on the standby OMS node.
Log in to the standby OMS node as user ommdba, go to the /srv/BigData/dbdata_om directory, and run the following commands:
mv db db_bak
tar -zxf db.tar.gz
On the standby node, ensure that the directories under /srv/BigData/dbdata_om/db (at every level) have permission 700, the files in them have permission 600, and the owner and group are ommdba:wheel.
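One possible way to apply and verify these permissions is sketched below. The function names fix_db_perms and list_bad_perms are hypothetical helpers, not part of the product. The sketch also sets the top-level db directory itself to 700, which is an assumption, and it relies on standard find and chmod behavior.

```shell
# Hypothetical helper: set directories to 700 and regular files to 600
# under the given directory (the directory itself is included).
fix_db_perms() {
    db_dir="$1"
    [ -d "$db_dir" ] || { echo "not a directory: $db_dir" >&2; return 1; }
    find "$db_dir" -type d -exec chmod 700 {} +
    find "$db_dir" -type f -exec chmod 600 {} +
}

# Hypothetical helper: print every directory or file whose mode still
# deviates from 700/600. Empty output means the modes are as expected.
list_bad_perms() {
    db_dir="$1"
    find "$db_dir" \( -type d ! -perm 700 \) -o \( -type f ! -perm 600 \)
}

# Example on the standby node:
#   fix_db_perms /srv/BigData/dbdata_om/db
#   list_bad_perms /srv/BigData/dbdata_om/db
# Ownership can be checked the same way (changing it may require root):
#   find /srv/BigData/dbdata_om/db ! -user ommdba -o ! -group wheel
```

Because the archive was created and extracted as ommdba, ownership is usually already correct, so the ownership check is shown only as a comment.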
Go to the /srv/BigData/dbdata_om/db directory and change the value of replconninfo1 in the postgresql.conf file.

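In the value, localhost is the IP address of the local node and remotehost is the IP address of the peer OMS node. Because the data was copied from the active node, both addresses need to be changed so that they are correct from the standby node's point of view.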
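The edit can also be scripted. The sketch below is only an illustration under several assumptions: set_repl_hosts is a hypothetical helper, the replconninfo1 line is assumed to hold space-separated key=value pairs that include localhost= and remotehost=, GNU sed is assumed for the -i option, and the IP addresses in the example are placeholders. Check the actual line in your postgresql.conf before relying on it.

```shell
# Hypothetical helper: back up the file, then rewrite the localhost= and
# remotehost= values on the active (uncommented) replconninfo1 line.
# Usage: set_repl_hosts <conf-file> <local-ip> <peer-ip>
set_repl_hosts() {
    conf="$1"; local_ip="$2"; remote_ip="$3"
    [ -f "$conf" ] || { echo "no such file: $conf" >&2; return 1; }
    cp -p "$conf" "$conf.bak" || return 1    # keep a copy of the original
    sed -i \
        -e "/^[[:space:]]*replconninfo1[[:space:]]*=/ s/localhost=[^ ']*/localhost=$local_ip/" \
        -e "/^[[:space:]]*replconninfo1[[:space:]]*=/ s/remotehost=[^ ']*/remotehost=$remote_ip/" \
        "$conf"
}

# Example on the standby node (placeholder addresses):
#   set_repl_hosts /srv/BigData/dbdata_om/db/postgresql.conf 192.0.2.12 192.0.2.11
#   grep '^replconninfo1' /srv/BigData/dbdata_om/db/postgresql.conf
```

Other keys on the same line are left untouched, and the backup file postgresql.conf.bak allows the change to be reverted.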
Start the active OMS.
Log in to the active OMS node as user omm and run the following command:
sh /opt/huawei/Bigdata/om-0.0.1/sbin/start-oms.sh
Start the standby OMS.
Log in to the standby OMS node as user omm and run the following command:
sh /opt/huawei/Bigdata/om-0.0.1/sbin/start-oms.sh

Note: Wait one minute after the active OMS has started before you start the standby OMS.
[Verification]
The alarm is cleared.
The status of the GaussDB processes on the active and standby OMS nodes is Normal.
I hope this case helps. Thank you!
