Abnormal Synchronization Between the Active and Standby VRM Databases


Symptom

A critical alarm is generated, indicating that the synchronization between the active and standby VRM databases is abnormal. This alarm is not automatically cleared.

Analysis

1. Run the service had query command to check the process status on the active and standby VRMs. In this case, all processes on the active VRM are normal, but the GaussDB process on the standby VRM is abnormal.

[Figure: 143734n3wmbf5nukwzjaw2.png]

The following figure shows the normal status of the VRM GaussDB process.

[Figure: 143741z743nt8vt70mrt98.png]

2. Run the service gaussdb query command on the standby VRM to check the database synchronization status. The postmaster process does not exist, which means that the GaussDB service has not started.

[Figure: 144215d3f83co3kw9638wf.png]

3. View the database logs. The latest log file, gaussdb-2019-**.log, is in /var/log/operationlog/gaussdb. It contains the following error information: invalid error record, xlog redo...

[Figure: 144223bgut5nncngctmlu1.png]

This error occurs because an abnormal power-off changed the content of the configuration file pg_dataxlogchk in the /opt/gaussdb/data/global directory on the standby VRM. Copy this configuration file from the active VRM to the standby VRM.
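The log check in step 3 can be sketched as a pair of small shell functions. This is a minimal sketch, not an official tool: the function names are hypothetical, and the search pattern is an assumption based on the error text quoted above.

```shell
#!/bin/sh
# Sketch: look for the xlog redo error in the newest GaussDB log file.
# The function names are hypothetical; the grep pattern is an assumption
# taken from the error text quoted in step 3.

# Print the most recently modified gaussdb-*.log file in the given directory.
latest_gaussdb_log() {
    ls -t "$1"/gaussdb-*.log 2>/dev/null | head -n 1
}

# Print matching lines with their line numbers.
# Returns non-zero if the error text is not found.
find_redo_error() {
    grep -n -i 'invalid error record' "$1"
}

# Example (run on the standby VRM):
#   log=$(latest_gaussdb_log /var/log/operationlog/gaussdb)
#   find_redo_error "$log"
```

If find_redo_error prints nothing, the standby database is probably failing for a different reason, and the log should be read in full.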

4. Check whether the PG_VERSION and postgresql.conf configuration files exist in the /opt/gaussdb/data directory on the standby VRM. If they do not exist, copy them from the active VRM. Also ensure that the content of PG_VERSION is the same as that on the active VRM. Then, run the su - postgres command on the standby VRM to switch to the database user, and run the gs_ctl build command to synchronize the data on the active VRM to the standby VRM. The following figure shows the normal execution result.

[Figure: 144236n1a0culqmcqd04ua.png]
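The file checks in step 4 can be scripted before the rebuild is attempted. The following is a minimal sketch: the function names and the /tmp/PG_VERSION.active path are hypothetical, and it assumes that a copy of the active VRM's PG_VERSION file has already been transferred to the standby VRM.

```shell
#!/bin/sh
# Sketch: pre-checks before running gs_ctl build on the standby VRM.
# Function names are hypothetical helpers, not part of GaussDB.

# Report which of the two files named in step 4 are missing from a data
# directory. Returns 0 only if both files exist.
check_required_files() {
    data_dir=$1
    missing=0
    for f in PG_VERSION postgresql.conf; do
        if [ ! -f "$data_dir/$f" ]; then
            echo "missing: $data_dir/$f"
            missing=1
        fi
    done
    return "$missing"
}

# Compare the standby PG_VERSION with a copy fetched from the active VRM.
# Returns 0 if the two files are byte-for-byte identical.
same_pg_version() {
    cmp -s "$1" "$2"
}

# Example (run on the standby VRM; /tmp/PG_VERSION.active is a hypothetical
# copy of the file from the active VRM):
#   check_required_files /opt/gaussdb/data &&
#   same_pg_version /opt/gaussdb/data/PG_VERSION /tmp/PG_VERSION.active &&
#   echo "pre-checks passed: run su - postgres, then gs_ctl build"
```

The sketch only reports problems. Copying the files and running gs_ctl build are left as manual steps so that nothing on the standby VRM is changed unintentionally.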

5. If an error (for example, could not connect to server) occurs when the gs_ctl build command is executed, check the gs_ctl-current file in /var/log/operationlog/gaussdb. In this case, the detailed error information is warning: could not create Ha listen socket for "192.168.40.5". The IP address in the error information is not the current IP address of the standby VRM but its initial IP address (in the cube scenario). Check the postgresql.conf file in the /opt/gaussdb/data directory. The IP address in localhost is still the initial IP address. Change it to the current IP address of the standby VRM and perform step 4 again. The GaussDB process is then restored.

[Figure: 144242gdqvrr9d78gawgxk.png]

The IP address in localhost should be the current IP address of the VRM.

[Figure: 144248fuggyrbdyl7ubkgb.png]
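The diagnosis in step 5 can be sketched as follows: extract the address from the warning in the gs_ctl-current log, then list the lines of postgresql.conf that still mention it. The function names are hypothetical, and the sketch assumes that the warning has the form quoted in step 5. It assumes nothing about the exact layout of the localhost entry and only searches for the address as a fixed string.

```shell
#!/bin/sh
# Sketch: find the stale IP address reported by gs_ctl build and locate it
# in postgresql.conf. Function names are hypothetical helpers.

# Extract the address from the last "could not create Ha listen socket"
# warning in the given log file.
stale_listen_ip() {
    grep -o 'could not create Ha listen socket for "[^"]*"' "$1" |
        tail -n 1 | sed 's/.*for "\(.*\)"/\1/'
}

# List the lines of a config file that contain the given address.
# -F treats the address as a fixed string, so the dots are not wildcards.
lines_with_ip() {
    grep -n -F -- "$2" "$1"
}

# Example (run on the standby VRM):
#   ip=$(stale_listen_ip /var/log/operationlog/gaussdb/gs_ctl-current)
#   lines_with_ip /opt/gaussdb/data/postgresql.conf "$ip"
```

Because the match is a substring match, an address such as 192.168.40.5 also matches 192.168.40.50, so review the listed lines before editing the file.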

Root Cause

The configuration file of the standby VRM database is abnormal due to abnormal power-off.

Solution

Generally, if the GaussDB service of the standby VRM is abnormal, you can rebuild the standby VRM to rectify the fault.

Run the gs_ctl build command on the standby VRM to synchronize data from the active VRM. If an error is reported, analyze the gs_ctl-current log file and view the detailed error information.

