This post is about a database being Read-Only. For more information, please see below.
Applicable versions
FusionInsight V100R002C70*
FusionInsight V100R002C80*
Diagnostic
The cluster is switched to read-only mode. Run the following statement to check the transaction mode of the cluster:
show default_transaction_read_only;
If true is returned, the cluster is in read-only state.
Cause analysis
The cluster monitors disk space usage. If the disk space usage of a node reaches the threshold (90% by default), the cluster is set to read-only state to prevent disk space usage from reaching 100% and causing cluster service unavailability. In this case, find out the cause of the high disk space usage and free up some space.
l Cause 1: Too many Xlogs are archived.
Perform the following steps to locate the cause:
Check the disk usage of each node.
for IP in `source /opt/huawei/Bigdata/mppdb/.mppdbgs_profile&&cm_ctl view|grep datanodeListenIP|uniq|awk -F ":" '{print $2}'`;do ssh -q ${IP} "hostname|sed 's/$/',${IP}'<-----''&/g'|sed 's/^/&''\n\n----->''/g';echo -e '\n';df -h";done;
Check whether the disk usage exceeds 90%. If It is, the database is switched to read-only mode due to high disk usage.
Step 1. Check whether the archive mode is enabled for the cluster.
Run the show archive_mode ; statement. on is returned.
Step 2. Ensure that no archive command is configured for the cluster.
Run the show archive_command; statement. The result is empty.
Step 3. Check the total size of the Xlogs of the DN on the disk with the highest usage.
Go to the directory of the corresponding DN.
cd pg_xlog
du -sh ./*
Check the total size of files in the pg_xlog directory.
Step 4. Calculate the disk space used by the pg_xlog directory.
Assume that the data disk size is 1.7 TB, the size of pg_xlog in the DN path is 1.2 TB, and other files in this path are more than 0.3 TB. In this case, the usage of the data disk exceeds 90%.
----End
The archiving function is enabled, but no archive command is configured. As a result, logs in the pg_xlog directory of the primary DN accumulate, occupying much disk space, and the cluster switches to the read-only mode.
Perform the following steps to troubleshoot the problem:
Step 1. Disable the archive mode.
gs_guc set -Z coordinator -Z datanode -N all -I all -c 'archive_mode = off'
In V100R007C10, you can directly reload the cluster without restarting it.
Step 2. Run checkpoint ;.
checkpoint ;
After the command is successfully executed, the cluster immediately releases the space occupied by archived logs. Verify that the pg_xlog log space in the DN instance directory has been reclaimed.
Step 3. Manually disable the read-only mode of the cluster.
gs_guc reload -Z coordinator -Z datanode -N all -I all -c "default_transaction_read_only = false"
----End
l Cause 2: The disk is full due to data skew.
For details, see 6.2.6 Database Is Read Only.