Hi team, here's a new case.
Problem Symptom
A storage pool is faulty.
Problem Diagnosis
Check whether the faulty disks or nodes in the storage pool have redundant disks or nodes.
Check whether the storage pool fault is caused by an earlier fault that was never rectified.
Check whether the view contains a partition in which every pt_status field has a value other than ok.
If yes, this section is applicable.
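The partition check above can be sketched as a small filter. This is only an illustration: the real view layout is not shown in this case, so the input format here (one "partition_id pt_status" pair per line) and the function name all_bad_partitions are assumptions, as is treating the literal string ok as the healthy value.

```shell
#!/bin/sh
# Hypothetical helper: reads "partition_id pt_status" pairs on stdin
# (an assumed, simplified export of the view) and prints the IDs of
# partitions in which no pt_status value is "ok".
all_bad_partitions() {
    awk '
        NF < 2 { next }                  # skip blank or malformed lines
        { seen[$1] = 1 }                 # remember every partition ID
        $2 == "ok" { has_ok[$1] = 1 }    # partition has at least one ok copy
        END {
            for (p in seen)
                if (!(p in has_ok))
                    print p
        }
    ' | sort
}
```

For example, feeding it the lines "1 ok", "1 fault", "2 fault", "2 fault" prints 2, because partition 1 still has one ok copy and partition 2 has none.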
Causes
The last faulty node or disk cannot be recovered and services on it cannot be started.
The last faulty node or disk stores the latest data. Other OSD nodes need to synchronize data from the node or disk when they are recovered. If data synchronization fails, the OSD nodes cannot be recovered.
This problem is a design defect.
Solution
Determine the current product version (8.0.1 or 8.0.2) and the ID of the faulty storage pool. For 8.0.2, use the ID of the disk pool.
If the version is 8.0.2, run the mdc_cmd.sh 1842 pool_id command on a node in the control cluster to rectify the fault.
If the version is 8.0.1, run the mdc_cmd.sh 1840 pool_id command to rectify the fault.
Run the mdc_cmd.sh 120 pool_id command to check whether the storage pool status is recovered. If not, run the preceding recovery command again.
If the fault persists, contact technical support engineers.
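The version-dependent choice in the steps above can be sketched as follows. The opcodes 1842, 1840, and 120 are taken from this case; the function names and the dry-run behaviour (printing the command instead of running it) are assumptions made for illustration, chosen because the recovery command is high-risk.

```shell
#!/bin/sh
# Hypothetical helper: prints the emergency recovery command for a
# given product version and pool ID. It does NOT execute anything.
recovery_command() {
    version="$1"
    pool_id="$2"
    case "$version" in
        8.0.2) opcode=1842 ;;   # pool_id is the disk pool ID on 8.0.2
        8.0.1) opcode=1840 ;;
        *)
            echo "unsupported version: $version" >&2
            return 1
            ;;
    esac
    echo "mdc_cmd.sh $opcode $pool_id"
}

# Hypothetical helper: prints the status check command for a pool.
status_command() {
    echo "mdc_cmd.sh 120 $1"
}
```

With a made-up pool ID of 3, recovery_command 8.0.2 3 prints mdc_cmd.sh 1842 3, and status_command 3 prints mdc_cmd.sh 120 3. Review the printed command before running it on a node in the control cluster.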
Check After Recovery
The storage pool status becomes normal.
Suggestion and Summary
Locate the problem based on the current service model.
The emergency recovery command is a high-risk command.
After the command is executed, data in the storage pool may be lost. Therefore, exercise caution when running this command.
Applicable Versions
FusionStorage 8.0.1, OceanStor 100D 8.0.2