Problem Information
Item | Description |
Storage type | Enterprise storage |
Product version | Dorado3000 V3; Dorado5000 V3; Dorado6000 V3; Dorado18000 V3 |
Fault type | Capacity expansion |
Key word | Power-on Failure; Cluster Power-off; Arrays of Different Versions |
Symptom
The cluster is powered off during controller expansion and cannot be powered on.
Alarm Information
N/A
Possible Causes
During controller expansion, the controllers in the system may be running different software versions. As a result, the cluster version cannot be determined when the cluster is powered on, and the power-on fails.
Fault Diagnosis
Log in to the primary controller of the array cluster and run the sys.sh showflowtrace 145 command to check whether version synchronization fails during the cluster power-on process. If the fault is not caused by a version synchronization failure, collect logs and contact Huawei technical support.

Collect the logs of the primary controller and check the omm_upd_server.log file. On the storage array, this file is saved in the /OSM/log/cur_debug/omm/ directory; in logs collected in one-click mode, it is in the log_controller_*.tar\msg_other.zip\Messages\omm\omm_upd_log.tgz\ directory. If the key log "Cluster poweon Sync has" or "More than one version in double controller" was generated at the power-on time, the power-on failure is caused by inconsistent cluster versions. Otherwise, collect logs and contact Huawei technical support.
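The log check above can be sketched as a small script. This is a minimal illustration, not a Huawei tool: it assumes that omm_upd_server.log has already been extracted to a local path chosen by the operator and that the file is plain text. The function names are hypothetical. The two key log strings are copied verbatim from the step above. The script only reports matching lines; comparing their timestamps with the actual power-on time is still a manual step.

```python
from pathlib import Path

# Key log strings quoted verbatim from the diagnosis step.
KEY_LOGS = (
    "Cluster poweon Sync has",
    "More than one version in double controller",
)


def find_version_sync_lines(lines):
    """Return (line_number, text) pairs for lines that contain a key log string."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(key in line for key in KEY_LOGS):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log_file(path):
    """Scan one extracted log file (hypothetical helper; the path is operator-supplied)."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return find_version_sync_lines(text.splitlines())
```

If the returned list is empty, or none of the matching lines falls at the power-on time, the fault is probably not caused by inconsistent versions, and the logs should be sent to Huawei technical support as described above.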
Solution
Log in to each controller and check the controller version. Check whether at least one controller in each engine of the storage array is of the version before controller expansion. If yes, disconnect the power cable to power off the storage array, remove the controllers that are not of the version before expansion, and then reconnect the power cable. If no, collect logs and contact Huawei technical support.

If the cluster is successfully powered on, re-insert the controllers removed in the previous step one at a time. Wait until the most recently inserted controller is running normally (which can be checked by running the showsysstatus command) before inserting the next one. For example, if controllers 0A and 1A were removed, insert controller 0A first, wait until controller 0A is powered on and running normally, and then insert controller 1A. If the cluster fails to be powered on, contact Huawei technical support.
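The decision in the first step of the solution (which controllers to remove, or whether to escalate) can be sketched as follows. This is an illustrative sketch under stated assumptions, not part of the product: the controller versions are assumed to have been read manually from each controller, the function name is hypothetical, and the engine is assumed to be identified by the controller ID without its final letter (for example, "0A" belongs to engine "0").

```python
def plan_controller_removal(versions, old_version):
    """Decide which controllers to remove before reconnecting power.

    versions: dict mapping controller ID (e.g. "0A") to its version string.
    old_version: the version the array ran before controller expansion.
    Returns a sorted list of controller IDs to remove, or None if some
    engine has no controller on old_version (contact technical support).
    """
    engines = {}
    for controller, version in versions.items():
        engine = controller[:-1]  # assumption: ID = engine number + slot letter
        engines.setdefault(engine, []).append((controller, version))

    to_remove = []
    for members in engines.values():
        # Every engine needs at least one controller on the pre-expansion version.
        if not any(version == old_version for _, version in members):
            return None
        to_remove.extend(c for c, version in members if version != old_version)
    return sorted(to_remove)
```

For example, with placeholder version labels, `plan_controller_removal({"0A": "new", "0B": "old", "1A": "new", "1B": "old"}, "old")` returns `["0A", "1A"]`, which matches the re-insertion example above. The returned order can also serve as the order in which the controllers are re-inserted one at a time.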
Check After Recovery
After restarting all nodes, check whether the same error code appears. If it does not, the fault is rectified and no further action is required. If it does, contact R&D engineers for assistance.
Applicable Versions
OceanStor Dorado V300R002C00 and later

