
A Cluster Is Powered Off During Controller Expansion and Cannot Be Powered On Because the Storage Array Has Controllers of Different Versions


Problem Information

Table 1 Problem information

Item             Description
Storage type     Enterprise storage
Product version  Dorado3000 V3; Dorado5000 V3; Dorado6000 V3; Dorado18000 V3
Fault type       Capacity expansion
Keywords         Power-on Failure; Cluster Power-off; Arrays of Different Versions


Symptom

The cluster is powered off during controller expansion and cannot be powered on.

Alarm Information

N/A

Possible Causes

During controller expansion, the storage system may temporarily contain controllers of different versions. If the cluster is powered off in this state, the cluster version cannot be determined at the next power-on, and the power-on fails.

Fault Diagnosis

  1. Log in to the primary controller of the array cluster and run the sys.sh showflowtrace 145 command to check whether version synchronization failed during cluster power-on. If the fault is not caused by a version synchronization failure, collect logs and contact Huawei technical support.

    1_en-us_image_0172205396.jpg

  2. Collect the logs of the primary controller and check the omm_upd_server.log file. On the storage array, this file is saved in the /OSM/log/cur_debug/omm/ directory. In logs collected in one-click mode, it is in the log_controller_*.tar\msg_other.zip\Messages\omm\omm_upd_log.tgz\ directory. If the key log "Cluster poweon Sync has" or "More than one version in double controller" was generated at the power-on time, the power-on failure is caused by inconsistent cluster versions. Otherwise, collect logs and contact Huawei technical support.
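The log check in step 2 can be sketched as a small offline script run against a copy of omm_upd_server.log. This is only a minimal sketch: the two marker strings are copied verbatim from the step above, while the function names are hypothetical and the layout of each log line (timestamp, level, and so on) is not assumed.

```python
from pathlib import Path

# Key log strings quoted in the diagnosis step (kept verbatim, including spelling).
VERSION_MISMATCH_MARKERS = (
    "Cluster poweon Sync has",
    "More than one version in double controller",
)


def find_version_mismatch_lines(lines):
    """Return (line_number, text) pairs for lines containing a key log string."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(marker in line for marker in VERSION_MISMATCH_MARKERS):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log_file(path):
    """Scan an extracted copy of omm_upd_server.log (hypothetical helper)."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with Path(path).open(encoding="utf-8", errors="replace") as handle:
        return find_version_mismatch_lines(handle)
```

Calling scan_log_file("omm_upd_server.log") on an extracted copy returns the matching lines. You still need to compare their timestamps with the power-on time by hand, because the script does not parse timestamps.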

Solution

  1. Log in to each controller and check its version. Check whether at least one controller in each engine of the storage array is still of the version used before controller expansion. If yes, disconnect the power cables to power off the storage array, remove all controllers that are not of the pre-expansion version, and then reconnect the power cables. If no, collect logs and contact Huawei technical support.

    1_en-us_image_0172205397.jpg

  2. If the cluster is successfully powered on, re-insert the controllers removed in the previous step one at a time. Wait until the previously inserted controller becomes normal (check this by running the showsysstatus command) before inserting the next one. For example, if controllers 0A and 1A were removed, insert controller 0A first, wait until it is powered on and runs normally, and then insert controller 1A. If the cluster fails to be powered on, contact Huawei technical support.
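The decision in step 1 can be written down as a short planning helper, which is useful when many engines are involved. This is a sketch under stated assumptions: the controller versions are collected by hand and entered as a dictionary, the version strings are placeholders, and the function name is hypothetical. It does not talk to the storage array.

```python
def plan_controller_removal(engines, pre_expansion_version):
    """Decide which controllers to remove before the power-on attempt.

    engines: dict mapping an engine ID to a dict of controller name -> version,
             filled in manually after logging in to each controller.
    Returns the list of controllers to remove, or None if some engine has no
    controller of the pre-expansion version (collect logs and contact support).
    """
    to_remove = []
    for engine_id, controllers in sorted(engines.items()):
        has_old = any(v == pre_expansion_version for v in controllers.values())
        if not has_old:
            # Precondition of step 1 is not met for this engine.
            return None
        to_remove.extend(
            sorted(name for name, v in controllers.items()
                   if v != pre_expansion_version)
        )
    return to_remove


# Placeholder versions "old" and "new" stand in for the real version strings.
engines = {
    "0": {"0A": "new", "0B": "old"},
    "1": {"1A": "new", "1B": "old"},
}
plan = plan_controller_removal(engines, "old")
```

With the placeholder data above, the plan lists controllers 0A and 1A, which matches the example in step 2. The same list also gives the re-insertion order, one controller at a time, with a showsysstatus check between insertions.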

Check After Recovery

After restarting all nodes, check whether the same error code appears. If no, the fault is rectified and no action is required. If yes, contact R&D engineers for assistance.

Applicable Versions

OceanStor Dorado V300R002C00 and later
