When a Single Switch Is Powered Off, the Storage Pool I/O Access Is Abnormal, the Database Cluster Is Restarted, and Services Are Abnormal

Symptom:

After an IB switch is powered off, I/O access in the storage pool is abnormal, the database cluster restarts, and services are abnormal.


Diagnosis:

  1. Check whether the network is faulty or multiple nodes are faulty. If neither is the case, this section does not apply.

  2. Use an SSH client such as PuTTY to log in to a storage node (ssh User name@IP address). User name is the user name for logging in to the node, and IP address is the node's management IP address. Enter the password of that user when prompted.

  3. Run the following command to query the MDC node to which the faulty storage pool belongs:

    mdc_cmd.sh 165 -1

    If the command output contains the mapping between the storage pool ID and the storage IP address of the owning MDC node, log in to the MDC node.

  4. Run the following command to switch to the log directory of the MDC node:

    cd /var/log/dsware/plog/mdc/bak

    Run the following command to check whether logs generated around the time of the fault show that the storage pool status failed to be reported:

    zcat * | grep -a "connect zk less 60s, cann't return incorrect pool status"

    If matching logs are found, rectify the fault as described in Solution.
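
The log check in step 4 can be wrapped in a small helper that counts matching lines instead of printing them. This is only a sketch: the function name count_delayed_reports is hypothetical, the message text is copied verbatim from the command above, and the directory is passed in as an argument rather than assumed.

```shell
#!/bin/sh
# Message quoted verbatim from step 4, including its original spelling.
MSG="connect zk less 60s, cann't return incorrect pool status"

# count_delayed_reports DIR  (hypothetical helper name)
# Prints the number of lines in the compressed logs under DIR that contain MSG.
count_delayed_reports() {
    dir=$1
    # -a treats binary data as text, -F matches MSG as a fixed string,
    # -c prints a count of matching lines instead of the lines themselves.
    zcat -- "$dir"/* 2>/dev/null | grep -a -c -F -- "$MSG"
}
```

For example, count_delayed_reports /var/log/dsware/plog/mdc/bak prints a number greater than 0 when the message was logged. The count covers every archived log in the directory, so the timestamps of the matching lines still need to be compared with the time of the fault.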


Cause:

After the network fault is rectified, the MDC node cannot immediately obtain the accurate status of the storage pool, so it reports the storage pool status one minute later. Because upper-layer services depend on the storage pool status, they cannot complete the startup process in a timely manner.


Solution:

  1. Run the following command to check whether the storage pool status is normal (pool_id indicates the storage pool ID):

    mdc_cmd.sh 120 pool_id

    If pool_status in the command output does not contain STATUS = POOL HAS FAILURE PT, the storage pool status is normal.

  2. Restart the affected services and check whether the services are restored.
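
The status check in step 1 can be scripted before the services are restarted. The sketch below is an assumption-laden illustration: pool_is_normal is a hypothetical helper, it only looks for the text POOL HAS FAILURE PT anywhere in the output, and it makes no other assumption about what mdc_cmd.sh prints.

```shell
#!/bin/sh
# Failure marker taken from step 1; matching on this substring is an assumption.
FAILURE_MARKER="POOL HAS FAILURE PT"

# pool_is_normal  (hypothetical helper name)
# Reads the output of `mdc_cmd.sh 120 <pool_id>` on standard input.
# Returns 0 if the marker is absent, 1 if it is present,
# and 2 if there was no output at all (status unknown).
pool_is_normal() {
    output=$(cat)
    [ -n "$output" ] || return 2
    case $output in
        *"$FAILURE_MARKER"*) return 1 ;;
    esac
    return 0
}
```

A possible use is mdc_cmd.sh 120 "$pool_id" | pool_is_normal && echo "pool status is normal", where pool_id holds the storage pool ID. Empty output is reported separately so that a failed command is not mistaken for a healthy pool.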

Check After Recovery:

Check whether the upper-layer services are restored.
