Analysis of Controller Self-Healing Caused by File System Rename Exceptions

[Problem Description]

When NAS services are in use, the host performs an operation similar to move a/b/b a/ on the file system, which triggers controller self-healing. After the self-healing, the controller cannot be powered on.


[Symptom Description]

The controller performs self-healing and then cannot be powered on. NAS services of the file system are unavailable.


[Cause]

1. The host performs the move a/b/b a/ operation on the file system. The V3 storage system locks both the source and the destination (a/b/b and a/). If a file with the same name already exists at the destination, the system deletes that file and then moves the source file. Before the deletion, files a and b are each locked twice, which causes a deadlock. As a result, I/Os hang on storage controller B. (This example assumes that the file system is on controller B.)

2. After controller B performs self-healing, services are switched over to controller A. The host delivers the move command again, which hangs I/Os on controller A as well. While controller B powers on, it requires the image cache of controller A. In V3R2, the hangIO operation (suspend the I/Os newly delivered by the host and wait for the I/Os being executed to end) is performed on all file systems in a batch. Because I/Os are already hung on the file system, the I/Os being executed never end, so all file systems on controller A stay in the hangIO state and all NAS services are suspended. In addition, controller B cannot obtain the image of the peer controller and therefore fails to start.
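The double locking described in cause 1 can be illustrated with a minimal Python sketch. This is not the V3 storage system's implementation: LockTable, lock_in_order, lock_deduplicated, and the list of paths are hypothetical names chosen for illustration, and the assumption is simply that each path has one non-reentrant lock. A timeout stands in for the hang so that the example terminates.

```python
import threading


class LockTable:
    """Hypothetical table holding one non-reentrant lock per path."""

    def __init__(self):
        self._locks = {}

    def lock_for(self, path):
        return self._locks.setdefault(path, threading.Lock())


def lock_in_order(table, paths, timeout=0.1):
    """Acquire a lock for every listed path, duplicates included.

    Returns False if an acquisition times out, which in this
    single-threaded sketch means a lock was requested twice.
    """
    held = []
    try:
        for path in paths:
            lock = table.lock_for(path)
            if not lock.acquire(timeout=timeout):
                return False  # second request for a held lock never succeeds
            held.append(lock)
        return True
    finally:
        for lock in reversed(held):
            lock.release()


def lock_deduplicated(table, paths, timeout=0.1):
    """Lock each distinct path once, in a fixed (sorted) order."""
    return lock_in_order(table, sorted(set(paths)), timeout)


# Assumed lock requests for "move a/b/b a/": the source side and the
# delete-existing-destination side both ask for a and a/b.
needed = ["a", "a/b", "a/b/b", "a", "a/b"]

print(lock_in_order(LockTable(), needed))      # False: "a" is requested twice
print(lock_deduplicated(LockTable(), needed))  # True: each path locked once
```

In a real system the second request would block indefinitely rather than time out, which matches the hung I/Os described above. Deduplicating the lock set and acquiring it in a fixed order is one common way to avoid this class of deadlock; the source does not state how the fixed version resolves it.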


[Location Method]

Search the message log for the string Heal print (pal_counter) massage begin: and check whether the statistics for LOCK_SPC_DMNFOR_READ and LOCK_SPC_DMNFOR_WRITE have returned I/Os. If both do, the system has encountered this problem.
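The search above could be scripted roughly as follows. This is a sketch under stated assumptions: the layout of the message log is not given here, so the function only checks that both counter names appear somewhere after the marker line and does not interpret the I/O counts themselves. The function name matches_rename_deadlock is hypothetical, and the marker string is copied exactly as given above, including its spelling.

```python
# Marker and counter names copied verbatim from the location method.
MARKER = "Heal print (pal_counter) massage begin:"
COUNTERS = ("LOCK_SPC_DMNFOR_READ", "LOCK_SPC_DMNFOR_WRITE")


def matches_rename_deadlock(lines):
    """Return True if both counters appear on lines after the marker.

    `lines` is any iterable of strings, for example an open log file.
    Assumption: the counters are printed after the marker line; the
    actual log format may differ, so confirm the counts manually.
    """
    seen = set()
    after_marker = False
    for line in lines:
        if MARKER in line:
            after_marker = True
            continue
        if after_marker:
            for name in COUNTERS:
                if name in line:
                    seen.add(name)
    return seen == set(COUNTERS)
```

To use it, open the exported message log and pass the file object, for example: with open(path_to_log) as f, then call matches_rename_deadlock(f). A True result only narrows the search; the counter values still need to be read manually.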


[Emergency Measures]

Restart both controllers at the same time by running the rebootsys command in the minisystem.


[Solution]

Upgrade the device to V300R003C10SPC100 or a later version.


[Post-Recovery Check]

Check that NAS services of the storage system are running properly.


[Appendix]

NA

