OSD Start Failures Due to Directories Mounted to Disks During Storage Pool Creation

Hi team, here's a new case.


Problem Symptom

When a storage pool is being created, directories may already be mounted to a disk or to partitions on the disk.

If the disk is in use, it cannot be formatted, and the OSD process on the node fails to start.


Problem Diagnosis

  1. When a storage pool is being created, the OSD processes on some nodes fail to start.

  2. Identify the nodes on which the OSD process failed to start, and check whether the storage pool creation task succeeded. The storage pool creation result can be queried 25 minutes after the task finishes.

    3_en-us_image_0186107023.png

  3. Check the FSA process logs in the FSA log directory. /var/log/oam/fsa/bak/dsware.xxxx-xx-xx_xx-xx-xx.tar.gz is a historical log, and /var/log/oam/fsa/run/dsware_agent.log is the current log. The logs show that disk formatting failed.

    3_en-us_image_0179692186.png

  4. View the agent_handle.log file in the /var/log/dsware directory. The error "Device or resource busy" is returned for the disk that failed to be formatted.

    3_en-us_image_0179692188.png

  5. Check the disk status. The record indicates that the disk has been mounted and is in use.

    3_en-us_image_0179692887.png

  6. Confirm that the disk is in use by other services and is therefore unavailable to the storage system.
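
The disk status check in step 5 can also be scripted. The following is a minimal sketch, not part of the product tooling: it assumes a Linux node where the content of /proc/mounts is available, and the function names find_mounts and is_disk_or_partition are hypothetical. It lists the mount points of a disk and its partitions. It does not detect other kinds of use, such as LVM or a process holding the raw device open.

```python
def is_disk_or_partition(device, disk):
    """Return True if `device` is `disk` itself or one of its partitions."""
    if device == disk:
        return True
    if not disk or not device.startswith(disk):
        return False
    suffix = device[len(disk):]
    if disk[-1].isdigit():
        # Disk names ending in a digit (e.g. /dev/nvme0n1) put a "p"
        # before the partition number: /dev/nvme0n1p1.
        return suffix.startswith("p") and suffix[1:].isdigit()
    # Otherwise the partition number follows directly: /dev/sdb1.
    return suffix.isdigit()


def find_mounts(disk, mounts_text):
    """Return (device, mount_point) pairs for `disk` and its partitions.

    `mounts_text` is the content of /proc/mounts: one mount per line,
    whitespace-separated, with the device first and the mount point second.
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        device, mount_point = fields[0], fields[1]
        if is_disk_or_partition(device, disk):
            hits.append((device, mount_point))
    return hits


if __name__ == "__main__":
    # /dev/sdb is only an example device name.
    with open("/proc/mounts") as f:
        for device, mount_point in find_mounts("/dev/sdb", f.read()):
            print(device, "is mounted on", mount_point)
```

An empty result means that no directory is mounted to the disk or its partitions. It does not prove that the disk is free.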


Causes

When directories are mounted to a disk or to its partitions, and the disk is therefore in use by other services or processes, the error "Device or resource busy" is returned during disk formatting. As a result, disk formatting fails and the OSD process on the node fails to start.
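
On Linux with glibc, "Device or resource busy" is the standard message for the EBUSY error code. As an illustration only, the sketch below shows two ways a script might recognize this condition: by searching log text for the message, or by checking the errno of a caught OSError. The helper names busy_lines and is_busy_error are hypothetical, and no particular log line format is assumed beyond the message text itself.

```python
import errno

# Message text quoted in the diagnosis steps above.
BUSY_TEXT = "Device or resource busy"


def busy_lines(log_text):
    """Return the lines of `log_text` that contain the busy error message."""
    return [line for line in log_text.splitlines() if BUSY_TEXT in line]


def is_busy_error(exc):
    """Return True if `exc` is an OSError carrying errno EBUSY."""
    return isinstance(exc, OSError) and exc.errno == errno.EBUSY
```

For example, busy_lines could be applied to the content of agent_handle.log to list the entries that report a busy disk.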



Solution

  1. Check whether the storage pool is created successfully. The storage pool creation result can be queried 25 minutes after the task is finished.

    If yes, go to 2. If no, go to 3.

  2. Confirm that the directories mounted to the disk or its partitions can be deleted, then delete them and delete the partitions. Log in to the FSM node and run the following command to add the node on which the OSD process failed to start back to the storage pool:

    ./dswareTool.sh --op restoreStorageNode -ip fsaIP -p poolid

    Example: ./dswareTool.sh --op restoreStorageNode -ip xxx.xxx.xxx.xxx -p 18, where xxx.xxx.xxx.xxx indicates the management IP address of the storage node to be restored.

    3_en-us_image_0186107388.png

  3. Either do not use the disk to create the storage pool, or delete the directories mounted to the disk or its partitions, delete the partitions, and then use the disk to create the storage pool again.
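
If the restore command in step 2 is issued from a script, validating its arguments first helps avoid typos. The following is a minimal sketch under stated assumptions: the command form is taken from step 2, the helper name build_restore_command is hypothetical, and the requirement that the pool ID be a non-negative integer is an assumption based on the example value 18. The function only builds the argument list and does not run anything.

```python
import ipaddress


def build_restore_command(fsa_ip, pool_id):
    """Build the argument list for the restoreStorageNode command.

    Raises ValueError if `fsa_ip` is not a valid IP address or if
    `pool_id` is not a non-negative integer (an assumed constraint).
    """
    ip = ipaddress.ip_address(fsa_ip)  # raises ValueError on bad input
    pool = int(pool_id)                # raises ValueError on non-numeric text
    if pool < 0:
        raise ValueError("pool_id must be non-negative")
    return [
        "./dswareTool.sh",
        "--op", "restoreStorageNode",
        "-ip", str(ip),
        "-p", str(pool),
    ]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address used as a placeholder.
    print(" ".join(build_restore_command("192.0.2.10", 18)))
```

The resulting list can be passed to subprocess.run on the FSM node, from the directory that contains dswareTool.sh.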


Check After Recovery

The storage pool status is normal.
