
How to replace SSD card

Created: Jun 27, 2019 03:01:39
Latest reply: Jun 27, 2019 22:22:50
  Rewarded HiCoins: 0 (problem resolved)

How do I replace an NVMe SSD, and what is the replacement procedure?

We have no experience doing this on FusionCube.

The product is FusionCube 6000, and the SSD card is an ES3000 V2.

Featured Answers
wissal
MVE Created Jun 27, 2019 22:22:50

Hello!

Below is the procedure for replacing an NVMe SSD:

Determine the status of the faulty device.

1. On the disk topology page of the storage pool, perform the required operation based on the status of the faulty device (hard disk or SSD device used as main storage).

To query the troubleshooting method, click the faulty device and then click Query Troubleshooting Method. In the displayed dialog box, click OK. Click the faulty device again, and the recommended troubleshooting method is displayed. Figure 2-4 shows the key operation steps.

Figure 2-4. Querying the troubleshooting method recommended by the system

  • If the device is not removed from the storage pool and "Repair" is displayed on the page, restore the device by performing the operations provided in the FusionStorage Block Storage Service Emergency Handling Guide. If the device is removed from the storage pool after the restoration, select the device and click Add to Storage Pool to add it to the storage pool again.

  • If the device is not removed from the storage pool and the ALM-51003 Faulty Storage Pool alarm is generated, restore the device by performing the operations provided in the FusionStorage Block Storage Service Emergency Handling Guide. If the device is removed from the storage pool after the restoration, select the device and click Add to Storage Pool to add it to the storage pool again.

  • If the device is not removed from the storage pool and "You need to forcibly replace the medium to rectify the fault" is displayed on the page, go to 2. In this case, the ignoreMediaFault value is false.

  • If the device is not removed from the storage pool and the ALM-51003 Faulty Storage Pool alarm is not generated, go to 2. In this case, the ignoreMediaFault value is true.

  • If the device has been removed from the storage pool and "Medium status error. Change the medium." is displayed on the page, go to 2.
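
The five cases above amount to a small lookup from the observed device state to the next action. The sketch below is only an illustration of that branching: the state labels (repair, alarm_51003, force_replace, no_alarm, removed_medium_error) and the function name are hypothetical shorthand invented here, not FusionStorage identifiers.

```shell
#!/bin/sh
# Map the observed state of the faulty device to the next action.
# The state labels are hypothetical shorthand for the five cases in the
# list above; they are not FusionStorage identifiers.
next_action() {
    case "$1" in
        repair|alarm_51003)
            # Device still in the pool; "Repair" shown or ALM-51003 raised.
            echo "emergency-handling-guide" ;;
        force_replace)
            # Page asks for a forced medium replacement.
            echo "replace ignoreMediaFault=false" ;;
        no_alarm)
            # Device still in the pool and ALM-51003 not raised.
            echo "replace ignoreMediaFault=true" ;;
        removed_medium_error)
            # Device already removed; "Medium status error" shown.
            echo "replace" ;;
        *)
            echo "unknown state: $1" >&2
            return 1 ;;
    esac
}

next_action force_replace
```

The point of writing it this way is that the ignoreMediaFault value is decided here, at diagnosis time, and then carried forward to the restore command in step 4.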

Replace the faulty device.

2. Perform the required operation based on whether the faulty device needs to be powered off.

NOTE:

V3 and V5 SSDs can coexist in the same storage pool but cannot coexist on the same server.


Table 2-1 shows the methods of powering off devices of different storage media.

Table 2-1. Methods of powering off devices

Device Medium Type | Disk Type Displayed on FusionStorage Block Self-Maintenance Platform | Power-Off Method
SAS disks, SATA disks, and SSDs | SAS Disk/SATA Disk/SSD Disk | No power-off required.
PCIe SSD cards (non-NVMe protocol) | SSD Card/NVMe SSD | Power off the server.
NVMe SSD cards | SSD Card/NVMe SSD | Power off the server.
NVMe SSDs | SSD Card/NVMe SSD | Perform a logical power-off.
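
Table 2-1 can be read as a lookup from medium type to power-off method. The following is a minimal sketch of that lookup; the short keys (sas, sata, ssd, pcie_card, nvme_card, nvme_ssd) and the function name are hypothetical labels chosen for this example.

```shell
#!/bin/sh
# Look up the power-off method from Table 2-1 for a device medium type.
# The short keys are hypothetical labels chosen for this sketch.
power_off_method() {
    case "$1" in
        sas|sata|ssd)        echo "none" ;;     # hot-swappable, no power-off
        pcie_card|nvme_card) echo "server" ;;   # power off the whole server
        nvme_ssd)            echo "logical" ;;  # logical power-off of the SSD
        *) echo "unknown medium type: $1" >&2; return 1 ;;
    esac
}

power_off_method nvme_ssd
```

Note that the platform displays the same disk type (SSD Card/NVMe SSD) for three different media, so the physical form factor, not the displayed type, decides the method.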

If the SSD card or SSD is to be replaced, take note of its electronic serial number (ESN) displayed on the FusionStorage Block Self-Maintenance Platform before replacement.

If disks added to a FusionStorage storage pool are required to form redundant array of independent disks (RAID) 0, perform operations provided in the server documentation. If disks in RAID 0 are hot-swapped, manually activate RAID 0 to add the disks to the system. Otherwise, the disks cannot be identified by the system. For details, see the server documentation.

  • If the faulty device can be replaced without a power-off, replace it directly.

  • If the faulty device is an NVMe SSD, you do not need to power off the server. Instead, logically power off the faulty NVMe SSD before replacing it:

    a. On FusionStorage Block Self-Maintenance Platform, choose Hardware > Disks.
    b. Locate the row that contains the NVMe SSD to be replaced and click Power Off, as shown in Figure 2-5.
       Figure 2-5. Logical power-off
    c. Click Yes.
       NOTE:
       If the logical power-off fails, power off the server and then replace the faulty device.
  • If the faulty device can be replaced only after the server is powered off, perform the required operation based on the deployment scenario of FusionStorage.
    NOTE:

    • If some of the SSD cards on a server that provides storage resources for FusionStorage become faulty and need to be replaced, place the server into maintenance mode before the replacement and remove it from maintenance mode after the replacement. For details, see Placing a Storage Node into Maintenance Mode and Removing a Storage Node from Maintenance Mode.
    • If all SSD cards on a server that provides storage resources for FusionStorage become faulty, you do not need to place the server into maintenance mode.
    • If a server is placed into maintenance mode, the timeout period after which the server is removed is extended by 45 minutes, which gives you 75 minutes to replace the faulty parts. If a server is removed from the storage pool for more than 75 minutes, the storage pool reconstructs data.

    • If FusionStorage is deployed in the FusionSphere system, take service protection measures and then replace the faulty server by performing the operations provided in Parts Replacement in the FusionSphere product documentation.
      For example, if the target server is deployed in converged mode, you need to migrate the VMs on this server to other hosts before powering off the server.
    • If FusionStorage is deployed in other systems, take measures to protect running services, power off the faulty server, and replace it with a new one.
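
The timing in the note above can be worked through as simple arithmetic. The sketch below uses the two figures the note gives (a 45-minute extension and a 75-minute window in maintenance mode); the 30-minute base timeout is not stated in the guide and is only inferred here as 75 - 45. The function name is hypothetical.

```shell
#!/bin/sh
# Minutes of the replacement window that remain before the storage pool
# starts reconstructing data. 75 and 45 come from the note above; the
# 30-minute base is only inferred from them (75 - 45), not documented.
EXTENSION_MIN=45
WINDOW_WITH_MAINTENANCE_MIN=75
BASE_TIMEOUT_MIN=$((WINDOW_WITH_MAINTENANCE_MIN - EXTENSION_MIN))

# Usage: minutes_left <elapsed minutes> <maintenance mode: yes|no>
minutes_left() {
    if [ "$2" = "yes" ]; then
        limit=$WINDOW_WITH_MAINTENANCE_MIN
    else
        limit=$BASE_TIMEOUT_MIN
    fi
    left=$((limit - $1))
    # Clamp at zero: a negative value means the window has already passed.
    if [ "$left" -lt 0 ]; then
        left=0
    fi
    echo "$left"
}

minutes_left 50 yes
```

For a server that has been offline for 50 minutes in maintenance mode, this leaves 25 minutes; under the inferred base timeout, the same 50 minutes without maintenance mode would already have triggered reconstruction.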

Determine the status of the hardware device.
3. On the disk topology page of the storage pool, perform the required operation based on the status of the faulty device (hard disk or SSD device used as main storage).
  • If the replaced device is not an NVMe device, has been automatically added to the storage pool, and runs properly, no further action is required.
  • If the replaced device is an NVMe device, has been automatically added to the storage pool, and runs properly, go to 5.
  • If the faulty device has been removed from the storage pool, perform the operation based on the faulty device type. After the operation is complete, if the device runs properly, no further action is required. If the replaced device is an NVMe device and hardware DIF was enabled before the fault occurred, you need to enable hardware DIF after the device is replaced but before it is added to the storage pool. For details, see the FusionStorage Block Storage Service Hardware DIF Configuration Guide.
    • If PCIe or NVMe SSDs are faulty, select the faulty device, click Add to Storage Pool, and select the new device as prompted. As shown in Figure 2-6, one icon indicates the faulty device and another indicates the new device. Select the indicated device to perform operations.
      Figure 2-6. Disk topology page
    • If other disks are faulty, select the new device, click Add to Storage Pool, and add the device to the storage pool as prompted.
  • If the faulty device has not been removed from the storage pool, go to 4.

Restore storage resources.
4. In the CLI of the active FusionStorage Manager (FSM) node, which you have logged in to as user dsware, run the command that matches the main storage type to restore storage resources.
    NOTE:
    Because the system has been hardened, you must enter a username and password for authentication after running the dswareTool command of FusionStorage Block. The default username is cmdadmin, and its default password is IaaS@PORTAL-CLOUD9!.
    The system supports authentication using environment variables, so you do not need to enter the username and password each time you run the dswareTool command. For details, see Authentication Using Environment Variables.
    To restore a SATA or SAS disk, or an SSD, run the following command:
      sh /opt/dsware/client/bin/dswareTool.sh --op forceReplaceSingleDisk -id <storage pool ID> -slot <slot number> -nodeMgrIp <management IP address> -ignoreMediaFault <true|false>
    To restore an SSD card, run the following command:
      sh /opt/dsware/client/bin/dswareTool.sh --op forceReplaceSSD -id <storage pool ID> -oldEsn <ESN of the faulty SSD card> -newEsn <ESN of the new SSD card> -nodeMgrIp <management IP address> -type main_storage -ignoreMediaFault <true|false>
    If the replaced device is not an NVMe device, no further action is required.
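
To make the parameter order of the two restore commands concrete, here is a dry-run sketch that only prints the command line instead of running it. The flags are the ones quoted above; the function names and the sample values (pool ID 0, slot 5, the ESN strings, and the 192.0.2.10 address) are placeholders made up for this example, so substitute the values from your own system.

```shell
#!/bin/sh
# Print (do not run) the dswareTool command for restoring main storage.
# The flags are the ones quoted in the answer; the function names and the
# sample values in the usage lines are placeholders made up for this sketch.
TOOL=/opt/dsware/client/bin/dswareTool.sh

# Usage: single_disk_cmd <pool id> <slot> <node mgmt IP> <true|false>
single_disk_cmd() {
    echo "sh $TOOL --op forceReplaceSingleDisk -id $1 -slot $2 -nodeMgrIp $3 -ignoreMediaFault $4"
}

# Usage: ssd_card_cmd <pool id> <old ESN> <new ESN> <node mgmt IP> <true|false>
ssd_card_cmd() {
    echo "sh $TOOL --op forceReplaceSSD -id $1 -oldEsn $2 -newEsn $3 -nodeMgrIp $4 -type main_storage -ignoreMediaFault $5"
}

single_disk_cmd 0 5 192.0.2.10 false
ssd_card_cmd 0 OLDESN123 NEWESN456 192.0.2.10 true
```

Printing the command first lets you check the ESNs and the ignoreMediaFault value chosen in step 1 before anything is executed on the FSM node.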

Enable hardware DIF.

5. If the replaced device is an NVMe device and hardware DIF was enabled before the fault occurred, you need to enable hardware DIF after the device is replaced but before it is added to the storage pool.

  • Remove the storage node. If the new device is automatically added to the storage pool, you need to remove the node housing the new device from the storage pool and then enable hardware DIF. For details, see Storage Pool Capacity Reduction.
  • Enable hardware DIF. After the target storage node is removed from the storage pool, enable hardware DIF. For details, see the FusionStorage Block Storage Service Hardware DIF Configuration Guide.
  • Add the storage node. After hardware DIF is enabled, add the node to the storage pool again. For details, see Expanding Capacity of a Storage Pool.


All Answers
Hello, please check this guide document:

Cache Faults (NVMe SSD Card)




Scenarios

Replace hardware and restore services when a cache fault (NVMe SSD card) occurs.


Impact on the System

During the SCNA replacement, the SCNA O&M services are interrupted.


Prerequisites

Conditions
• Spare parts of the original model and specifications are available for replacement.
• You have located the server and labeled its panel to avoid misoperations.

Data

Table 1 Required data

Category | Data | Default Value | Example Value
BMC | Management IP address | - | https://192.168.1.22
BMC | User name | root | -
BMC | Password | Huawei12#$ | -
FusionStorage Block | Management IP address | - | https://192.168.8.162:28443/fsportal
FusionStorage Block | User name | admin | -
FusionStorage Block | Password | Huawei@CLOUD8! | -
VMware vCenter | Management IP address | - | https://192.168.8.68
VMware vCenter | User name | See the note below. | vSphere 5.5: root; vSphere 6.0: administrator@vsphere.local
VMware vCenter | Password | See the note below. | vSphere 5.5: vmware; vSphere 6.0: Huawei@123

Note on the vCenter credentials:
• The user name and password for logging in to vCenter vary with the .ova file you use.
• Obtain the user name and password from the VMware official website.



Procedure



Enter the FusionStorage Block maintenance mode.

1. Enable the host to enter the FusionStorage Block maintenance mode. For details, see Configuring the Host Maintenance Mode (FusionStorage Block).

Migrate VMs (VMware vSphere).

2. Access the real-time interface of the host by connecting to the physical device or remote virtual console.
3. Check the host status and perform related operations.
   • If the host OS is running properly and the communication between the local PC and the host network is normal, go to Migrating VMs (VMware vSphere).
   • If the host OS is not running properly or the communication between the local PC and the host network is abnormal, contact Huawei technical support.

Shut down the management VM.

4. Select the management VM, and click Power off the virtual machine on the Getting Started tab.
   A confirmation dialog box is displayed.
5. Click Yes.
   The management VM is shut down.

Enter the VMware vSphere maintenance mode.

6. Enable the host to enter the VMware ESXi maintenance mode. For details, see Configuring the Host Maintenance Mode (VMware vSphere).

Replace hardware.

7. Replace faulty hardware. For details, see Parts Replacement.


Check the hardware installation status.

8. Use PuTTY and the following parameters to log in to the host CLI.
   • IP address: management IP address of the host
   • User name: fc2
9. Run the following command and enter the root user password to switch to user root:
   su - root
10. Run the following command to disable logout on timeout:
    TMOUT=0
11. Run the following command to check the SN of the NVMe SSD device:
    hioadm info
12. Run the following command to check the health status of the NVMe SSD device:
    hioadm info -d nvme0
    NOTE:
    The query of the nvme0 health status is used as an example.
13. Perform operations according to the status.

    Status (Device Status) | Description | Operation
    OK | Normal. The hardware is replaced successfully. | -
    NOT OK | Abnormal. The hardware is faulty. | Go to 14.
    BLANK | Abnormal. No hardware is detected. | Go to 15.

14. (Optional) Perform operations as prompted.
15. (Optional) Replace the hardware again.
16. (Optional) Check again whether the health status of the NVMe SSD device is normal.
    • If yes, the replacement is successful.
    • If no, contact Huawei technical support.
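
The status table in step 13 can be sketched as a lookup from the device status to the next step. This example deliberately does not parse hioadm output, because the guide does not show its layout: the status string is read by the operator and passed in by hand. The function name is hypothetical, and treating an empty string the same as BLANK is an assumption of this sketch.

```shell
#!/bin/sh
# Translate the device status read from the hioadm output into the next
# step of the procedure, following the status table in step 13.
# The status is passed in by hand; this sketch does not parse hioadm output.
after_status() {
    case "$1" in
        "OK")
            echo "done" ;;
        "NOT OK")
            echo "step 14" ;;
        "BLANK"|"")
            # Assumption: BLANK may also appear as an empty status field.
            echo "step 15" ;;
        *)
            echo "unrecognized status: $1" >&2
            return 1 ;;
    esac
}

after_status "NOT OK"
```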




Exit the VMware vSphere maintenance mode.

17. Enable the host to exit the VMware ESXi maintenance mode. For details, see Configuring the Host Maintenance Mode (VMware vSphere).

Configure the NVMe SSD pass-through function again.

18. Double-click vClient, and enter the vCenter IP address, user name, and password.
19. Select the ESXi host of the faulty node, choose Configuration > Advanced Settings, click the editing button, deselect Non-Volatile memory controller in the list, click OK, and restart the host for the configuration to take effect.
20. After the system starts, select the ESXi host of the faulty node in vCenter again, choose Configuration > Advanced Settings, click the editing button, select Non-Volatile memory controller in the list, click OK, and restart the host for the configuration to take effect.
21. After the system starts, right-click the CVM of the faulty node.
22. Choose Edit Settings. Remove the NVMe SSD card from the PCI device list.
    • If the VM starts with the host, end the configuration process.
    • If the VM does not start with the host, perform 23.
23. Click Add, select Non-Volatile memory controller in the PCI device list, and add it to the VM.


Check the VM status (VMware vSphere).

24. Log in to the VMware vCenter. For details, see Logging In to the VMware vCenter.
25. Choose Inventory > Host and Clusters.
    The Host and Clusters page is displayed.
26. Check whether the VM status is Powered On.
    If the VM status is Powered Off, perform the following operations to start the VM:
    a. Right-click the row where the VM to be operated is located, and choose Power.
    b. Click Power On.

Exit the FusionStorage Block maintenance mode.

27. Enable the host to exit the FusionStorage Block maintenance mode. For details, see Configuring the Host Maintenance Mode (FusionStorage Block).


Restore storage resources.

28. Log in to the FusionStorage Manager WebUI. For details, see Logging In to the FusionStorage WebUI.
29. Choose Resource Pool > Storage Pools > Disk Topology.
    The Disk Topology page is displayed.
30. Check the status of the replaced hardware.
31. Perform operations according to the status.

    Status | Description | Operation
    Green solid box | Normal. | -
    Grey dotted box | Abnormal. The cache NVMe SSD card is removed from the storage pool. | Go to 32.
    Red solid box | Abnormal. The cache NVMe SSD card needs to be restored. | Go to 33.
    Red dotted box | Abnormal. Storage resources of the cache NVMe SSD card need to be restored. | Go to 34.


32. (Optional) Rectify the grey dotted box fault.
    a. Select the replaced cache NVMe SSD card.
    b. Click Add to Storage Pool.

33. (Optional) Rectify the red solid box fault.
    a. Select the replaced cache NVMe SSD card.
    b. Click Query Troubleshooting Method.
    c. Select the replaced cache NVMe SSD card.
    d. Perform restoration operations as prompted.

34. (Optional) Rectify the red dotted box fault.
    a. Use PuTTY and the following parameters to log in to the FusionStorage Manager CLI.
       • IP address: management IP address of the active node of FusionStorage Block
       • Default user name: dsware
       • Default password: Huawei@CLOUD8!
    b. Run the following command to disable logout on timeout:
       TMOUT=0
    c. Run the following command to switch to the directory storing the command line tool dswareTool.sh:
       cd /opt/dsware/client/bin
       NOTE:
       You must enter the authentication user name and password after running the dswareTool command.
       The default user for authentication is cmdadmin, and the default password is cmdHuawei@123.
    d. Run the following command to restore storage resources:
       sh dswareTool.sh --op forceReplaceSSD -id ID -oldEsn oldEsn -newEsn newEsn -nodeMgrIp nodeMgrIp -type type
       NOTE:
       • ID: Storage pool ID. For details, see Resource Pool > Storage Pools > ID on the FusionStorage Manager WebUI.
       • oldEsn: ESN of the faulty NVMe SSD device.
       • newEsn: ESN of the new NVMe SSD device.
       • nodeMgrIp: Management IP address of the host.
       • type: Storage media usage. For cache, the value is cache.






Check the cache NVMe SSD card status.

35. Log in to the FusionStorage Manager WebUI. For details, see Logging In to the FusionStorage WebUI.
36. Choose Resource Pool > Storage Pools > Disk Topology.
    The Disk Topology page is displayed.
37. Check whether the cache NVMe SSD card is restored.
    • If yes, no further action is required.
    • If no, contact Huawei technical support.

This is the link: https://support.huawei.com/enterprise/en/doc/EDOC1000156428?idPath=7919749|7941815|23972641|250416235|21488161###
You can search for the keywords: replace SSD Card