Let's walk through replacing an LSI SAS3108 RAID controller in a Huawei 2288H V5 server that is part of a VMware vSphere cluster. In my example, the cluster has vSphere DRS enabled together with automatic power management (DPM), and ESXi 6.7 is installed on the server. This means that virtual machines running in this cluster automatically migrate between physical servers depending on how much free capacity each one has, which evens out the load. In addition, under low load, "extra" servers are powered off to save energy.
First, put the server into maintenance mode so that the cluster does not try to power it on. Since in my case the server was powered off, I first had to power it on.

For some time after power-on, the server does not take part in the redistribution of virtual machines, and if the cluster decides its capacity is excessive, it will automatically power the server off again. During this pause, we put the server into maintenance mode.
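The maintenance-mode step can also be done from the ESXi host's shell with esxcli. A minimal sketch - the `run` wrapper only prints each command; remove the `echo` to actually execute them in an SSH session on the host:

```shell
# Dry-run helper: prints each command instead of executing it.
run() { echo "$@"; }

# Put the host into maintenance mode so DRS evacuates its VMs
# and the cluster does not schedule new ones onto it.
run esxcli system maintenanceMode set --enable true

# Confirm the state before going any further.
run esxcli system maintenanceMode get
```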

Next, power off the server.
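The shutdown itself is also scriptable from the ESXi shell, and `esxcli` refuses to power off a host that is not in maintenance mode, which acts as a useful safety check. A sketch - the `run` wrapper only prints the command; remove the `echo` to run it for real:

```shell
# Dry-run helper: prints the command instead of executing it.
run() { echo "$@"; }

# Power off the host; --reason is a required argument.
run esxcli system shutdown poweroff --reason "RAID controller replacement"
```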

While the server is shutting down, open its iBMC. Here we need to turn on the UID indicator (useful if there is any chance of confusing the server's location in the rack) and confirm that the server shut down cleanly. First, click the "Steady On" or "Blink" button in the UID indicator section.

Once the indicator turns on (or starts blinking), the virtual UID indicator on the main iBMC page mirrors its state.
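The UID indicator can also be driven over IPMI from a management workstation, using the standard `chassis identify` command, which the iBMC supports. A sketch - the address and credentials are placeholders, and the `run` wrapper only prints the commands; remove the `echo` to execute them:

```shell
# Dry-run helper: prints each command instead of executing it.
run() { echo "$@"; }

BMC=192.0.2.10        # placeholder iBMC address
IUSER=Administrator   # placeholder account

# Blink the UID LED for 255 seconds (the maximum interval).
run ipmitool -I lanplus -H "$BMC" -U "$IUSER" chassis identify 255

# "force" keeps the LED on steadily; "0" turns it off.
run ipmitool -I lanplus -H "$BMC" -U "$IUSER" chassis identify force
run ipmitool -I lanplus -H "$BMC" -U "$IUSER" chassis identify 0
```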

Next, go to the Remote Console section and use the server's built-in remote screen access. On V5 we can use HTML5 for this, which I like much more than Java (before V5 you had to use Java).



At this point, we have a cleanly shut-down server with its UID indicator blinking. Now we head to the server room. Take along the new parts, Phillips and flat-head screwdrivers (the latter is needed to open the rotary latch on the server's top cover, if it was locked), a knife to open the packaging, and some anti-static protection.
When I got to the server room, I found that the UID had stopped blinking while I was on the way.

So I took an extra step to identify the correct server and avoid a mistake: I checked the serial number on the pull-out label tab and turned the UID on again.


Next, go around to the back of the server to disconnect its power cables. Check once more, using the UID, that we are unplugging the right server - it can be harder to identify a server from the back, especially if the rack mixes different server models or includes storage systems. From the front they form a flat surface and are easy to count; from the back they can be of different depths and hide behind one another.


Now we need access to the server's top cover. For this, I only had to slide the server out on its rails, without disconnecting the remaining cables. Press the latches on both sides at the front and pull the server toward you.


When the server is fully extended on its rails, you can open the top cover. The cable management arm, which holds all the cables connected to the server as it moves along the rails, helped a lot - I didn't have to disconnect everything.


Here is the replacement kit: the main controller board, the cache memory module board, and the backup battery for the case of an unexpected power loss.

The controller sits roughly in the middle of the server motherboard and is mounted not in a standard PCI-Express slot but in a dedicated one - it is inserted from top to bottom and secured with two screws.

Disconnect the SAS cables, unscrew the retaining screws, then disconnect the backup battery cable.

Here is the underside of the controller board. At the bottom right is the gray connector that mates it to the server's motherboard.

The controller also carries a daughter board with the cache memory module. When replacing the controller, this board needs to be screwed onto the new one. In my case, the old memory module did not have to be returned, so I removed it from the old controller before packing it for shipping.

Then we remove the battery. It is attached to the air duct with latches.


Now we install the new parts. The battery comes with two cables of different lengths. Choose the one that fits and put the battery in place.

Next, connect the battery to the cache memory board. Pay attention to the orientation of the pins on the connector - I had to double-check against the photos; the contacts face down. It also seemed easier to do this with the board removed: the connector is very tight.

Then we fasten the controller to the server motherboard and plug the SAS cables into it.


Next, close the server cover, slide the server back in, and reconnect the power cables.

Next, power on the server. Mine turned on by itself, since it is configured to power on automatically when power is restored. After a complete loss of power, the server may take longer than usual to start, because the iBMC has to boot first. I didn't have to wait long - by the time I had tidied the cables and walked around the rack, it was already starting up. Go back to the iBMC panel and check that there are no errors.
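These checks can also be scripted over IPMI - the power state plus the System Event Log, where any hardware errors would show up. A sketch with placeholder address and account; the `run` wrapper only prints the commands, so remove the `echo` to execute them:

```shell
# Dry-run helper: prints each command instead of executing it.
run() { echo "$@"; }

BMC=192.0.2.10        # placeholder iBMC address

# Is the chassis powered on yet?
run ipmitool -I lanplus -H "$BMC" -U Administrator chassis power status

# Dump the System Event Log and look for recent errors.
run ipmitool -I lanplus -H "$BMC" -U Administrator sel elist
```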


In my case, the new controller recognized the old RAID configuration and the server booted correctly. All that remained was to take the host out of maintenance mode in the cluster.
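Before returning the host to the cluster, it is worth confirming the imported configuration from the ESXi shell with StorCLI, the Broadcom/LSI management tool for the SAS3108 (the install path below is an assumption - it depends on which StorCLI VIB is installed). As before, the `run` wrapper only prints the commands; remove the `echo` to execute them:

```shell
# Dry-run helper: prints each command instead of executing it.
run() { echo "$@"; }

# Controller summary: firmware version, drive states, configuration.
# The path is a placeholder for wherever the StorCLI VIB was installed.
run /opt/lsi/storcli/storcli /c0 show

# All virtual drives should report Optl (optimal).
run /opt/lsi/storcli/storcli /c0 /vall show

# Finally, take the host out of maintenance mode.
run esxcli system maintenanceMode set --enable false
```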

