[Problem Description]
When a VMware ESXi 5.5 host uses Broadcom Emulex LPe12002 HBAs to connect to a storage array, services are interrupted and do not fail over automatically. The link on the faulty HBA port is not switched to a normal standby path.
[Symptom Description]
This typically occurs when link quality is poor and unstable signal interference causes bit errors on the link. Services between the host and the storage array may be interrupted, putting service continuity at risk, including during upgrades.
Services can be restored by removing and reinserting the optical fiber on the host HBA port. However, this does not completely resolve the problem, and services are still interrupted occasionally. The vmkernel.log file on the ESXi host shows that the faulty HBA port continuously logs the H:0xc error code and link error statistics.
[Cause]
On the ESXi 5.5 host, the inbox lpfc 10.0.10.1 HBA driver has a bug in versions earlier than 10.2.455.0: when the DevLoss timeout expires while the port is in the Fabric LOGO (logout) state, the driver fails to return the NO_CONNECT status to the upper layer. As a result, the ESXi multipathing software does not detect the link exception, and storage paths are not switched.
In normal operation, link bit errors cause the HBA to send a LOGO (logout) message to the storage device and start the device-loss timer (10 s by default). If the link has not recovered when the devloss timer expires, all I/Os return NO_CONNECT (H:0x1). On receiving this link-down indication, the ESXi NMP multipathing software switches I/O from the faulty path to a normal path.
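The expected (fixed-driver) behavior described above can be sketched as a simple decision rule. This is a toy illustration only; DEVLOSS_TMO and expected_status are illustrative names, not actual driver code:

```shell
#!/bin/sh
# Toy sketch of the expected devloss-timer behavior described above.
# DEVLOSS_TMO and expected_status are illustrative names, not driver code.
DEVLOSS_TMO=10   # device-loss timer, seconds (default)

expected_status() {
  down_secs="$1"   # how long the link has been down
  if [ "$down_secs" -ge "$DEVLOSS_TMO" ]; then
    # Timer expired: I/Os must fail fast so NMP can switch paths.
    echo "NO_CONNECT (H:0x1) -> NMP fails over to a standby path"
  else
    # Within the timer window: I/Os are held, waiting for link recovery.
    echo "I/O held pending link recovery"
  fi
}

expected_status 12   # link down longer than the 10 s timer
```

The bug described in this document is that pre-10.2.455.0 drivers never reach the NO_CONNECT branch, so NMP sees no reason to fail over.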
[Location Method]
1. Check the vmkernel.log file on the host and search for the key log entries:
"failed H:0xc" or "Failed: H:0xc"
HBA link error:

NMP multipathing error:

2. If removing and reinserting the optical fiber on the HBA port restores services, this confirms the fault.
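The log search in step 1 can be scripted against an exported copy of vmkernel.log. A minimal sketch; the helper name scan_h0xc and the sample line are illustrative (the sample is synthetic, not a real log excerpt):

```shell
#!/bin/sh
# Minimal helper: count H:0xc occurrences in a saved vmkernel.log copy.
# scan_h0xc is an illustrative name; pass the path to your exported log.
scan_h0xc() {
  # Matches both "failed H:0xc" and "Failed: H:0xc", case-insensitively.
  grep -icE 'failed[: ]*h:0xc' "$1"
}

# Demo against a synthetic line (illustrative only, not a real log excerpt):
printf 'lpfc: Cmd 0x2a failed H:0xc D:0x0 P:0x0\n' > sample_vmkernel.log
scan_h0xc sample_vmkernel.log
```

A nonzero count on a single HBA port, combined with occasional service interruptions, matches the symptom described above.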
[Solution]
If the problem occurs, upgrade the HBA driver to 10.2.455.0 or later and restart the host so that the new driver takes effect.
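Before upgrading, the installed lpfc version (obtained on the host with, for example, `esxcli software vib list | grep lpfc`) can be compared against the fixed release. A hedged sketch; needs_upgrade is an illustrative helper, not an ESXi command:

```shell
#!/bin/sh
# Compare an lpfc version string against the fixed release 10.2.455.0.
# needs_upgrade is an illustrative helper; obtain the installed version on
# the host with e.g. `esxcli software vib list | grep lpfc`.
FIXED=10.2.455.0

needs_upgrade() {
  v="$1"
  [ "$v" = "$FIXED" ] && return 1
  # Numeric sort on each dotted field; if $v sorts first, it is older.
  first=$(printf '%s\n%s\n' "$v" "$FIXED" \
    | sort -t. -k1,1n -k2,2n -k3,3n -k4,4n | head -n1)
  [ "$first" = "$v" ]
}

needs_upgrade 10.0.10.1 && echo "upgrade required"
```

The affected inbox version 10.0.10.1 sorts before 10.2.455.0, so the helper reports that an upgrade is required.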
If services at the site cannot be interrupted, use one of the following workarounds to rectify the fault:
Method 1:
1. Run the following command on the ESXi host to take the faulty HBA port offline (replace vmhba# with the actual adapter number):
/usr/lib/vmware/vmkmgmt_keyval/vmkmgmt_keyval -i vmhba#/Emulex -s offline -k adapter
2. Run the following command to bring the HBA port back online:
/usr/lib/vmware/vmkmgmt_keyval/vmkmgmt_keyval -i vmhba#/Emulex -s online -k adapter
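The two steps of Method 1 can be wrapped in a small dry-run helper that prints the exact command pair for a given adapter before you run anything on a live host. toggle_hba and the vmhba2 example are illustrative:

```shell
#!/bin/sh
# Dry-run helper: print the Method 1 offline/online command pair for one
# adapter. toggle_hba is an illustrative name; vmhba2 is an example adapter.
KEYVAL=/usr/lib/vmware/vmkmgmt_keyval/vmkmgmt_keyval

toggle_hba() {
  adapter="$1"
  echo "$KEYVAL -i ${adapter}/Emulex -s offline -k adapter"
  echo "# wait for I/O on the port to drain, then:"
  echo "$KEYVAL -i ${adapter}/Emulex -s online -k adapter"
}

toggle_hba vmhba2
```

Printing rather than executing keeps the sketch safe to review; on the host, run the two printed commands in order.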
Method 2:
Remove and reinsert the FC cable that connects the HBA port to the switch.
Method 3:
If the HBA connects to the storage through a fabric switch rather than directly, disable and then re-enable the switch port connected to the faulty HBA. Log in to the switch through the GUI or CLI; for example, on the CLI (Brocade syntax):
portdisable <port number>
portenable <port number>
[Post-Recovery Check]
After the solution is implemented, services are restored. If a large number of bit errors occurs on the port again, a failover is triggered as expected and the active service path is switched to a normal standby path, so service continuity is maintained.
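To confirm recovery, path states can be checked on the host with `esxcli storage core path list`, whose human-readable output includes "State: active" / "State: dead" lines. The sketch below counts states from a saved copy of that output; count_paths, paths.txt, and the sample lines are illustrative:

```shell
#!/bin/sh
# Count path states from a saved copy of `esxcli storage core path list`
# output. count_paths and paths.txt are illustrative names; the sample
# lines below are synthetic, not real command output.
count_paths() {
  awk '/^ *State:/ {print $2}' "$1" | sort | uniq -c
}

# Synthetic sample (illustrative only):
printf '   State: active\n   State: dead\n   State: active\n' > paths.txt
count_paths paths.txt
```

After recovery, all expected paths to the array should report an active (or standby) state, with no paths stuck in the dead state.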

[Appendix]
NA
[Applicability]
All OceanStor V3 series that work with the Inbox driver of VMware ESXi 5.5 (GA, Update 1, Update 2, Update 3) are affected. The involved product versions are as follows:
All V300R001 versions
All V300R002 versions
All V300R003 versions