1. Confirm that redundant links exist between the hosts and the storage system. We then swapped the SFP modules between a normal FC port and the problematic FC port, and found that the problem is related to the FC port, not the SFP module;
2. Collect the “operating data” and “system log” from the device management portal;
3. In the “operating data” file, the entry for the FC port shows that the SFP module is recognized but its status is offline. The hardware is therefore discovered correctly, and the problem is probably in the software function;

4. Based on the alarm time or the time at which the problem occurred, check the “messages” logs and search for the keywords “SFP” and “fibre module”. We found the following information:

5. From the key information above, we can see that the SFP was physically removed at 10:08:11, and the logical status changed to “link-down” 6 seconds later, at 10:08:17. During this period, the system started checking the speed of the SFP module at 10:08:14. The speed mismatch and the link-up status were therefore present at the same time, and the system disabled the FC port.
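The keyword search in step 4 can be sketched in a few lines of Python. This is a minimal sketch, not a vendor tool: it assumes the “messages” log has been exported as a plain-text file, and the function names and the file path are hypothetical.

```python
from typing import Iterable, Iterator, List, Sequence

# Keywords from step 4, matched case-insensitively.
KEYWORDS = ("SFP", "fibre module")


def filter_lines(lines: Iterable[str],
                 keywords: Sequence[str] = KEYWORDS) -> Iterator[str]:
    """Yield every line that contains at least one of the keywords."""
    lowered = tuple(k.lower() for k in keywords)
    for line in lines:
        text = line.lower()
        if any(k in text for k in lowered):
            yield line.rstrip("\n")


def search_log(path: str, keywords: Sequence[str] = KEYWORDS) -> List[str]:
    """Read a plain-text log file and return the matching lines in order."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, encoding="utf-8", errors="replace") as fh:
        return list(filter_lines(fh, keywords))


if __name__ == "__main__":
    # "messages" is an assumed local path to the exported log file.
    for hit in search_log("messages"):
        print(hit)
```

Narrowing the matches to the alarm time is then a matter of reading the timestamps on the lines that remain.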
Root Cause
The logical status of the FC port changes with a delay after the SFP module is removed. If the system happens to check the port speed during that delay, it finds a speed mismatch while the port is still reported as link-up, and it disables the FC port. The problem therefore occurs with low probability.
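The timing behind this root cause can be verified with simple arithmetic. The sketch below uses only the three timestamps reported in step 5; the function names are illustrative and are not part of the product.

```python
from datetime import datetime


def parse(clock: str) -> datetime:
    """Parse an HH:MM:SS time as it appears in the analysis."""
    return datetime.strptime(clock, "%H:%M:%S")


def status_delay_seconds(removed: str, link_down: str) -> float:
    """Seconds between physical removal and the logical link-down."""
    return (parse(link_down) - parse(removed)).total_seconds()


def check_hit_stale_window(removed: str, speed_check: str, link_down: str) -> bool:
    """True if the speed check ran while the port was still logically link-up."""
    return parse(removed) <= parse(speed_check) < parse(link_down)


if __name__ == "__main__":
    # Timestamps taken from step 5 of the analysis.
    print(status_delay_seconds("10:08:11", "10:08:17"))                # 6.0
    print(check_hit_stale_window("10:08:11", "10:08:14", "10:08:17"))  # True
```

A speed check that lands inside the 6-second window sees a port that is logically up but has no module, which matches the mismatch described above.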
Solution
The current version, V200R002C00, does not provide a command to enable the FC port manually, so the controller must be reset to restore the FC port. When both controllers are normal, reset them one at a time. As a permanent solution, upgrade the system software to V200R002C20SPC200 or later.
Suggestions
The logs contain many records of the SFP module being re-inserted at high frequency, after which the problem occurred occasionally. We therefore suggest not re-inserting SFP modules at high frequency while the device is running normally.
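To judge whether re-insertions in a log are frequent, the insertion events can be counted over a sliding time window. This is a minimal sketch under stated assumptions: the event timestamps have already been extracted from the log, and the window length is a hypothetical value chosen by the reader, since this case does not define a threshold for "high frequency".

```python
from datetime import datetime, timedelta
from typing import List


def max_events_in_window(times: List[datetime], window: timedelta) -> int:
    """Return the largest number of events that fall within any one window."""
    ordered = sorted(times)
    best = 0
    start = 0
    for end in range(len(ordered)):
        # Advance the left edge until the span fits inside the window.
        while ordered[end] - ordered[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best
```

A result well above what normal maintenance would produce for the chosen window suggests that the modules were being re-inserted repeatedly.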