Hi team!
Here’s a case in which a single-controller alarm on a storage array cannot be cleared after the host is restarted.
Symptoms
UltraPath V100R006 is installed on a Linux host.
After the host is connected to a storage array using fiber cables, a single-controller alarm indicating intermittent link disconnections is generated on the storage array.
After the host is restarted and the link connections return to normal, the alarm is not cleared.
Alarm Information
A single-controller alarm indicating intermittent link disconnections is generated on the storage array.
Possible Causes
1. After detecting the alarm trigger condition, UltraPath in user mode pushes the single-controller alarm to its kernel module. After receiving the alarm, the kernel adds it to a list of alarms that it keeps pushing to the storage array.
2. After the link connections return to normal and the single-controller alarm needs to be cleared, UltraPath pushes the clear alarm to its kernel module. After receiving the clear alarm, the kernel adds it to the same list for continuous pushing to the storage array, and also moves the alarm from the current alarm file to the historical alarm file.
3. If the links are still disconnecting intermittently, the UltraPath kernel fails to push the clear alarm to the storage array because path selection fails.
4. After the host is restarted, the UltraPath kernel does not resume pushing the clear alarm to the storage array, because the retry queue has been deleted and the previous alarm information is no longer maintained in user mode. Therefore, the single-controller alarm is not cleared.
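The sequence above can be illustrated with a small toy model. This is only a sketch of the behavior described in this post, not UltraPath code: the class ToyHost, its fields, and the simulate function are hypothetical names, and the assumption that the retry list lives only in memory while the alarm files persist is taken from cause 4.

```python
from dataclasses import dataclass, field


@dataclass
class ToyHost:
    """Hypothetical model of the host-side alarm flow (not UltraPath code)."""
    current_alarms: set = field(default_factory=set)    # stands in for the current alarm file
    history_alarms: list = field(default_factory=list)  # stands in for the historical alarm file
    retry_queue: list = field(default_factory=list)     # in-memory list, assumed lost on restart

    def raise_alarm(self, alarm_id):
        # Cause 1: the alarm is recorded and queued for pushing to the array.
        self.current_alarms.add(alarm_id)
        self.retry_queue.append(("raise", alarm_id))

    def clear_alarm(self, alarm_id):
        # Cause 2: the clear request is queued, and the record is moved
        # to history immediately, before the array has acknowledged it.
        self.current_alarms.discard(alarm_id)
        self.history_alarms.append(alarm_id)
        self.retry_queue.append(("clear", alarm_id))

    def push(self, array_alarms, path_available):
        # Cause 3: without a usable path, nothing reaches the array.
        if not path_available:
            return
        for kind, alarm_id in self.retry_queue:
            if kind == "raise":
                array_alarms.add(alarm_id)
            else:
                array_alarms.discard(alarm_id)
        self.retry_queue.clear()

    def restart(self):
        # Cause 4: the retry queue does not survive a restart.
        self.retry_queue = []


def simulate():
    array_alarms = set()                              # alarms shown on the storage array
    host = ToyHost()
    host.raise_alarm("single-controller")
    host.push(array_alarms, path_available=True)      # alarm reaches the array
    host.clear_alarm("single-controller")
    host.push(array_alarms, path_available=False)     # clear is not delivered
    host.restart()
    host.push(array_alarms, path_available=True)      # nothing left to push
    return array_alarms, host
```

In this model the array still shows the alarm after the restart, while the host no longer lists it as current and has nothing queued, so nothing on the host side will ever clear it.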
Identification Method
This problem can be identified if the following conditions are met:
The UltraPath V100R006 is installed on the Linux host.
The storage array and host are connected using Fibre Channel links.
After the Fibre Channel link of a controller is disconnected, a single-controller alarm is generated.
The host is restarted before the link connection is recovered.
The single-controller alarm does not clear.
The single-controller alarm still exists even after the link connection becomes normal.
Perform the following steps to verify that the link connection is recovered:
1. Check whether the cable is removed or damaged.
   If the cable is removed, reinsert it.
   If the cable is damaged, replace it.
2. Log in to the ISM and check whether the controller works properly.
   If the controller is faulty, go to Step 4.
   If the controller works properly, go to Step 3.
3. Check whether the switch works properly.
   If the switch is faulty, contact the switch vendor.
   If the switch works properly, the link connection is recovered and the alarm should be cleared.
4. Contact technical support engineers for help.
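The verification steps above amount to a short decision sequence. The sketch below only restates that sequence. The function name next_step and its boolean inputs are hypothetical, and the checks themselves (cable, controller status in the ISM, switch) still have to be done by hand.

```python
def next_step(cable_ok, controller_ok, switch_ok):
    """Return the next action according to the verification steps above."""
    if not cable_ok:
        # Step 1: a removed cable is reinserted, a damaged one is replaced.
        return "reinsert or replace the cable"
    if not controller_ok:
        # Step 2: a faulty controller leads to Step 4.
        return "contact technical support engineers"
    if not switch_ok:
        # Step 3: a faulty switch is handled by its vendor.
        return "contact the switch vendor"
    # All checks passed: the link is recovered.
    return "link recovered; the alarm should be cleared"
```

If the last result is returned and the alarm is still present on the storage array, the case matches the problem described in this post.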
Solution
This problem has no impact on services. Manually clear the alarm on the storage array.
Check After Recovery
After clearing the alarm, verify that it no longer appears in the alarm information.
Application Scope
Linux host + UltraPath V100R006 + Fibre Channel networking