Hi guys!!
I would like to share with you a "Status Unknown" issue on the OSN 1500B node.
Description:
The customer reported that the OSN 1500B node was shown as "Status Unknown":

The NE failed to communicate with the U2000, and the customer detected a loss of analog signals at the E1 (2 Mbit/s) interfaces. The U2000 displayed the following alarms:
HSC_UNAVAIL
SYNC_FAIL
DOWN_E1_AIS
T_ALOS
The customer reports that, as part of O&M work, a maintenance operator ran a logical active/standby CXL board switching test (from slot 4 to slot 5) from the U2000 and, before this process had finished, maintenance technicians physically removed and then reinserted the main CXL board.
After this kind of error, all the boards are normally reset and cannot be restored to their original state until the reset is complete. The reason is that when the standby CXL board starts to work, it needs to synchronize certain data with the main CXL board in slot 5. In the switching test, the main CXL board was reset before the standby CXL board in slot 4 started to work. As a result, all the boards started to synchronize and the other boards on the same subrack were reset.
In this case, however, the OSN 1500B node stayed locked indefinitely and service was interrupted, because the maintenance technicians had also removed and reinserted the main and standby CXL boards several times in a short period without success, which made the issue worse.
Reviewing the SYNC_FAIL alarm in detail:


The alarm indicates that communication failed during the batch backup. If the communication fails for a short time, the system automatically initiates another batch backup. If the communication fails for a long time, the backup from the main SCC to the standby SCC is not possible.
The backup contents consist of the database, the service configuration, and board performance data (such as 15-minute and 24-hour performance). If the backup process is interrupted for 20 seconds, the software declares the backup failed and the standby SCC discards all packets received so far, so the data must be backed up again from the start.
The backup process may be interrupted by other higher-priority tasks (for example, an automatic task such as a template-driven backup to the NMS) or by a repeated handling error, as in this case.
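To make the 20-second rule concrete, here is a minimal Python sketch of how a standby side could treat a long gap between backup packets. It is only a toy model based on the alarm description above, not Huawei's implementation: the class `StandbyBackupReceiver`, its fields, and the packet names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Taken from the alarm description: an interruption of 20 seconds fails the backup.
INTERRUPTION_LIMIT_S = 20.0


@dataclass
class StandbyBackupReceiver:
    """Toy model of the standby SCC during a batch backup (hypothetical names)."""

    packets: List[bytes] = field(default_factory=list)
    last_rx_time: Optional[float] = None
    failed_attempts: int = 0

    def receive(self, now: float, packet: bytes) -> None:
        """Accept one backup packet that arrives at time `now` (in seconds)."""
        gap_too_long = (
            self.last_rx_time is not None
            and now - self.last_rx_time >= INTERRUPTION_LIMIT_S
        )
        if gap_too_long:
            # The backup was interrupted for too long: drop everything received
            # so far and count a failed attempt; the backup starts over.
            self.packets.clear()
            self.failed_attempts += 1
        self.packets.append(packet)
        self.last_rx_time = now


if __name__ == "__main__":
    rx = StandbyBackupReceiver()
    rx.receive(0.0, b"db-part-1")
    rx.receive(5.0, b"db-part-2")
    rx.receive(30.0, b"db-part-1")  # 25 s gap: earlier packets are discarded
    print(rx.packets, rx.failed_attempts)
```

In this model, every reseating of a CXL board during the backup acts like a new long gap, so the backup keeps restarting and never completes, which matches the behaviour seen on the node.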
Finally, a ticket was opened with the Huawei TAC, because HedEx states that when Parameter 1 is 0x1F you should contact Huawei engineers (iManager U2000 Product Documentation: Reference > Alarm Reference > SDH Alarm Reference):

Solution:
After analyzing the case, the Huawei TAC suggested powering the node off and then on again.
As a result, the NE returned to "Running Status", the alarms cleared, and the analog signals were restored at the E1 (2 Mbit/s) interfaces.

Conclusion:
When performing a logical or physical active/standby CXL board switchover, be careful: never remove and reinsert the main CXL board before the switchover has finished, so that services are not affected. As this case shows, once this error has been repeated several times in a short period, the node must be powered off and then powered on again.
Issue solved.
BR


