To recover from the parity check failure, reset slot 1. This is a temporary solution:
reset slot 1
Because this is a software bug in V100R003C00SPC200, the switch software should be upgraded to V100R006C00SPC800 plus the latest patch. This is the final solution.
Root Cause
1. Packets arriving at the slot 1 CPU were captured in hidden mode and checked. No abnormality was found:
display gfpi catch slot 1 stat-pkt-info receive srcmac
RECEIVE SRC-MAC : 0x00000030 --- 286e-d494-3699
0x000000a0 --- 50c5-8d11-9852
0x00000019 --- 286e-d4aa-9a42
2. CPU usage of slot 1 was checked:
dis cpu-us slot 1
CPU Usage Stat. Cycle: 60 (Second)
CPU Usage : 66% Max: 87%
CPU Usage Stat. Time : 2012-10-03 03:10:43
CPU utilization for five seconds: 66%: one minute: 66%: five minutes: 66%.
TaskName CPU Runtime(CPU Tick High/Tick Low) Task Explanation
...............................................................
VPS 19% 0/cbf2d83b VPS
...............................................................
bcmDPC 20% 0/debdce58 tS0c
...............................................................
OS 8% 0/5ac74e7f Operation System
When slot 1 has a parity check failure, the following tasks consume the most CPU time: VPS, bcmDPC and OS.
3. The log files were checked in detail. The alarms showed that slot 1 had a parity check failure, which caused a huge number of logs to be printed. This logging consumed much of the slot 1 CPU resources:
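The task ranking in step 2 can be reproduced from saved command output with a short script. The following is a minimal sketch in Python, not a vendor tool: the function names `parse_tasks` and `top_tasks` and the 5% threshold are illustrative assumptions, and the regular expression is inferred only from the three task lines shown above.

```python
import re

# Task lines copied from the CPU usage output above.
SAMPLE = """\
VPS 19% 0/cbf2d83b VPS
bcmDPC 20% 0/debdce58 tS0c
OS 8% 0/5ac74e7f Operation System
"""

# Assumed layout: TaskName, CPU percent, runtime ticks (high/low, hex), explanation.
TASK_RE = re.compile(r"^(\S+)\s+(\d+)%\s+([0-9a-fA-F]+)/([0-9a-fA-F]+)\s+(.*)$")


def parse_tasks(text):
    """Return (task_name, cpu_percent) for every line that looks like a task row."""
    tasks = []
    for line in text.splitlines():
        m = TASK_RE.match(line.strip())
        if m:
            tasks.append((m.group(1), int(m.group(2))))
    return tasks


def top_tasks(text, threshold=5):
    """Return tasks at or above the threshold (hypothetical value), busiest first."""
    tasks = [t for t in parse_tasks(text) if t[1] >= threshold]
    return sorted(tasks, key=lambda t: t[1], reverse=True)


if __name__ == "__main__":
    for name, pct in top_tasks(SAMPLE):
        print(f"{name}: {pct}%")
```

Header lines and the dotted separator rows do not match the pattern, so they are skipped without special handling.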
Jul 8 2013 17:50:10+07:00 %%01ANTIATTACK/4/RPC_ALL_FAILED(D)[192501]:Failed to send RPC messages to all the boards.
Jul 8 2013 17:50:10+07:00 %%01SDKE/3/ERR(D)[192502]:Slot=1;Slot1 layer DK module NONE level ERR: unit 0 EGR_IPFIX_SESSION_TABLE entry 2119 parity error
This log message means that the table EGR_IPFIX_SESSION_TABLE on the LPU in slot 1 has a parity error at entry 2119. Because many entries of the table fail the parity check, a large number of log messages is generated.
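To see which message type dominates a saved log file, the messages can be grouped by module, severity and mnemonic. The sketch below is written in Python under the assumption that every line carries a header shaped like the two samples above (`%%01MODULE/severity/MNEMONIC`). The function names `count_by_type` and `parity_errors` are illustrative, and the parity pattern is inferred from the single SDKE line shown.

```python
import re
from collections import Counter

# The two log lines quoted above.
SAMPLE_LOG = """\
Jul 8 2013 17:50:10+07:00 %%01ANTIATTACK/4/RPC_ALL_FAILED(D)[192501]:Failed to send RPC messages to all the boards.
Jul 8 2013 17:50:10+07:00 %%01SDKE/3/ERR(D)[192502]:Slot=1;Slot1 layer DK module NONE level ERR: unit 0 EGR_IPFIX_SESSION_TABLE entry 2119 parity error
"""

# Assumed header layout: %%<2 digits><MODULE>/<severity digit>/<MNEMONIC>
HEADER_RE = re.compile(r"%%\d{2}([A-Z0-9_]+)/(\d)/([A-Z0-9_]+)")
# Assumed body layout of the parity message, taken from the sample line.
PARITY_RE = re.compile(r"unit (\d+) (\S+) entry (\d+) parity error")


def count_by_type(text):
    """Count log lines per (module, severity, mnemonic)."""
    counts = Counter()
    for line in text.splitlines():
        m = HEADER_RE.search(line)
        if m:
            counts[(m.group(1), int(m.group(2)), m.group(3))] += 1
    return counts


def parity_errors(text):
    """Return (unit, table, entry) for each parity error message found."""
    return [(int(u), table, int(e)) for u, table, e in PARITY_RE.findall(text)]


if __name__ == "__main__":
    for key, n in count_by_type(SAMPLE_LOG).most_common():
        print(key, n)
    print(parity_errors(SAMPLE_LOG))
```

On a full log file, `most_common()` puts the flooding message type first, and `parity_errors` lists the affected tables and entries.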
Solution
Suggestions
If the equipment has high CPU usage, check whether it is generating a huge number of log messages. If it is, eliminate the cause of those messages. In this case, the log messages were generated by the slot 1 parity check failure.
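One way to apply this suggestion to a saved log file is to count messages per minute and flag the minutes that exceed a chosen limit. This is a minimal Python sketch, assuming every log line starts with a timestamp in the form used in this case (for example `Jul 8 2013 17:50:10+07:00`). The function names and the limit are hypothetical and should be tuned to the device's normal log volume.

```python
import re
from collections import Counter

# Timestamp prefix as it appears in this case's log lines; the capture group
# keeps everything up to the minute and drops the seconds and the UTC offset.
MINUTE_RE = re.compile(r"^(\w{3}\s+\d{1,2}\s+\d{4}\s+\d{2}:\d{2}):\d{2}")


def lines_per_minute(lines):
    """Count log lines per minute, keyed by the timestamp text up to the minute."""
    counts = Counter()
    for line in lines:
        m = MINUTE_RE.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts


def busy_minutes(lines, limit):
    """Return the minutes (in first-seen order) with more than `limit` lines."""
    return [minute for minute, n in lines_per_minute(lines).items() if n > limit]


if __name__ == "__main__":
    # The two log lines quoted in this case.
    sample = [
        "Jul 8 2013 17:50:10+07:00 %%01ANTIATTACK/4/RPC_ALL_FAILED(D)[192501]:Failed to send RPC messages to all the boards.",
        "Jul 8 2013 17:50:10+07:00 %%01SDKE/3/ERR(D)[192502]:Slot=1;Slot1 layer DK module NONE level ERR: unit 0 EGR_IPFIX_SESSION_TABLE entry 2119 parity error",
    ]
    print(busy_minutes(sample, limit=1))
```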