Hello everyone,
Today I'm going to show you how to handle the issue where switching to another NTP server takes about 100 minutes after the old NTP server goes down.
Issue Description
After the old NTP server is shut down, the client takes about 100 minutes to switch to another NTP server.
Client configuration:

Handling Process
We reproduced the issue in the lab and analyzed the clock source switching mechanism. The current implementation timestamps each round-trip packet exchange, calculates an offset, delay, and dispersion triple from those timestamps, and keeps the latest eight triples.
Triple calculation method (T1 = client send time, T2 = server receive time, T3 = server send time, T4 = client receive time):
Offset = [(T2-T1) + (T3-T4)] / 2
Delay = (T4 - T1) - (T3 - T2)
Dispersion = server-side precision + client-side precision + 15ppm * (T4-T1)
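The three formulas above can be sketched in Python as follows. This is only a minimal illustration, not the device's actual code: the function name, the timestamps, and the precision values are hypothetical and chosen just to exercise the arithmetic.

```python
PHI = 15e-6  # 15 ppm frequency tolerance, taken from the dispersion formula above


def ntp_triple(t1, t2, t3, t4, server_precision, client_precision):
    """Return (offset, delay, dispersion) in seconds for one packet exchange.

    t1: client send time, t2: server receive time,
    t3: server send time, t4: client receive time.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    dispersion = server_precision + client_precision + PHI * (t4 - t1)
    return offset, delay, dispersion


# Hypothetical exchange: the server clock is roughly 0.1 s ahead of the client.
offset, delay, dispersion = ntp_triple(
    t1=100.000, t2=100.105, t3=100.106, t4=100.012,
    server_precision=2 ** -18, client_precision=2 ** -18,  # assumed values
)
print(f"offset={offset:.4f}s delay={delay:.4f}s dispersion={dispersion:.6f}s")
```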
Once these triples are obtained, a filtering algorithm produces the filtered triple, which is then used to calculate the synchronization distance:
Distance = (delay / 2) + dispersion
If the calculated synchronization distance is greater than 1 second, the server is not suitable as a synchronization source.
We measured this in detail in the lab. After the master server went down, the client polled it six more times. On the sixth poll, the distance calculated from the latest eight samples exceeded 1, and the client switched to another clock source.
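As a rough sketch of the distance check described above, the snippet below applies the simplified formula with the 1-second threshold. The function names are hypothetical. The input values are the Peer Delay and Peer Dispersion fields from the first and sixth lab samples; the root distance logged by the device also includes root delay and root dispersion terms, so the results will not match the logged values exactly.

```python
MAX_DISTANCE = 1.0  # seconds; threshold stated in the text


def sync_distance(delay, dispersion):
    """Simplified synchronization distance: delay / 2 + dispersion."""
    return delay / 2 + dispersion


def is_selectable(delay, dispersion, max_distance=MAX_DISTANCE):
    """A server stays eligible only while its distance is within the threshold."""
    return sync_distance(delay, dispersion) <= max_distance


# Peer Delay / Peer Dispersion while the server was still considered usable.
print(is_selectable(delay=0.009, dispersion=0.001))
# Peer Delay / Peer Dispersion at the sixth poll after the server went down.
print(is_selectable(delay=0.000, dispersion=8.003))
```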
Nov 21 2017 10:40:07.790.1 f00347256_156.225 NTP/7/NTP_DBG_Pkt4_Send: FileId [ 084 ], LineNo [ 00169 ]- packet to 189.53.218.118 (port: 123) from 189.53.156.225 (port: 123) via Ethernet0/0/0 leap: 0, version: 3, mode: 3 stratum: 9, poll: 64, precision: 2^18 rdel: 8.820, rdsp: 30.701, refid: 189.53.218.118 reftime: 10:39:02.650 UTC Nov 21 2017(DDBE80C6.A66F3F52) orgtime: 10:39:02.654 UTC Nov 21 2017(DDBE80C6.A79E060F) rectime: 10:39:02.650 UTC Nov 21 2017(DDBE80C6.A66F3F52) xmttime: 10:40:07.790 UTC Nov 21 2017(DDBE8107.CA41D4B5)
1st: FileId [ 019 ], LineNo [ 01434 ]- Calculated Root Distance : 0.040 [ Peer Delay : 0.009, Peer Root delay : 0.000, Peer Root Dispersion : 0.011, Peer Dispersion : 0.001, Peer Variance : 0.000, Current time in seconds : 408249, Time elapsed from last samples received in seconds : 408157 ]

Nov 21 2017 10:41:12.730.3 f00347256_156.225 NTP/7/NTP_DBG_Pkt4_Send: FileId [ 084 ], LineNo [ 00169 ]- packet to 189.53.218.118 (port: 123) from 189.53.156.225 (port: 123) via Ethernet0/0/0 leap: 0, version: 3, mode: 3 stratum: 9, poll: 64, precision: 2^18 rdel: 8.820, rdsp: 38.818, refid: 189.53.218.118 reftime: 10:39:02.650 UTC Nov 21 2017(DDBE80C6.A66F3F52) orgtime: 10:39:02.654 UTC Nov 21 2017(DDBE80C6.A79E060F) rectime: 10:39:02.650 UTC Nov 21 2017(DDBE80C6.A66F3F52) xmttime: 10:41:12.737 UTC Nov 21 2017(DDBE8148.BCC2C5E3)
2nd: FileId [ 019 ], LineNo [ 01434 ]- Calculated Root Distance : 0.041 [ Peer Delay : 0.009, Peer Root delay : 0.000, Peer Root Dispersion : 0.011, Peer Dispersion : 0.001, Peer Variance : 0.000, Current time in seconds : 408313, Time elapsed from last samples received in seconds : 408157 ]

Nov 21 2017 10:42:15.760.2 f00347256_156.225 NTP/7/NTP_DBG_Pkt4_Send: FileId [ 084 ], LineNo [ 00169 ]- packet to 189.53.218.118 (port: 123) from 189.53.156.225 (port: 123) via Ethernet0/0/0 leap: 0, version: 3, mode: 3 stratum: 9, poll: 64, precision: 2^18 rdel: 6.927, rdsp: 39.764, refid: 189.53.156.82 reftime: 10:41:38.765 UTC Nov 21 2017(DDBE8162.C3E6D58C) orgtime: 10:39:02.654 UTC Nov 21 2017(DDBE80C6.A79E060F) rectime: 10:39:02.650 UTC Nov 21 2017(DDBE80C6.A66F3F52) xmttime: 10:42:15.772 UTC Nov 21 2017(DDBE8187.C5B63D3D)
3rd: FileId [ 019 ], LineNo [ 01434 ]- Calculated Root Distance : 0.042 [ Peer Delay : 0.009, Peer Root delay : 0.000, Peer Root Dispersion : 0.011, Peer Dispersion : 0.001, Peer Variance : 0.000, Current time in seconds : 408377, Time elapsed from last samples received in seconds : 408157 ]

Nov 21 2017 10:43:19.790.2 f00347256_156.225 NTP/7/NTP_DBG_Pkt4_Send: FileId [ 084 ], LineNo [ 00169 ]- packet to 189.53.218.118 (port: 123) from 189.53.156.225 (port: 123) via Ethernet0/0/0 leap: 0, version: 3, mode: 3 stratum: 9, poll: 64, precision: 2^18 rdel: 4.028, rdsp: 40.726, refid: 189.53.156.82 reftime: 10:42:42.768 UTC Nov 21 2017(DDBE81A2.C4C9B845) orgtime: 10:39:02.654 UTC Nov 21 2017(DDBE80C6.A79E060F) rectime: 10:39:02.650 UTC Nov 21 2017(DDBE80C6.A66F3F52) xmttime: 10:43:19.806 UTC Nov 21 2017(DDBE81C7.CE8826AA)
4th: FileId [ 019 ], LineNo [ 01434 ]- Calculated Root Distance : 0.043 [ Peer Delay : 0.009, Peer Root delay : 0.000, Peer Root Dispersion : 0.011, Peer Dispersion : 0.001, Peer Variance : 0.000, Current time in seconds : 408442, Time elapsed from last samples received in seconds : 408157 ]

Nov 21 2017 10:44:25.790.2 f00347256_156.225 NTP/7/NTP_DBG_Pkt4_Send: FileId [ 084 ], LineNo [ 00169 ]- packet to 189.53.218.118 (port: 123) from 189.53.156.225 (port: 123) via Ethernet0/0/0 leap: 0, version: 3, mode: 3 stratum: 9, poll: 64, precision: 2^18 rdel: 4.028, rdsp: 41.718, refid: 189.53.156.82 reftime: 10:43:47.801 UTC Nov 21 2017(DDBE81E3.CD310129) orgtime: 10:39:02.654 UTC Nov 21 2017(DDBE80C6.A79E060F) rectime: 10:39:02.650 UTC Nov 21 2017(DDBE80C6.A66F3F52) xmttime: 10:44:25.800 UTC Nov 21 2017(DDBE8209.CCE453D2)
5th: FileId [ 019 ], LineNo [ 01434 ]- Calculated Root Distance : 0.044 [ Peer Delay : 0.009, Peer Root delay : 0.000, Peer Root Dispersion : 0.011, Peer Dispersion : 0.001, Peer Variance : 0.000, Current time in seconds : 408505, Time elapsed from last samples received in seconds : 408157 ]

Nov 21 2017 10:45:30.790.3 f00347256_156.225 NTP/7/NTP_DBG_Pkt4_Send: FileId [ 084 ], LineNo [ 00169 ]- packet to 189.53.218.118 (port: 123) from 189.53.156.225 (port: 123) via Ethernet0/0/0 leap: 0, version: 3, mode: 3 stratum: 9, poll: 64, precision: 2^18 rdel: 6.134, rdsp: 42.694, refid: 189.53.156.82 reftime: 10:44:50.796 UTC Nov 21 2017(DDBE8222.CBCE52DE) orgtime: 10:39:02.654 UTC Nov 21 2017(DDBE80C6.A79E060F) rectime: 10:39:02.650 UTC Nov 21 2017(DDBE80C6.A66F3F52) xmttime: 10:45:30.801 UTC Nov 21 2017(DDBE824A.CD3CFF65)
6th: FileId [ 019 ], LineNo [ 01434 ]- Calculated Root Distance : 8.014 [ Peer Delay : 0.000, Peer Root delay : 0.000, Peer Root Dispersion : 0.011, Peer Dispersion : 8.003, Peer Variance : 0.000, Current time in seconds : 408545, Time elapsed from last samples received in seconds : 408545 ]

Distance > 1, so the NTP source is switched:
Nov 21 2017 10:45:30.790.5 f00347256_156.225 NTP/7/NTP_DBG_Selection: FileId [ 019 ], LineNo [ 00510 ]- Server 189.53.218.118 not considered for clock selection. (Reason: Synchronization distance greater than distance threshold. (Current: 8.01429, Allowable: 1.00048))
Root Cause
After the NTP clock has stabilized, the polling interval toward the servers is gradually increased to 1024 s. After the main server goes down, the client must still go through a certain number of polls before the synchronization distance calculated from the sampled data exceeds 1 s, and only then can it switch NTP servers. In the lab run above, the switch happened on the sixth poll. At a 1024 s polling interval, six polls take about 6 × 1024 s = 6144 s, or roughly 100 minutes.
The implementation is in line with the standard. When the master clock source fails suddenly after being stable for a long time, a switching time of about 100 minutes is expected. The device clock does not accumulate much error within those 100 minutes.
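To make the timing concrete, here is a back-of-the-envelope estimate in Python. It assumes, based only on the lab measurement where the switch happened on the sixth poll, that about six polls are needed before the distance exceeds the threshold. The function name is hypothetical, and the real number of polls may differ between implementations.

```python
def estimated_switch_minutes(poll_interval_s, polls_until_switch=6):
    """Rough time to switch clock source: polls needed x polling interval."""
    return poll_interval_s * polls_until_switch / 60


# Lab case: the debug logs show poll: 64.
print(estimated_switch_minutes(64))
# Stable production case: the polling interval has grown to 1024 s.
print(estimated_switch_minutes(1024))
```

With a 64 s interval this gives a few minutes, in line with the lab timestamps. With a 1024 s interval it gives a little over 100 minutes, in line with the reported behavior.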
Solution
This behavior is normal and works as designed.