
A large number of cameras went offline and users could not log in to the IVS client for about half an hour

Latest reply: Feb 13, 2020 07:53:24

Hello, guys!

Have a nice day!

I ran into a problem a while ago that has since been resolved, and I want to share it with you!

Symptom

A large number of cameras went offline and users could not log in to the IVS client for about half an hour.

Analysis

  1. The SMU log shows that the service restarted at 12:44:41, 12:56:43, and 13:52:02.

  2. The server detection script log shows that the server time hopped at 12:36:57: the system time changed from 12:32:06 to 12:36:57. This triggered the service restart protection mechanism, which restarted all service modules.

  3. The server NTP logs confirm that the time hopped at 12:36:57.

  4. The master server is a two-node VMU. To judge the service running status, the two-node software must collect the status of every service, and each service takes a different amount of time to stop and start, so a full restart of the two-node VMU takes about 20 minutes.

  5. The other MPUs use the VMU floating IP address as their NTP clock source. Once NTP synchronized the new time to each MPU, the MPU server time changed as well, so the MPU services restarted and the cameras went offline.

  6. The customer site uses a Windows server as the NTP clock source, and the customer adjusted the time on that NTP server at around 12:30, which caused the VCN server time to hop.

  7. After the two-node VMU services recovered, a large number of CU/eSDK users logged in. The MPU services had also just recovered and were reconnecting to the SMU. After each reconnection, the SMU reported all online user information to the SCU module of that MPU.

  8. Because there were so many online users and device groups, the report took a long time and exceeded the maximum thread detection time of 700s.
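
The time hop found in step 2 can be illustrated with a small sketch. This is not the VCN detection script, whose logic is not shown in this post. It only assumes that such a check compares how far the wall clock moved with how much time really elapsed on a monotonic clock. The function names, the 60-second threshold, the calendar date, and the monotonic readings are all hypothetical; only the two wall-clock times come from the log.

```python
from datetime import datetime

def clock_hop_seconds(wall_before, wall_after, mono_before, mono_after):
    """Return how far the wall clock moved beyond the real elapsed time."""
    wall_elapsed = (wall_after - wall_before).total_seconds()
    mono_elapsed = mono_after - mono_before
    return wall_elapsed - mono_elapsed

def is_time_hop(hop_seconds, threshold_seconds=60.0):
    """Flag a hop when the drift exceeds a (hypothetical) threshold."""
    return abs(hop_seconds) > threshold_seconds

# Wall-clock times from the detection script log; the date is a placeholder.
before = datetime(2020, 1, 1, 12, 32, 6)
after = datetime(2020, 1, 1, 12, 36, 57)

# Assume the two readings were taken 1 second apart in real time.
hop = clock_hop_seconds(before, after, mono_before=1000.0, mono_after=1001.0)
print(hop, is_time_hop(hop))
```

The wall clock moved forward by 291 seconds (12:32:06 to 12:36:57), so under these assumptions the check reports a hop.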


Root Cause

  1. The time hop on the NTP server changed the VCN server time, which made the VCN service modules restart automatically. As a result, the IPCs went offline and IVS client logins failed.

  2. While the services were recovering, the report of online user information took so long that it triggered a second restart of the SMU thread. The services recovered after all servers had finished time synchronization.
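
The second root cause is a timeout effect, which a minimal sketch can show. It assumes the thread detection works like a heartbeat watchdog, which this post does not confirm. The function name and the 800-second report duration are hypothetical; only the 700s limit (and the 1000s value used in the solution) come from this case.

```python
def is_thread_stuck(last_heartbeat_s, now_s, limit_s=700.0):
    """Treat a thread as stuck when it has been silent longer than the limit."""
    return (now_s - last_heartbeat_s) > limit_s

# Hypothetical: the report keeps the SMU thread busy for 800 seconds.
report_duration_s = 800.0

restart_with_700 = is_thread_stuck(0.0, report_duration_s, limit_s=700.0)
restart_with_1000 = is_thread_stuck(0.0, report_duration_s, limit_s=1000.0)
print(restart_with_700, restart_with_1000)
```

Under these assumptions, a long but healthy report is mistaken for a hung thread at the 700s limit, while a longer limit lets it finish.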


Solution

  1. Modify the NTP parameters of the two-node VMU to change the NTP time sync method to micro-synchronization, that is, adjust the clock by only 5 milliseconds in each synchronization.

  2. Extend the SMU thread timeout limit of the two-node VMU to 1000s by modifying the file /home/ivs_smu/config/service.xml.
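
To get a feel for what micro-synchronization means, here is a rough calculation, assuming each synchronization corrects exactly 5 milliseconds. The function name is hypothetical, and the interval between synchronizations is not stated in this post, so only the number of steps is computed, not the total time.

```python
def steps_to_absorb(offset_seconds, step_ms=5):
    """Number of fixed-size corrections needed to absorb a clock offset."""
    offset_ms = round(offset_seconds * 1000)
    return -(-offset_ms // step_ms)  # ceiling division on integers

# The hop in this case was 12:32:06 -> 12:36:57, that is, 291 seconds.
print(steps_to_absorb(291))  # 58200
```

So an offset of this size is spread over tens of thousands of small corrections instead of one jump, and the service restart protection mechanism is not triggered by a sudden time change.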


That is all, thanks for reading!



Thanks for sharing.