Symptom:
At a site, more than 400 HIMA sockets are in the CLOSE_WAIT state on the NAS.
tcp 1 0 127.0.0.1:38224 127.0.0.1:5005 CLOSE_WAIT 11665/hima_daemon
tcp 53 0 127.0.0.1:5005 127.0.0.1:51240 CLOSE_WAIT -
tcp 1 0 127.0.0.1:42717 127.0.0.1:5005 CLOSE_WAIT 11665/hima_daemon
tcp 1 0 127.0.0.1:40706 127.0.0.1:5005 CLOSE_WAIT 11665/hima_daemon
tcp 53 0 127.0.0.1:5005 127.0.0.1:51375 CLOSE_WAIT -
tcp 1 0 127.0.0.1:56668 127.0.0.1:5005 CLOSE_WAIT 11665/hima_daemon
tcp 53 0 127.0.0.1:5005 127.0.0.1:56878 CLOSE_WAIT -
tcp 1 0 127.0.0.1:56678 127.0.0.1:5005 CLOSE_WAIT 11665/hima_daemon
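The residual sockets have two consequences:
1. A large number of residual file handles may cause the number of file handles opened by a process to exceed the upper limit set by the OS. (Some processes are limited to 1024 file handles.)
2. Once thousands of file handles are open, the select function behaves abnormally, because select can only monitor descriptors numbered below FD_SETSIZE (typically 1024).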
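To quantify the symptom, output in the format shown above (for example from `netstat -antp`) can be tallied per owning process. The following is a minimal sketch in Python; the function name `count_close_wait` is hypothetical, and it assumes the seven-column layout seen in the sample lines (protocol, Recv-Q, Send-Q, local address, foreign address, state, PID/program).

```python
from collections import Counter


def count_close_wait(netstat_lines):
    """Count CLOSE_WAIT sockets per owner (the PID/program column)."""
    counts = Counter()
    for line in netstat_lines:
        fields = line.split()
        # Assumed columns: proto, Recv-Q, Send-Q, local, foreign, state, owner
        if len(fields) < 6 or fields[5] != "CLOSE_WAIT":
            continue
        # netstat prints "-" when no owning process is reported
        owner = fields[6] if len(fields) > 6 else "-"
        counts[owner] += 1
    return counts
```

Feeding the full netstat output to this function shows at a glance which process holds most of the stale sockets and how many are reported without an owner ("-").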
Analysis:
A review of the TCP state transitions shows that the operating system does not always reclaim a socket when a process is killed. This happens, for example, when the process forked a subprocess after the socket was established. (The child process inherits the file handles of the parent process, including any sockets opened by the parent.) Because the subprocess still holds the socket, the socket stays open after the parent dies. When the peer end then closes the connection, its FIN arrives on a socket that only the subprocess holds. Generally, the subprocess does not handle the close of this inherited socket and never closes it. As a result, the socket remains in the CLOSE_WAIT state.
In Hima, a socket connection is established between A and B, and C is a subprocess of A. After A and B are killed, C still holds the inherited socket, so the socket status changes to CLOSE_WAIT and stays there.
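The analysis notes that the subprocess generally does not handle the peer's close. As an illustration only, here is a minimal sketch of how a process holding a socket could detect that the peer has sent FIN and then close its own end, which is what moves the socket out of CLOSE_WAIT. It is written in Python because the Hima implementation language is not stated; the function names are hypothetical, and `MSG_DONTWAIT` assumes Linux or another Unix-like OS.

```python
import socket


def peer_has_closed(sock):
    """Return True if the peer has sent FIN and no unread data remains."""
    try:
        # Peek without consuming data and without blocking.
        data = sock.recv(1, socket.MSG_PEEK | socket.MSG_DONTWAIT)
    except BlockingIOError:
        return False  # connection still open, nothing to read yet
    # An empty read means end-of-stream: the peer closed its side.
    return data == b""


def close_if_peer_closed(sock):
    """Close our end once the peer is gone; return True if we closed it."""
    if peer_has_closed(sock):
        sock.close()
        return True
    return False
```

A process that never performs a check like this, and never otherwise closes the socket, keeps the socket in CLOSE_WAIT for as long as the process lives.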
Solution:
Modify the Hima processes to ensure that no residual socket exists, for example by having each subprocess close the sockets it inherits from its parent.
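One way to avoid residual sockets is for the child to close every inherited socket immediately after the fork. The sketch below illustrates the idea in Python; it is not the actual Hima change, and the helper name `spawn_child_closing` and its parameters are hypothetical. It assumes a Unix-like OS where `os.fork` is available.

```python
import os


def spawn_child_closing(inherited_sockets, work):
    """Fork a child that closes inherited sockets before running work()."""
    pid = os.fork()
    if pid == 0:
        # Child: drop its copies of the parent's sockets so that it
        # cannot keep them open after the parent exits or is killed.
        try:
            for sock in inherited_sockets:
                sock.close()
            work()
        finally:
            os._exit(0)  # never fall back into the parent's code path
    # Parent: its own copies of the sockets are unaffected.
    return pid
```

With this pattern, the connection is torn down as soon as the parent closes its socket or is killed, because no other process holds a copy of the descriptor. For children that call exec, marking descriptors close-on-exec achieves a similar result; Python sockets are already non-inheritable across exec by default, but a plain fork still copies them.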