Hello, everyone!
This post explains how to identify the processes that are causing high CPU usage on a node.
Symptom
The CPU usage of a node is high or even exceeds the threshold.
CPU Usage Remains High
Run the top command on the involved node and press P (uppercase, that is, Shift+P) to sort the processes by CPU usage.

Run ps -ef | grep <PID of the process with high CPU usage>.
Check detailed information about the process and query its log. Check whether the high CPU usage is normal.
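One caveat with ps -ef | grep <PID> is that it also matches any other line that happens to contain the same digits. A minimal sketch of a more targeted query follows, assuming the procps version of ps that ships with most Linux distributions. The function name show_proc is hypothetical and only used for illustration.

```shell
#!/usr/bin/env bash
# show_proc is a hypothetical helper name; pass it the PID found with top.
show_proc() {
    local pid=$1
    # -p selects exactly one process; -o chooses which columns to print
    ps -p "$pid" -o pid,ppid,user,%cpu,%mem,etime,args
}

# Example: inspect the current shell
show_proc "$$"
```

If the process is multi-threaded, top -H -p <PID> lists its threads individually, which can help narrow down which thread is busy.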
Run the following command to query the top 10 processes that occupy the most CPU resources:
ps aux|head -1;ps aux|grep -v PID|sort -rn -k +3|head
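The same list can probably be produced with a single ps call on systems that use procps, because that version of ps can sort its own output. The sketch below assumes the --sort option is available (it is not on every ps implementation), and top_cpu is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# top_cpu is a hypothetical helper; the argument is the number of processes to show.
top_cpu() {
    local n=${1:-10}
    # --sort=-%cpu sorts by CPU usage in descending order (procps ps only).
    # The extra line kept by head is the column header.
    ps aux --sort=-%cpu | head -n "$((n + 1))"
}

top_cpu 10
```

Note that the %CPU value reported by ps is the CPU time the process has used divided by the time it has been running, so a long-lived process that only recently became busy may rank lower here than it does in top.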

CPU Usage Is High Occasionally
Query the /var/log/osinfo/statistics/ps.txt file. It records the results of the ps command that is executed every minute.
The file records only the basic information of the processes.

Run the following command to query the top 10 processes that occupy the most CPU resources:
ps aux|head -1;ps aux|grep -v PID|sort -rn -k +3|head
Create the /opt/checkcpu.sh file on the involved node with the following content. logFile is the path of the log file and delayTime is the interval between executions, in seconds.
#!/usr/bin/env bash
# Append a timestamp and the top CPU-consuming processes to logFile every delayTime seconds.
logFile=/var/log/Bigdata/checkCpuUsage.log
delayTime=30 # seconds between each execution; the default value is 30 seconds
while true
do
    date >> "$logFile"
    # Group both ps calls so that the header line is also written to the log
    { ps aux|head -1;ps aux|grep -v PID|sort -rn -k +3|head; } >> "$logFile"
    sleep "$delayTime"
    echo " " >> "$logFile"
done
Make the script executable and run it in the background:
chmod 700 /opt/checkcpu.sh
nohup /opt/checkcpu.sh > /dev/null 2>/dev/null &
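As an optional variation, the PID of the background process can be recorded at launch so that it does not have to be searched for later. The sketch below is only one way to do this. The PID file path /tmp/checkcpu.pid and the function names start_checker and stop_checker are assumptions made for illustration and are not part of the original procedure.

```shell
#!/usr/bin/env bash
# Hypothetical location for the PID file; any writable path works.
pidFile=/tmp/checkcpu.pid

# Start the given command in the background and remember its PID.
start_checker() {
    nohup "$@" > /dev/null 2>&1 &
    echo "$!" > "$pidFile"
}

# Stop the process whose PID was recorded by start_checker.
stop_checker() {
    kill "$(cat "$pidFile")" && rm -f "$pidFile"
}
```

With these helpers, start_checker /opt/checkcpu.sh launches the script and stop_checker ends it once enough data has been collected.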
Query the /var/log/Bigdata/checkCpuUsage.log file and check for information about high CPU usage.
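Once the log has grown, scanning it by eye becomes tedious. The following is a rough sketch of a filter that prints only the entries at or above a CPU threshold, together with the timestamp of the sample they belong to. It assumes the log layout produced by the script above (a date line, a header line starting with USER, then ps aux rows), and the function name high_cpu_entries and the default threshold of 50 are hypothetical.

```shell
#!/usr/bin/env bash
# high_cpu_entries LOGFILE [THRESHOLD]
# Hypothetical helper: prints "timestamp | pid=... cpu=... cmd=..." for busy processes.
high_cpu_entries() {
    local log=$1 threshold=${2:-50}
    awk -v t="$threshold" '
        NF == 0 || $1 == "USER" { next }            # skip separators and header lines
        $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9.]+$/ {       # a ps aux row: PID in $2, %CPU in $3
            if ($3 + 0 >= t)
                print stamp " | pid=" $2 " cpu=" $3 " cmd=" $11
            next
        }
        { stamp = $0 }                              # anything else is treated as the date line
    ' "$log"
}
```

For example, high_cpu_entries /var/log/Bigdata/checkCpuUsage.log 80 would list every sampled process that reached 80% CPU or more.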
If a Java process is consuming a large share of the CPU, a common cause is insufficient memory: garbage collection is triggered frequently, which drives up CPU usage.
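One way to check whether garbage collection is the likely cause is to sample the GC counters of the suspect process with jstat, a tool that ships with the JDK. The sketch below assumes jstat is on the PATH and is run as the same user as the Java process. The function name check_gc and the sampling values (every 1000 ms, 5 samples) are illustrative choices only.

```shell
#!/usr/bin/env bash
# check_gc PID
# Hypothetical helper: prints five GC utilization samples, one per second.
check_gc() {
    local pid=$1
    if ! command -v jstat > /dev/null 2>&1; then
        echo "jstat not found; it is part of the JDK" >&2
        return 127
    fi
    # -gcutil prints heap usage percentages plus GC counts and accumulated GC times
    jstat -gcutil "$pid" 1000 5
}
```

If the FGC (full GC count) and FGCT (full GC time) columns climb quickly from one sample to the next while the old generation (O) stays close to 100, frequent full GC is a plausible explanation for the high CPU usage.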
Stop the check script process after you have collected enough data.
On the node, run ps -ef | grep checkcpu.sh | grep -v grep. Note the PID in the second column of the output and run kill <PID> to stop the process.
We warmly welcome you to enjoy our community!


