Hi, guys.
Today, I will share common methods for collecting and analyzing Linux logs.
I hope this helps you in your work.
Log collection
Use the SmartKit tool to collect logs from a single server or from multiple servers in batches.
For details, see section "Log Collection" in the SmartKit User Guide.
https://support.huawei.com/enterprise/en/doc/EDOC1100068214/e666b74d

Common log analysis
1. OS message log analysis
The message logs are stored in system\var\log. Analyze the logs based on the fault occurrence time.
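
As a rough sketch of narrowing the message log to the fault occurrence time, the helper below keeps only the lines logged within one hour. It assumes traditional syslog timestamps such as "Mar  5 14:02:11" at the start of each line. The function name filter_hour and the file path in the example are illustrative and are not part of SmartKit.

```shell
# filter_hour FILE MONTH DAY HOUR
# Print the lines of FILE whose syslog timestamp falls within the given hour.
# Assumes each line starts with "Mon DD HH:MM:SS" (the day may be space- or zero-padded).
filter_hour() {
    grep -E "^$2 +0?$3 $4:" "$1"
}

# Example (hypothetical path and time): lines logged on Mar 5 between 14:00 and 14:59
# filter_hour system/var/log/messages Mar 5 14
```

If the logs use a different timestamp format (for example, ISO 8601), the pattern has to be adjusted accordingly.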

2. Mcelog log analysis
The mcelog log is stored in system\var\log. Analyze the log based on the fault occurrence time.
If "Corrected error" is displayed, the error is a correctable error (CE) that has already been corrected, and the hardware does not need to be replaced. For V3 servers, check the BIOS version to determine whether the ECC errors are reported only because an earlier BIOS version (earlier than 3.53) does not suppress them.
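
A minimal sketch of both checks, assuming the mcelog log is a plain-text file and the BIOS version is a dotted number such as 3.30. The function names count_corrected and version_lt, the file path, and the sample version are hypothetical; version_lt relies on the -V option of GNU sort.

```shell
# count_corrected FILE
# Print how many lines in FILE contain "Corrected error".
count_corrected() {
    grep -c "Corrected error" "$1"
}

# version_lt A B
# Succeed if dotted version A is strictly earlier than B (uses GNU sort -V).
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example (hypothetical file path and BIOS version):
# count_corrected system/var/log/mcelog
# version_lt 3.30 3.53 && echo "BIOS is earlier than 3.53"
```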
3. OS configuration and service information
The OS configuration and service information is stored in system\etc.

4. OS GRUB boot information
The GRUB information is stored in the system\boot directory.

5. Run the "dmidecode" command to query the mainboard information.
dmidecode
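
The full dmidecode output is long, so it can help to limit it to one DMI type and pull out a single field. The sketch below assumes the usual "Name: value" layout of dmidecode output; dmi_field is a hypothetical helper name, while -t baseboard and -s bios-version are standard dmidecode options.

```shell
# dmi_field NAME
# Read dmidecode output on stdin and print the value of each "NAME: value" line.
dmi_field() {
    sed -n "s/^[[:space:]]*$1:[[:space:]]*//p"
}

# Example (dmidecode normally requires root privileges):
# dmidecode -t baseboard | dmi_field "Product Name"
# dmidecode -s bios-version
```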
6. Run the following command to check the driver version:
lsmod
modinfo
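
One possible way to combine the two commands is to take the module names from lsmod and query each one with modinfo. This is only a sketch: module_names is a hypothetical helper, and not every module declares a version field, so some modules print nothing.

```shell
# module_names
# Read lsmod output on stdin and print the module names
# (first column, with the header line skipped).
module_names() {
    awk 'NR > 1 { print $1 }'
}

# Example: print the version string of every loaded module that declares one.
# lsmod | module_names | while read -r m; do
#     v=$(modinfo -F version "$m" 2>/dev/null)
#     [ -n "$v" ] && echo "$m $v"
# done
```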
7. Run the following commands to view NIC parameter settings and firmware versions:
Each subdirectory in the NIC directory corresponds to a NIC port.
ethtool
ifconfig
lspci
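
As an illustration of reading the driver and firmware versions, the sketch below extracts one field from the "name: value" output of ethtool -i. The helper name nic_info_field and the interface name eth0 are placeholders; replace eth0 with the actual port name.

```shell
# nic_info_field NAME
# Read "ethtool -i" output on stdin and print the value of the NAME field.
nic_info_field() {
    awk -F': ' -v key="$1" '$1 == key { print $2 }'
}

# Example (eth0 is a placeholder interface name):
# ethtool -i eth0 | nic_info_field firmware-version
# ethtool -i eth0 | nic_info_field driver
# ethtool eth0                 # link speed, duplex, and auto-negotiation settings
# lspci | grep -i ethernet     # list the Ethernet controllers on the PCI bus
```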
That is all. Thank you for reading.