[Description]: Analysis of the libvirt logs shows that the VM restarts after a crash occurs inside the VM. To analyze the crash further, you need to collect the dump logs of the VM.
[Fault Mode]: The VM restarts after a crash occurs inside the VM, so the dump logs of the VM must be collected for further analysis.
[Applicable Version]: FusionCompute versions
[Procedure]:
1. To prevent a Linux VM from restarting automatically after it crashes, change the crash policy of the VM from the default restart mode to the preserve mode, as described in the following steps.
If the VM crashes again after this change, the VM is suspended instead of restarted, and you can collect the VM dump logs.
2. Run the following command to modify the XML configuration of the VM online:
sed -i "1,$ s/on_crash>restart/on_crash>preserve/g" /var/run/libvirt/libxl/i-0000000x.xml
i-0000000x indicates the ID of the target VM, which can be queried on the FusionCompute web client.
3. After you have modified the configuration of every target VM on the host, run the following command to restart the libvirt service so that the modification takes effect:
service libvirtd restart
If the VM is stopped and then started, the configuration reverts to the default, that is, the VM restarts immediately if it crashes. However, the modified configuration remains in effect if the VM is restarted or live migrated.
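Before editing a live configuration file, it can help to preview what the substitution in step 2 would produce. The following is a minimal sketch, not part of the official procedure: the helper name preview_on_crash and the sample XML fragment are hypothetical and used only for illustration. It prints the lines that the sed expression would change, without modifying the file.

```shell
# Hypothetical helper: print the lines the substitution would change,
# without writing anything back to the file (-n plus the p flag, no -i).
preview_on_crash() {
    sed -n 's/on_crash>restart/on_crash>preserve/gp' "$1"
}

# Sample XML fragment for illustration only (not a real VM configuration).
sample=$(mktemp)
printf '%s\n' \
    '<domain>' \
    '  <on_poweroff>destroy</on_poweroff>' \
    '  <on_crash>restart</on_crash>' \
    '</domain>' > "$sample"

preview_result=$(preview_on_crash "$sample")
echo "$preview_result"

# Remove the temporary sample file.
rm -f "$sample"
```

On a host, the same helper could be pointed at the VM configuration file named in step 2. If it prints nothing, the file contains no on_crash entry set to restart.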
Note: If you perform the operations on the live network, check that the host is in a normal state before restarting the libvirt service. The specific checks are as follows:
Check whether processes in the D state exist on the host. If any process is in the D state, contact Huawei technical support.
ps aux | grep -w D | grep -v grep
Note: If a process in the D state exists, the corresponding process information is displayed. If no command output is displayed, no process in the D state exists on the host.
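The grep-based check above matches a standalone D anywhere in the line, so it may also match command arguments and can miss state values such as Ds. As a hedged alternative, the sketch below filters on the STAT column only. The helper name filter_d_state and the sample process table are hypothetical; on a real host the input would come from ps -eo pid,stat,comm.

```shell
# Hypothetical helper: read "PID STAT COMMAND" rows on stdin and keep only
# the rows whose STAT column (field 2) starts with D. The header is skipped.
filter_d_state() {
    awk 'NR > 1 && $2 ~ /^D/'
}

# Sample process table for illustration only.
sample_ps='  PID STAT COMMAND
    1 Ss   systemd
  812 D    kworker/u8:2
  950 D+   dd
 1200 R+   ps'

d_rows=$(printf '%s\n' "$sample_ps" | filter_d_state)
echo "$d_rows"

# On a host, the live equivalent would be:
#   ps -eo pid,stat,comm | filter_d_state
```

As with the original command, empty output means that no process in the D state was found.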
If the version of FusionCompute is V100R600C00* and the memory overcommitment policy is disabled, check whether the currentMemory and memory configurations of the VM on the host are consistent before restarting the libvirt service. If any VM is configured with inconsistent values, contact Huawei technical support.
ids=`virsh list | grep 000 | awk -F" " '{print $1}' `; for id in $ids; do a=`virsh dumpxml $id | grep -w currentMemory | awk -F">" '{print $2}' | awk -F"<" '{print $1}' `; b=`virsh dumpxml $id | grep -w memory | awk -F">" '{print $2}' | awk -F"<" '{print $1}' `; if [ "$a" != "$b" ]; then virsh list | grep -w $id; fi; done
Note: If there are VMs with inconsistent configurations, the command output will display the VM IDs. If no command output is displayed, no VM with inconsistent memory configurations exists.
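The one-line memory consistency check above is dense, so the comparison logic is sketched here in a more readable form. This is an illustration under stated assumptions, not a replacement for the official command: the helper name check_memory_xml and the sample XML fragments are hypothetical, and the helper reads domain XML from standard input rather than calling virsh itself.

```shell
# Hypothetical helper: read domain XML on stdin and report whether the
# <memory> and <currentMemory> values agree.
check_memory_xml() {
    xml=$(cat)
    # Extract the numeric content of the first <memory ...> element.
    mem=$(printf '%s\n' "$xml" | sed -n 's/.*<memory[^>]*>\([0-9][0-9]*\)<.*/\1/p' | head -n 1)
    # Extract the numeric content of the first <currentMemory ...> element.
    cur=$(printf '%s\n' "$xml" | sed -n 's/.*<currentMemory[^>]*>\([0-9][0-9]*\)<.*/\1/p' | head -n 1)
    if [ -z "$mem" ] || [ -z "$cur" ]; then
        echo "unknown"
    elif [ "$mem" != "$cur" ]; then
        echo "mismatch"
    else
        echo "consistent"
    fi
}

# Sample fragments for illustration only.
xml_mismatch="<domain>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>2097152</currentMemory>
</domain>"
xml_ok="<domain>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
</domain>"

r_mismatch=$(printf '%s\n' "$xml_mismatch" | check_memory_xml)
r_ok=$(printf '%s\n' "$xml_ok" | check_memory_xml)
echo "$r_mismatch"
echo "$r_ok"
```

On a host, the input could be supplied with virsh dumpxml for each VM ID. A result of mismatch corresponds to the case in which the original command prints the VM.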
After automatic restart on crash has been disabled for the Linux VM, perform the following operations to collect dump information:
5. Log in to the background of the host where the VM is located as the root user and run the df -h command to check the available space. In this example, the command output shows that the available space in /POME/datastore_3 is 160 GB.

Log in to the background of the host where the VM is located as the root user. Run the xl list command to check the memory of the VM and determine the ID of the VM to be dumped. The following uses the VM whose ID is 5 as an example.

Assume that the name of the faulty VM is i-00000069 and the VM ID is 5. The command is shown in the following figure.

Send the generated memory dump log to R&D engineers for analysis. Then, switch to the /POME/datastore_3 directory and delete the generated test.dump file to release space.
After the dump file of the VM has been collected and copied, restart the VM to restore it.