
How Do I Collect the Dump Information About a VM Crash?


[Description]: Analysis of the libvirt logs shows that the VM restarted after a crash occurred inside the VM. To analyze the crash further, you need to collect the dump logs of the VM.

[Fault Mode]: The VM restarts after a crash occurs inside the VM.

[Applicable Version]: FusionCompute versions

[Procedure]:


1. To stop a Linux VM from restarting after it crashes, change the crash policy of the VM from the default restart mode to the preserve mode, as described in the following steps.

If the VM crashes again, the VM is suspended instead of restarted, and you can collect the VM dump logs.

2. Run the following command to modify the XML configuration of the VM online:

sed -i '1,$ s/on_crash>restart/on_crash>preserve/g' /var/run/libvirt/libxl/i-0000000x.xml

i-0000000x indicates the ID of the target VM, which can be queried on the FusionCompute web client.
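Because the sed command edits the file in place, it can help to keep a copy of the original XML and to confirm the result afterwards. The following is a minimal sketch of that idea, not part of the FusionCompute procedure: the function name set_crash_preserve and the backup location (/tmp by default) are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: switch on_crash from restart to preserve in one libvirt XML file.
# set_crash_preserve and the backup directory are hypothetical helpers,
# not FusionCompute tools.
set_crash_preserve() {
    xml="$1"
    backup_dir="${2:-/tmp}"
    if [ ! -f "$xml" ]; then
        echo "not found: $xml" >&2
        return 1
    fi
    # Keep a copy of the original file before editing it in place.
    cp -p "$xml" "$backup_dir/$(basename "$xml").bak" || return 1
    sed -i 's/on_crash>restart/on_crash>preserve/g' "$xml" || return 1
    # Print the policy line so that the change can be confirmed by eye.
    grep 'on_crash' "$xml"
}

# Example (replace i-0000000x with the ID of the target VM):
# set_crash_preserve /var/run/libvirt/libxl/i-0000000x.xml
```

If the printed line still contains restart, the file did not hold the expected on_crash element and should be inspected manually.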

3. After you have modified the configuration of every VM on the host that requires the change, run the following command to restart the libvirt service so that the modification takes effect:

service libvirtd restart

If the VM is stopped and then started, the configuration reverts to the default, that is, the VM restarts immediately after it crashes. The modified configuration remains in effect if the VM is restarted or live migrated.

Note: If you perform these operations on the live network, check whether the host is abnormal before restarting the libvirt service. The specific checks are as follows:

  1. Check whether processes in the D state exist on the host. If any process is in the D state, contact Huawei technical support.

ps aux | grep -w D | grep -v grep

Note: If a process in the D state exists, the corresponding process information is displayed. If no command output is displayed, no process in the D state exists on the host.
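The grep -w D filter above can also match command lines that contain a standalone D, for example a daemon started with a -D option. As a hedged alternative, the sketch below filters on the process state column only. The function names filter_d_state and list_d_state are illustrative, and the ps options assume the procps version of ps that is common on Linux hosts.

```shell
#!/bin/sh
# Sketch: report only processes whose state column starts with D
# (uninterruptible sleep). Function names are hypothetical.

# Reads "PID STAT COMMAND..." lines on stdin and keeps the D-state ones.
filter_d_state() {
    awk '$2 ~ /^D/'
}

# Lists D-state processes on the current host; empty headers suppress
# the header line, so every output line is a process.
list_d_state() {
    ps -e -o pid= -o stat= -o args= | filter_d_state
}

# Example:
# list_d_state
```

As with the original command, no output means that no process in the D state exists on the host.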

  2. If the FusionCompute version is V100R600C00* and the memory overcommitment policy is disabled, check whether the currentMemory and memory values of each VM on the host are consistent before restarting the libvirt service. If any VM has inconsistent values, contact Huawei technical support.

ids=`virsh list | grep 000 | awk -F" " '{print $1}'`; for id in $ids; do a=`virsh dumpxml $id | grep -w currentMemory | awk -F">" '{print $2}' | awk -F"<" '{print $1}'`; b=`virsh dumpxml $id | grep -w memory | awk -F">" '{print $2}' | awk -F"<" '{print $1}'`; if [ "$a" != "$b" ]; then virsh list | grep -w $id; fi; done

Note: If there are VMs with inconsistent configurations, the command output will display the VM IDs. If no command output is displayed, no VM with inconsistent memory configurations exists.
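The one-line command above is hard to read and to adapt. The following sketch performs the same comparison in function form, assuming the domain XML contains one memory element and one currentMemory element, each on its own line. The function names xml_mem_pair and check_vm_memory are illustrative and are not provided by libvirt.

```shell
#!/bin/sh
# Sketch: compare <memory> and <currentMemory> for each running VM.
# Both function names are hypothetical helpers.

# Reads domain XML on stdin and prints "<memory value> <currentMemory value>".
xml_mem_pair() {
    awk -F'[<>]' '
        $2 ~ /^memory( |$)/        { mem = $3 }
        $2 ~ /^currentMemory( |$)/ { cur = $3 }
        END { print mem, cur }
    '
}

# Prints the virsh list line of every VM whose two values differ.
# As in the original command, VMs are selected by "000" in their name.
check_vm_memory() {
    for id in $(virsh list | awk '/000/ {print $1}'); do
        pair=$(virsh dumpxml "$id" | xml_mem_pair)
        mem=${pair% *}
        cur=${pair#* }
        if [ "$mem" != "$cur" ]; then
            virsh list | grep -w "$id"
        fi
    done
}

# Example:
# check_vm_memory
```

Calling virsh dumpxml once per VM, instead of twice as in the one-liner, also ensures that both values come from the same snapshot of the configuration.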

After automatic restart after a crash has been disabled for the Linux VM, perform the following operations to collect the dump information:

5. Log in to the host where the VM is located as the root user and run the df -h command to check the available space. In this example, the command output shows that the available space in /POME/datastore_3 is 160 GB.

[Figure: output of the df -h command]


6. On the same host, as the root user, run the xl list command to check the memory of the VM and to determine the ID of the VM to be dumped. The following uses the VM whose ID is 5 as an example.

[Figure: output of the xl list command]
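The two checks above can be combined into a quick comparison before dumping. The sketch below assumes that the dump file is roughly as large as the memory of the VM, that xl list prints the columns Name, ID and Mem (in MiB) in that order, and that df supports the POSIX -P and -k options. All three function names are illustrative.

```shell
#!/bin/sh
# Sketch: check that the target directory can hold a dump of the VM memory.
# avail_kb, vm_mem_mb and enough_space are hypothetical helpers.

# Prints the available space, in KiB, of the file system that holds DIR.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 {print $4}'
}

# Reads `xl list` output on stdin and prints the Mem column (MiB)
# of the domain whose ID is given as the first argument.
vm_mem_mb() {
    awk -v id="$1" 'NR > 1 && $2 == id {print $3}'
}

# Succeeds if AVAIL_KB is larger than MEM_MB converted to KiB.
enough_space() {
    [ "$1" -gt $(( $2 * 1024 )) ]
}

# Example for the VM whose ID is 5:
# if enough_space "$(avail_kb /POME/datastore_3)" "$(xl list | vm_mem_mb 5)"; then
#     echo "enough space for the dump"
# fi
```

A margin above the bare memory size is advisable, because the exact size of the dump file is not stated in this procedure.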


7. Dump the memory of the VM to a file. Assume that the name of the faulty VM is i-00000069 and the VM ID is 5. The command is shown in the following figure.

[Figure: the dump command run for the VM whose ID is 5]
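The screenshot of the command is not reproduced in this text. As a hedged reconstruction, the sketch below assumes that the dump is taken with the dump-core subcommand of the xl toolstack, which matches the xl list command and the test.dump file name used in this procedure. The function name dump_vm and the XL override variable are illustrative and are not taken from the original post.

```shell
#!/bin/sh
# Sketch: dump the memory of a VM with `xl dump-core`, assuming this is the
# command shown in the missing figure. dump_vm and XL are hypothetical.
dump_vm() {
    domid="$1"
    out="$2"
    # Accept only a numeric domain ID such as 5.
    case "$domid" in
        ''|*[!0-9]*) echo "bad domain ID: $domid" >&2; return 1 ;;
    esac
    # Do not overwrite an earlier dump by accident.
    if [ -e "$out" ]; then
        echo "refusing to overwrite $out" >&2
        return 1
    fi
    # XL can be set to echo for a dry run that only prints the arguments.
    ${XL:-xl} dump-core "$domid" "$out"
}

# Example for the VM whose ID is 5:
# dump_vm 5 /POME/datastore_3/test.dump
```

Running the function with XL=echo first prints the arguments that would be passed to xl without touching the VM.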


8. Send the generated memory dump file to R&D engineers for analysis. Then, switch to the /POME/datastore_3 directory and delete the generated test.dump file to release space.

9. After the dump file of the VM has been collected and copied, restart the VM to restore it.




