Hello everyone!
This post walks you through troubleshooting VM startup failures step by step.
TROUBLESHOOTING PROCESS

1. Check whether the host status is normal.
a) Obtain the VM ID.
Log in to one of the controller nodes and run the command below to look up the VM ID by VM name. Alternatively, obtain the VM ID from the OM portal:
nova list |grep vm_name

b) Run the command below to obtain the request ID of the startup task, using the VM ID from the previous step or from the OM portal:
nova instance-action-list vm_id

c) Run the command below to obtain the host where the VM is located:
nova show vm_id |grep host

d) Check whether the host where the VM is located is running properly by running the cps host-list |grep host_id command. If the host is faulty, rectify the fault by referring to the alarms generated for the host.
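
Steps a) to c) above can be chained with two small shell helpers. This is only an illustrative sketch, not part of the official procedure: the function names vm_id_by_name and host_from_show are hypothetical, and the code assumes the usual pipe-delimited table printed by the nova client, with the host reported in the OS-EXT-SRV-ATTR:host row. Unlike a plain grep, the name match is exact, so web01 does not also match web01-old.

```shell
# Hypothetical helpers; they only parse text piped in on stdin.

# Print the ID of the VM whose Name column equals $1 exactly.
# Expects `nova list` table output (| ID | Name | Status | ...) on stdin.
vm_id_by_name() {
  awk -F'|' -v name="$1" '
    {
      id = $2; vm = $3
      gsub(/^[ \t]+|[ \t]+$/, "", id)
      gsub(/^[ \t]+|[ \t]+$/, "", vm)
      if (name != "" && vm == name) print id
    }'
}

# Print the value of the OS-EXT-SRV-ATTR:host row (assumed field name).
# Expects `nova show` table output (| Property | Value |) on stdin.
host_from_show() {
  awk -F'|' '
    {
      key = $2; val = $3
      gsub(/^[ \t]+|[ \t]+$/, "", key)
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      if (key == "OS-EXT-SRV-ATTR:host") print val
    }'
}

# Example use on a controller node (not run here):
#   vm_id=$(nova list | vm_id_by_name my_vm)
#   nova instance-action-list "$vm_id"
#   host=$(nova show "$vm_id" | host_from_show)
#   cps host-list | grep "$host"
```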

2. Run the command below to check whether the status of the host components is normal. If it is not, repair the faulty component. If it is, go to the next step:
cps host-template-instance-list host_id

3. Log in to the host and run the command below to filter the nova-compute logs by the request ID and locate error or failure messages:
zgrep req-id /var/log/fusionsphere/component/nova-compute/* |grep -iE 'error|fail'

If information is returned, open the nova-compute log in vi to view the error details. If no error is reported in the nova-compute log, the underlying UVP may have failed to start the VM. In this case, check the libvirt log as follows.
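
The filter in step 3 can be wrapped in a small function so that the request ID and log directory are easy to swap. This is a sketch under assumptions: find_req_errors is a hypothetical name, and the pattern treats any line containing "error" or "fail" (in any letter case) as relevant, which may be broader or narrower than what your logs need.

```shell
# Hypothetical helper: list log lines for one request ID that mention
# an error or a failure. zgrep reads both plain and gzip-compressed logs.
# Usage: find_req_errors REQ_ID [LOG_DIR]
find_req_errors() (
  req_id=$1
  # Default to the nova-compute log directory used in step 3.
  log_dir=${2:-/var/log/fusionsphere/component/nova-compute}
  zgrep "$req_id" "$log_dir"/* | grep -iE 'error|fail'
)

# Example use on the host (not run here):
#   find_req_errors req-id
```

The function body runs in a subshell, so req_id and log_dir do not leak into your login shell.
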

4. View the UVP logs of a VM.
a) Import the environment variables on the host, then run the nova show command with the VM ID to obtain the instance name:
nova show vm_id | grep instance

b) Run the command below to open the VM instance log in the libvirt directory and view the error information around the time of the failure:
vim /var/log/libvirt/qemu/instance-xxxxxxxx.log
Generally, if the instance log in the libvirt directory or the nova-compute log reports an error, you can search the forum for similar cases based on the error information.
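
The instance name returned in step a) maps directly to the log file opened in step b). As a rough sketch, assuming nova show prints an OS-EXT-SRV-ATTR:instance_name row in its usual pipe-delimited table, a hypothetical helper named libvirt_log_path could build the path for you:

```shell
# Hypothetical helper: read `nova show` output on stdin and print the
# libvirt log path for the instance it describes.
libvirt_log_path() {
  awk -F'|' '
    {
      key = $2; val = $3
      gsub(/^[ \t]+|[ \t]+$/, "", key)
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      if (key == "OS-EXT-SRV-ATTR:instance_name")
        print "/var/log/libvirt/qemu/" val ".log"
    }'
}

# Example use on the host (not run here):
#   vim "$(nova show vm_id | libvirt_log_path)"
```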

5. You can also manually trigger the HA function to start the VM.
a) If the HA function is enabled, run the nova reset-state vm_id command to set the VM to the error state. HA is then triggered automatically.

b) For a VM that is started from a volume, run python /etc/nova/nova-util/reschedule_vm.py vm_id to manually trigger VM HA. The VM is then started on another host.

c) For a VM that uses an image or local disk, run the nova rebuild vm_uuid image_id command to rebuild the VM. Exercise caution when running this command!
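
Because step c) in particular calls for caution, it can help to print the recovery command and review it before running anything. The sketch below is a hypothetical dry-run helper (recovery_command and its boot-type keywords volume and image are made up for illustration). It only echoes the command from step b) or c) and never executes it.

```shell
# Hypothetical dry-run helper: print the recovery command for a VM
# instead of running it, so it can be reviewed first.
# Usage: recovery_command volume VM_ID
#        recovery_command image  VM_ID IMAGE_ID
recovery_command() {
  case "$1" in
    volume)
      # VM started from a volume: reschedule to another host (step b).
      echo "python /etc/nova/nova-util/reschedule_vm.py $2" ;;
    image)
      # VM using an image or local disk: rebuild (step c).
      if [ -z "$3" ]; then
        echo "image ID required for rebuild" >&2
        return 1
      fi
      echo "nova rebuild $2 $3" ;;
    *)
      echo "unknown boot type: $1" >&2
      return 1 ;;
  esac
}

# Example use (prints the command only):
#   recovery_command image vm_uuid image_id
```
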
Hope this helps!



