At one of our customers, we encountered the following problem. After deleting virtual machines (30 VMs of 50 GB each) on a VMware VMFS6 datastore, the ESXi space reclamation (unmap) traffic hitting the storage array (2 GB/s or more) triggered the array's flow control, and storage performance degraded. Even after setting the reclamation rate to low on the VMware side (which should correspond to about 25 MB/s), the actual unmap rate stayed around 800 MB/s.
Through analysis, we found the causes to be as follows:
1. By opening a case with VMware, it was confirmed to be a bug in VMware ESXi 6.7: in the ESXi 6.7 + VMFS6 scenario, the configured space reclamation rate is not honored for lazy-zeroed and thick-provisioned (lazy-zeroed) disks.
2. For thick VMDKs, the unmap rate set on the host side does not take effect, so reclamation runs at the maximum unmap rate. For thin VMDKs, the lowest unmap rate that can be set on the host side is 100 MB/s. (Note: on ESXi 6.5 the minimum reclamation rate is normally 20 MB/s.) Therefore, when a large number of VMs are deleted at once (for example, 30 VMs of 50 GB each), the aggregate unmap rate can exceed the array's unmap flow-control specification and trigger flow control.
If you hit this problem, contact VMware for a release containing the fix.
There is also a workaround. For thin-VMDK space reclamation, set the unmap rate on the host side to the minimum (100 MB/s), and reduce the number of virtual machines deleted concurrently, so that the total unmap traffic stays within the storage flow-control specification:
Number of ESXi hosts * Number of datastores * 100 MB/s < Number of array controllers * per-controller unmap specification
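On ESXi 6.7 the per-datastore reclamation settings can be inspected and pinned with esxcli. A sketch of the workaround's host-side step, assuming a datastore named DS01 (the datastore name is an example, not from the case):

```shell
# Show the current reclamation settings for the datastore (name is an example)
esxcli storage vmfs reclaim config get --volume-label=DS01

# Pin reclamation to a fixed bandwidth of 100 MB/s,
# the minimum rate available for thin VMDKs on ESXi 6.7
esxcli storage vmfs reclaim config set --volume-label=DS01 \
    --reclaim-method=fixed --reclaim-bandwidth=100
```

This only caps each host/datastore pair at 100 MB/s; the deletion concurrency still has to be limited so the sum across all hosts and datastores satisfies the inequality above.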
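The sizing rule above can be checked with a few lines of arithmetic. A minimal sketch; all the numbers in the example are illustrative assumptions, not measured values from this case:

```python
def unmap_within_spec(num_hosts, num_datastores, rate_mb_s,
                      num_controllers, per_controller_spec_mb_s):
    """Return True if worst-case concurrent unmap traffic stays below
    the array's total unmap flow-control budget."""
    # Worst case: every host reclaims on every datastore at once
    total = num_hosts * num_datastores * rate_mb_s
    budget = num_controllers * per_controller_spec_mb_s
    return total < budget

# Example: 4 hosts, 2 datastores each, reclaiming at the 100 MB/s minimum,
# against a 2-controller array rated at 500 MB/s unmap per controller
print(unmap_within_spec(4, 2, 100, 2, 500))  # 800 < 1000 -> True
```

If the check fails, either delete VMs on fewer datastores at a time or stagger the deletions until the worst-case total fits the budget.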



