[Problem Information]
Table 1 Basic information about the problem
Problem Name: Slow Read/Write Speed Caused by a Full Host Queue
Storage Type: Converged storage
Product Version: All versions
Fault Type: Slow response
Keywords: Slow response, Linux operating system
[Symptom]
Scenario 1: Running the iostat -x 2 command to check host I/Os shows that the utilization of some block devices (the %util column) reaches 100% while the disk read/write speed is extremely low.
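
The symptom in scenario 1 can be spotted without reading the columns by eye. The following is a minimal sketch, assuming iostat output whose device table starts with a header line beginning with "Device" and containing a %util column. The function name busy_devices and the default limit of 95 are illustrative choices, not part of the original case.

```shell
# Print devices whose %util meets or exceeds a limit (default 95).
# Reads `iostat -x` output on stdin. The %util column is located by its
# header name, so its position in the table does not matter.
busy_devices() {
  awk -v limit="${1:-95}" '
    NF == 0 { col = 0; next }            # a blank line ends the device table
    $1 ~ /^Device/ {                     # header row: find the %util column
      col = 0
      for (i = 1; i <= NF; i++) if ($i == "%util") col = i
      next
    }
    col && NF >= col && ($col + 0) >= (limit + 0) { print $1, $col }
  '
}

# Example (three reports at 2-second intervals):
#   iostat -x 2 3 | busy_devices 95
```

The first report of iostat covers the time since boot, so devices that appear only in the later reports reflect the current load.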

Scenario 2:
On an iSCSI network, writing data with the dd command yields a bandwidth of only about 8 Mbit/s, which is extremely low. Read performance is normal, and the read/write rate of the local hard disk is normal.
When multiple dd processes write concurrently, each process reaches only about 3 MB/s, which is still slow.

[Troubleshooting Roadmap]
Scenario 1:
1. Collect storage performance data or remotely view performance data. Check whether the average latency of storage LUNs and controllers is normal.
2. On the host, the average service time of the block devices is short (the svctm column of iostat) while the average wait time is long (the await column), which indicates that I/Os are waiting in the host queue rather than being served slowly by the storage. Run the lsscsi -l command several times and check whether queue_depth is greater than 16. This rules out I/O queuing caused by a low queue depth on the host HBA.

3. Run the top command to check process usage on the boards. Some processes (such as ds_agent) occupy far more memory than other processes. The customer confirms that the tool was deployed by the customer; its specific function is unknown. After the customer kills the process on all boards, services are restored.
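
The checks in step 2 of scenario 1 can be sketched as two small helpers. This is an illustration under stated assumptions: the function names queued_devices and queue_depths are hypothetical, the ratio of 10 is an arbitrary example threshold, and the first helper only works with sysstat versions that still print the await and svctm columns (newer releases may omit svctm). The second helper reads the queue_depth attribute from sysfs instead of parsing lsscsi -l output.

```shell
# Flag devices whose wait time is much larger than their service time.
# Reads `iostat -x` output on stdin. If the await or svctm column is
# missing from the header, nothing is printed.
queued_devices() {
  awk -v ratio="${1:-10}" '
    NF == 0 { a = 0; s = 0; next }       # a blank line ends the device table
    $1 ~ /^Device/ {                     # header row: locate both columns
      a = 0; s = 0
      for (i = 1; i <= NF; i++) {
        if ($i == "await") a = i
        if ($i == "svctm") s = i
      }
      next
    }
    a && s && NF >= a && NF >= s && ($s + 0) > 0 && ($a + 0) >= ratio * ($s + 0) {
      print $1, "await=" $a, "svctm=" $s
    }
  '
}

# Print the configured queue depth of each SCSI block device.
# The optional argument replaces /sys and exists only for testing.
queue_depths() {
  sysroot="${1:-/sys}"
  for f in "$sysroot"/block/*/device/queue_depth; do
    [ -r "$f" ] || continue              # skip devices without the attribute
    dev=${f#"$sysroot"/block/}
    dev=${dev%%/*}
    printf '%s %s\n' "$dev" "$(cat "$f")"
  done
}

# Example:
#   iostat -x 2 3 | queued_devices 10
#   queue_depths
```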
Scenario 2:
1. The checks from scenario 1 reveal no exception, and the storage response latency is normal. The current network is iSCSI, with CE switches between the host and the storage.
2. Check the CPU usage. A large share of CPU time is consumed by software interrupts.

3. Check whether the network adapter driver is the one bundled with the operating system. If CentOS is installed, the network adapter driver is provided by CentOS, and this bundled driver is incompatible.

4. After the driver is installed, you need to manually run rmmod and modprobe for the driver to take effect. Run the following command to check whether the driver version can be queried:

5. The read and write tests are normal.
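
The software interrupt check in step 2 of scenario 2 can be quantified from /proc/stat. The sketch below is one possible approach rather than the procedure used in the original case; the function name softirq_share is hypothetical. It compares two snapshots of the aggregate cpu line and prints the percentage of CPU time spent in softirq between them.

```shell
# Print the percentage of CPU time spent in softirq between two snapshots
# of the aggregate "cpu" line of /proc/stat. The fields after the label
# are: user nice system idle iowait irq softirq steal (guest fields are
# ignored because they are already counted in user and nice).
softirq_share() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    {
      total = 0
      for (i = 2; i <= 9 && i <= NF; i++) total += $i
      t[NR] = total
      q[NR] = $8                         # softirq is the 7th value
    }
    END {
      dt = t[2] - t[1]
      if (dt > 0) printf "%.1f\n", 100 * (q[2] - q[1]) / dt
    }
  '
}

# Example:
#   a=$(grep '^cpu ' /proc/stat); sleep 2; b=$(grep '^cpu ' /proc/stat)
#   softirq_share "$a" "$b"
```

A persistently high value during the dd write test points to network receive/transmit processing as the bottleneck, which is consistent with a driver problem.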

[Cause]
Scenario 1: The customer's tool process ds_agent generates a large number of small I/Os, causing queue congestion on the upper-layer host and slow service response.
Scenario 2: Incompatible NIC drivers cause extremely low write bandwidth.
[Solution]
Scenario 1: After confirming with the customer and frontline personnel, kill the unneeded processes.
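Scenario 2: Reinstall the network adapter driver with a compatible version, and then run rmmod and modprobe so that the new driver takes effect.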
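
The original case does not record the command used to query the driver version in step 4 of scenario 2. One common way is ethtool -i, whose output contains driver: and version: lines. The helper below is a sketch under that assumption; the function name nic_driver, the interface name eth0, and the placeholder <module> are illustrative only.

```shell
# Extract the driver name and version from `ethtool -i` output on stdin.
# Prints "<driver> <version>", or "<driver> unknown" if no version line
# is present. Prints nothing if no driver line is found.
nic_driver() {
  awk -F': *' '
    $1 == "driver"  { d = $2 }
    $1 == "version" { v = $2 }
    END { if (d != "") print d, (v != "" ? v : "unknown") }
  '
}

# Example (eth0 and <module> are placeholders):
#   rmmod <module> && modprobe <module>
#   ethtool -i eth0 | nic_driver
```

If the printed version still matches the driver bundled with the operating system, the reload did not pick up the newly installed driver.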
[Post-Recovery Check]
Scenario 1: Services are restored, the wait time reported by iostat (await) decreases, and the device utilization drops to a low value.
Scenario 2: The read/write latency and bandwidth are normal.
[Suggestion and Summary]
None
[Applicability]
All versions