Hi team, here's a new guide.
1 Vdbench Tool
1.1 How to Set DirectIO
In the Linux operating system, there are two disk access modes: buffered (cached) I/O and direct I/O. With buffered I/O, I/Os pass through the host cache, where they may be split or aggregated, and the number of concurrent I/Os may change.
Therefore, to prevent the operating system from changing the I/O model, you must specify the direct I/O access mode during the performance test, that is, set openflags=o_direct in the SD (storage definition).
As shown in the following example:
sd=default,openflags=o_direct,threads=32
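For context, a minimal complete parameter file built around this SD default might look like the following sketch (the LUN path, workload mix, and run settings are placeholder assumptions, not values from this guide):

```
* Minimal sketch; the LUN path and run settings are placeholders.
sd=default,openflags=o_direct,threads=32
sd=sd1,lun=/dev/sdb
wd=wd1,sd=sd1,seekpct=100,rdpct=70,xfersize=8k
rd=rd1,wd=wd1,iorate=max,elapsed=60,interval=5
```

Because openflags=o_direct is set on sd=default, every SD defined after it inherits direct I/O.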
1.2 How Do I Set the Cache Hit Ratio?
Set the rhpct (read cache hit ratio) and whpct (write cache hit ratio) parameters in the WD (workload definition).
The implementation principle is as follows: a hit area (configured with the hitarea parameter in the SD, 1 MB by default) is created in the block device under test.
When Vdbench needs to construct a cache hit, it generates an I/O that falls within the hit area, whose data is resident in the cache.
A cache miss is constructed by using a sufficiently large miss area and accessing it as randomly as possible. For example:
wd=wd1,sd=sd*,seekpct=100,rdpct=50,xfersize=64K,rhpct=60
In the preceding example, the read cache hit ratio of all SDs is set to 60%.
That is, 60% of read I/Os are expected to hit the cache and are served without reading from disk.
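Building on this, a fuller parameter file might look as follows (the LUN path, whpct value, and run settings are placeholder assumptions; hitarea is set per SD and defaults to 1m if omitted):

```
* Placeholder LUN and run settings; hitarea defaults to 1m if omitted.
sd=sd1,lun=/dev/sdb,hitarea=1m,openflags=o_direct
wd=wd1,sd=sd*,seekpct=100,rdpct=50,xfersize=64k,rhpct=60,whpct=30
rd=rd1,wd=wd1,iorate=1000,elapsed=60,interval=5
```

Here rhpct=60 targets a 60% read cache hit ratio and whpct=30 targets a 30% write cache hit ratio against the 1 MB hit area.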
1.3 How to Set the I/O Ratio
During the performance test, you may need to test multiple I/O models at the same time.
Set the proportion of I/O assigned to each model with the skew parameter in the corresponding WD.
As shown in the following example:
wd=wd1,sd=(sd1-sd8,sd11-sd18),seekpct=100,rdpct=100,xfersize=512K,skew=95
wd=wd2,sd=(sd9,sd10,sd19,sd20),seekpct=50,rdpct=0,xfersize=224K,skew=5
Pay attention to the following points:
1. The skew values across all WDs must total 100. Otherwise, an error message is displayed.
2. If iorate (target IOPS) in the RD is set to max, the actual skew may deviate from the configured values. Set iorate to a fixed value to ensure that the skew ratio is honored.
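For instance, pairing the two WDs above with a fixed iorate keeps the 95/5 split stable (the iorate value and run times here are placeholder assumptions):

```
* Fixed iorate so the configured skew is honored; values are placeholders.
wd=wd1,sd=(sd1-sd8,sd11-sd18),seekpct=100,rdpct=100,xfersize=512k,skew=95
wd=wd2,sd=(sd9,sd10,sd19,sd20),seekpct=50,rdpct=0,xfersize=224k,skew=5
rd=rd1,wd=wd*,iorate=2000,elapsed=120,interval=5
```

With iorate=2000, Vdbench aims at roughly 1900 IOPS for wd1 and 100 IOPS for wd2.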
2 dd Commands
2.1 How to Set DirectIO
In the Linux operating system, set the direct I/O mode when using dd for performance tests.
If the dd command does not use direct I/O, the following problems may occur:
1. A large I/O may be split into multiple small I/Os by the host cache. For example, a 512 KB I/O may be split into multiple 4 KB I/Os in the host cache, which are then merged into I/Os of up to max_sectors_kb at the block device layer. This may cause abnormally high host CPU utilization.
2. During a dd write, I/Os are acknowledged as soon as they reach the host cache, so the measured latency does not reflect the storage array.
3. During dd writes with a single dd process, the split I/Os are delivered to the storage array in multiple batches because of the host cache policy, so the result does not reflect the actual array performance.
The setting method is as follows:
To test write performance:
dd if=/dev/zero of=/testw.dbf bs=4k oflag=direct count=100000
To test read performance:
dd if=/dev/sdb of=/dev/null bs=4k iflag=direct count=100000
2.2 Why Does a Single dd Process Often Fail to Reach the Maximum Bandwidth?
The dd command issues sequential I/O with a single outstanding request, so for a single dd process:
Bandwidth = I/O size x IOPS = I/O size x 1000 ms/average latency
According to the formula, the average latency is an important factor that affects bandwidth.
Because of host capability, network transmission, storage processing, and disk read/write speed, the average latency is usually at the millisecond level, which limits the maximum bandwidth of a single dd process.
Assume that the average latency is about 3 ms and the I/O size is 512 KB:
Bandwidth = 512 KB x (1000/3) ≈ 167 MB/s
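The arithmetic above can be checked with a quick shell one-liner (the 3 ms latency and 512 KB I/O size are the assumed figures from the example):

```shell
# Assumed figures from the example: 3 ms average latency, 512 KB I/O size.
# IOPS = 1000 / latency_ms; bandwidth = io_kb * iops / 1024 (MB/s).
awk 'BEGIN { lat_ms = 3; io_kb = 512; iops = 1000 / lat_ms; printf "%.0f MB/s\n", io_kb * iops / 1024 }'
```

Halving the latency or doubling the I/O size doubles the single-process bandwidth, which is why both knobs matter.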
To test the maximum storage bandwidth with dd, you are advised to start multiple dd processes at the same time.
In addition, on some Linux versions, if a block device uses the default CFQ scheduler, concurrent dd processes may not reach the expected bandwidth. You can change the scheduling policy of the block device to NOOP.
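As a sketch, the following script switches the scheduler to noop and launches four concurrent dd readers at non-overlapping offsets (the device name /dev/sdb, process count, and transfer sizes are placeholder assumptions; adjust to the LUN under test):

```shell
#!/bin/sh
# Placeholder device; adjust to the LUN under test.
DEV=/dev/sdb

# On kernels that default to CFQ, switch the block device to the noop scheduler.
echo noop > /sys/block/$(basename $DEV)/queue/scheduler

# Launch four concurrent sequential readers at non-overlapping offsets
# (each process reads 2000 x 512 KB, skipping past the previous process's range).
for i in 0 1 2 3; do
  dd if=$DEV of=/dev/null bs=512k iflag=direct count=2000 skip=$((i * 2000)) &
done
wait
```

Offsetting each process with skip keeps the streams sequential within their own regions instead of competing for the same blocks.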