
Release DISKSPD 2.2 · microsoft/diskspd · GitHub

 

What is DiskSpd?

DiskSpd is a powerful storage performance testing tool developed by Microsoft, used primarily to measure the I/O performance of disks, storage devices, and systems. It is a command-line tool that lets users simulate different workload patterns (such as sequential read, sequential write, random read, and random write) to evaluate the performance of a disk or storage system. DiskSpd is highly configurable, exposing many parameters so that tests can be tailored precisely.

Why use DiskSpd?

  1. Efficiency: Developed by Microsoft and highly optimized, DiskSpd can drive modern storage systems to their limits and produce accurate performance data.

  2. Powerful features: DiskSpd can simulate many I/O patterns, including sequential read/write, random read/write, and mixed workloads, and supports tuning a large set of test parameters such as block size, thread count, file size, and caching behavior.

  3. Broad support: It runs on multiple Windows versions and across a wide range of hardware configurations and storage media, including hard disks, solid-state drives (SSDs), and other storage devices.

  4. Flexible configuration: DiskSpd's options can express simple or complex test scenarios, helping users understand a storage device's performance characteristics in depth.

  5. Suitable for enterprises and developers: Enterprise IT teams, storage administrators, and developers can all use DiskSpd to test and tune storage systems, or to evaluate how storage hardware behaves under different loads.

Key features and capabilities of DiskSpd

  1. Multiple I/O patterns

    • Sequential read/write
    • Random read/write
    • Mixed workloads (read/write ratio set as a percentage)
  2. Flexible test configuration

    • File size, block size, I/O request size
    • Thread count, number of files
    • Number of outstanding I/O operations per thread
    • Control over disk caching (e.g. enabling or disabling caches)
  3. Performance metrics

    • IOPS (input/output operations per second)
    • Throughput (bandwidth, typically measured in MB/s or GB/s)
    • Latency (time per operation, typically in milliseconds)
  4. Fine-grained control of system resources

    • Customizable disk and CPU affinity for precise control of resource usage during a test.
    • Event-based synchronization and waiting, for precise control of test timing.
  5. Multi-thread and multi-file support

    • Multiple threads can issue concurrent I/O, simulating multi-user or high-concurrency workloads.
    • Multiple test files can be specified to model more complex scenarios.
  6. Logging and reporting

    • DiskSpd produces detailed output recording all performance metrics collected during a test.
    • Results can be post-processed from the command line or with scripts to generate performance reports.
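The three metrics in item 3 are linked by simple arithmetic: throughput equals IOPS times the block size, and by Little's law, average latency is roughly the number of outstanding I/Os divided by IOPS. A minimal Python sketch; the numbers are made-up illustrative values, not real DiskSpd output:

```python
# Arithmetic linking the three metrics DiskSpd reports.
# All numbers below are illustrative, not real DiskSpd output.
BLOCK_SIZE = 4 * 1024   # bytes per I/O, as set with -b4K
iops = 50_000           # hypothetical I/Os per second
outstanding = 32        # queue depth, as set with -o32 (single thread)

# Throughput is IOPS times the block size.
throughput_mb_s = iops * BLOCK_SIZE / (1024 * 1024)

# Little's law: average latency ~= outstanding I/Os / IOPS.
avg_latency_ms = outstanding / iops * 1000

print(f"throughput = {throughput_mb_s:.1f} MB/s")   # 195.3 MB/s
print(f"avg latency = {avg_latency_ms:.2f} ms")     # 0.64 ms
```

This is also a useful sanity check on results: if the reported throughput is far from IOPS × block size, the test was not measuring what was intended.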

DiskSpd use cases

  1. Hardware performance evaluation: DiskSpd is commonly used to measure how hard disks, SSDs, and other storage devices perform under different loads, helping system administrators and hardware engineers choose suitable hardware.

  2. Storage optimization: Database administrators, virtualization specialists, and others can use DiskSpd to assess storage performance and tune storage configuration so that systems run efficiently.

  3. Development and testing: Software developers, especially those working on storage, use DiskSpd to simulate realistic loads and test how applications behave under different storage conditions.

  4. Troubleshooting: DiskSpd can exercise a storage system to reproduce performance bottlenecks or failures, helping to pinpoint performance problems.

Summary: why use DiskSpd?

DiskSpd is a powerful, flexible storage performance testing tool suited to enterprise IT infrastructure, storage administrators, hardware engineers, and developers. It can simulate a variety of workload patterns to evaluate disk and storage system performance end to end, detect potential bottlenecks, aid troubleshooting, and guide storage tuning. Whether for hardware evaluation, performance tuning, or storage system testing, DiskSpd is an extremely useful tool.


 


DiskSpd options grouped by function:

Category                  Option                Description
Basic configuration       -c<size>              create a file of the given size
                          -d<seconds>           test duration in seconds
                          -b<size>              block size (e.g. 4K, 8K)
                          -t<num>               threads per file
                          -o<num>               outstanding (overlapped) I/O operations per thread
                          -r                    random access pattern
                          -a<cpu_list>          bind threads to CPUs (e.g. 0,1 binds to CPU 0 and CPU 1)
Files and data source     -c<size>              create a file of the given size
                          -X<filepath>          configure the test from an XML profile
                          -Z<size>              write from a random-filled buffer of the given size
                          -Zr                   use a freshly randomized buffer for every write I/O
Synchronization           -ys<eventname>        signal the event before the actual run starts (no warmup)
                          -yf<eventname>        signal the event after the actual run finishes (no cooldown)
                          -yr<eventname>        wait on the event before starting the run (including warmup)
                          -yp<eventname>        stop the run when the event is set; CTRL+C is bound to this event
                          -ye<eventname>        set the event and exit
Timers and event tracing  -e<q|c|s>             use the query performance counter (qpc), cycle count, or system timer [default: qpc]
                          -ep                   use paged memory for kernel logging (default: non-paged)
                          -ePROCESS             trace process start and end
                          -eTHREAD              trace thread start and end
                          -eIMAGE_LOAD          trace image load events
                          -eDISK_IO             trace physical disk I/O
                          -eMEMORY_PAGE_FAULTS  trace all page faults
                          -eMEMORY_HARD_FAULTS  trace hard faults only
                          -eNETWORK             trace TCP/IP and UDP/IP send and receive
                          -eREGISTRY            trace registry calls
Caching and optimization  -Sh                   disable all caching (equivalent to -Suw)
                          -Sw                   enable writethrough, bypassing hardware write caching
Random seed and fill      -z[seed]              set the random seed (default: 0)
                          -Zr                   use randomized buffers for write operations
Multi-file, multi-thread  -t<num>               threads per file (multiple files may be tested)
                          -a<cpu_list>          bind threads across multiple CPUs
Advanced                  -X<filepath>          configure the test from an XML profile (multiple targets supported)

Notes:

  1. -c<size> specifies the file size, in bytes or with KB/MB/GB units.
  2. -b<size> sets the block size, typically 4KB, 8KB, or larger.
  3. -t<num> is the number of threads per target; each file can have its own thread count.
  4. -r selects random access (the default is sequential).
  5. -o<num> sets the number of outstanding I/O operations per thread, useful for heavier loads.
  6. The synchronization options (-ys, -yf, -yr, etc.) synchronize the test with external events, typically to control exactly when a run starts and stops.

This table covers most common diskspd usage by functional category, for quick reference when configuring test parameters.
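To make the table actionable, the switches can also be assembled into a command line programmatically, e.g. when scripting a matrix of tests. A small Python sketch; the build_diskspd_cmd helper is hypothetical, not part of DiskSpd:

```python
# Hypothetical helper (not part of DiskSpd): assemble a diskspd
# command line from keyword options taken from the table above.
def build_diskspd_cmd(target, **opts):
    parts = ["diskspd"]
    for flag, value in opts.items():
        # True emits a bare switch like -r; any other value is appended
        # directly to the switch letter, e.g. b="4K" -> -b4K.
        parts.append(f"-{flag}" if value is True else f"-{flag}{value}")
    parts.append(target)
    return " ".join(parts)

# A 4K random-read test, 2 threads, queue depth 32, caching disabled:
print(build_diskspd_cmd("testfile.dat", b="4K", t=2, r=True, o=32, d=10, Sh=True))
# diskspd -b4K -t2 -r -o32 -d10 -Sh testfile.dat
```

Keyword-argument order is preserved (Python 3.7+), so switches appear in the order they are given.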


C:\Users\Administrator\Downloads\DiskSpd 2.2\amd64>diskspd /?

Usage: diskspd [options] target1 [ target2 [ target3 ...] ]
version 2.2.0 (2024/6/3)

Valid targets:
     file_path
     #<physical drive number>
     <drive_letter>:

Sizes, offsets and lengths are specified as integer bytes, or with an
optional suffix of KMGT (KiB/MiB/GiB/TiB) or b (for blocks, see -b).
Examples: 4k = 4096
          with -b4k, 8b = 32768 (8 * 4KiB)

Available options:
  -?                    display usage information
  -:<flags>             experimental behaviors, as a bitmask of flags. current:
                          1 - allow throughput rate limit sleeps >1ms if indicated by rate
  -ag                   group affinity - threads assigned round-robin to CPUs by processor groups, 0 - n.
                          Groups are filled from lowest to highest processor before moving to the next.
                          [default; use -n to disable default affinity]
  -a[g#,]#[,#,...]>     advanced CPU affinity -  threads assigned round-robin to the CPUs stated, in order of
                          specification; g# is the processor group for the following CPUs. If no group is
                          stated, 0 is default. Additional groups/processors can be added, comma separated,
                          on the same or separate -a parameters.
                          Examples: -a0,1,2 and -ag0,0,1,2 are equivalent.
                                    -ag0,0,1,2,g1,0,1,2 specifies the first three CPUs in groups 0 and 1.
                                    -ag0,0,1,2,g1,0,1,2 and -ag0,0,1,2 -ag1,0,1,2 are equivalent.
  -b<size>              IO size, defines the block 'b' for sizes stated in units of blocks [default=64K]
  -B<base>[:length]     bounds; specify range of target to issue IO to - base offset and length
                          (default: IO is issued across the entire target)
  -c<size>              create file targets of the given size. Conflicts with non-file target specifications.
  -C<seconds>           cool down time - duration of the test after measurements finished [default=0s].
  -D<milliseconds>      Capture IOPs statistics in intervals of <milliseconds>; these are per-thread
                          per-target: text output provides IOPs standard deviation, XML provides the full
                          IOPs time series in addition. [default=1000, 1 second].
  -d<seconds>           duration (in seconds) to run test [default=10s]
  -f<size>              maximum target offset to issue IO to (non-inclusive); -Bbase -f(base+length) is the same
                         as -Bbase:length. For example, to test only the first sectors of a disk.
  -f<rst>               open file with one or more additional access hints
                          r : the FILE_FLAG_RANDOM_ACCESS hint
                          s : the FILE_FLAG_SEQUENTIAL_SCAN hint
                          t : the FILE_ATTRIBUTE_TEMPORARY hint
                          [default: none]
  -F<count>             total number of threads (conflicts with -t)
  -g<value>[i]          throughput per-thread per-target throttled to given value; defaults to bytes per millisecond
                          With the optional i qualifier the value is IOPS of the specified block size (-b).
                          Throughput limits cannot be specified when using completion routines (-x)
                          [default: no limit]
  -h                    deprecated, see -Sh
  -i<count>             number of IOs per burst; see -j [default: inactive]
  -j<milliseconds>      interval in <milliseconds> between issuing IO bursts; see -i [default: inactive]
  -I<priority>          Set IO priority to <priority>. Available values are: 1-very low, 2-low, 3-normal (default)
  -l                    Use large pages for IO buffers
  -L                    measure latency statistics
  -n                    disable default affinity (-a)
  -N<vni>               specify the flush mode for memory mapped I/O
                          v : uses the FlushViewOfFile API
                          n : uses the RtlFlushNonVolatileMemory API
                          i : uses RtlFlushNonVolatileMemory without waiting for the flush to drain
                          [default: none]
  -o<count>             number of outstanding I/O requests per target per thread
                          (1=synchronous I/O, unless more than 1 thread is specified with -F)
                          [default=2]
  -O<count>             number of outstanding I/O requests per thread - for use with -F
                          (1=synchronous I/O)
  -p                    start parallel sequential I/O operations with the same offset
                          (ignored if -r is specified, makes sense only with -o2 or greater)
  -P<count>             enable printing a progress dot after each <count> [default=65536]
                          completed I/O operations, counted separately by each thread
  -r[align]             random I/O aligned to [align] byte offsets within the target range (overrides -s)
                          [default alignment=block size (-b)]
  -rd<dist>[params]     specify an non-uniform distribution for random IO in the target
                          [default uniformly random]
                           distributions: pct, abs
                           all:  IO% and %Target/Size are cumulative. If the sum of IO% is less than 100% the
                                 remainder is applied to the remainder of the target. An IO% of 0 indicates a gap -
                                 no IO will be issued to that range of the target.
                           pct : parameter is a combination of IO%/%Target separated by : (colon)
                                 Example: -rdpct90/10:0/10:5/20 specifies 90% of IO in 10% of the target, no IO
                                   next 10%, 5% IO in the next 20% and the remaining 5% of IO in the last 60%
                           abs : parameter is a combination of IO%/Target Size separated by : (colon)
                                 If the actual target size is smaller than the distribution, the relative values of IO%
                                 for the valid elements define the effective distribution.
                                 Example: -rdabs90/10G:0/10G:5/20G specifies 90% of IO in 10GiB of the target, no IO
                                   next 10GiB, 5% IO in the next 20GiB and the remaining 5% of IO in the remaining
                                   capacity of the target. If the target is only 20G, the distribution truncates at
                                   90/10G:0:10G and all IO is directed to the first 10G (equivalent to -f10G).
  -rs<percentage>       percentage of requests which should be issued randomly; -r is used to specify IO alignment.
                          Sequential IO runs are homogeneous when a mixed r/w ratio is specified (-w) and their lengths
                          follow a geometric distribution based on the percentage (chance of next IO being sequential).
  -R[p]<text|xml>       output format. With the p prefix, the input profile (command line or XML) is validated and
                          re-output in the specified format without running load, useful for checking or building
                          complex profiles.
                          [default: text]
  -s[i][align]          stride size of [align] bytes, alignment & offset between operations
                          [default=non-interlocked, default alignment=block size (-b)]
                          By default threads track independent sequential IO offsets starting at base offset of the target.
                          With multiple threads this results in threads overlapping their IOs - see -T to divide
                          them into multiple separate sequential streams on the target.
                          With the optional i qualifier (-si) threads interlock on a shared sequential offset.
                          Interlocked operations may introduce overhead but make it possible to issue a single
                          sequential stream to a target which responds faster than one thread can drive.
                          (ignored if -r specified, -si conflicts with -p, -rs and -T)
  -S[bhmruw]            control caching behavior [default: caching is enabled, no writethrough]
                          non-conflicting flags may be combined in any order; ex: -Sbw, -Suw, -Swu
  -S                    equivalent to -Su
  -Sb                   enable caching (default, explicitly stated)
  -Sh                   equivalent -Suw
  -Sm                   enable memory mapped I/O
  -Su                   disable software caching, equivalent to FILE_FLAG_NO_BUFFERING
  -Sr                   disable local caching, with remote sw caching enabled; only valid for remote filesystems
  -Sw                   enable writethrough (no hardware write caching), equivalent to FILE_FLAG_WRITE_THROUGH or
                          non-temporal writes for memory mapped I/O (-Sm)
  -t<count>             number of threads per target (conflicts with -F)
  -T<offs>              starting separation between I/O operations performed on the same target by different threads
                          [default=0] (starting offset = base target offset + (thread number * <offs>)
                          only applies to -s sequential IO with #threads > 1, conflicts with -r and -si
  -v[s]                 verbose mode - with s, only provide additional summary statistics
  -w<percentage>        percentage of write requests (-w and -w0 are equivalent and result in a read-only workload).
                        absence of this switch indicates 100% reads
                          IMPORTANT: a write test will destroy existing data without a warning
  -W<seconds>           warm up time - duration of the test before measurements start [default=5s]
  -x                    use completion routines instead of I/O Completion Ports
  -X<filepath>          use an XML file to configure the workload. Profile defaults for -W/d/C (durations) and -R/v/z
                          (output format, verbosity and random seed) may be overriden by direct specification.
                          Targets can be defined in XML profiles as template paths of the form *<integer> (*1, *2, ...).
                          When run, specify the paths to substitute for the template paths in order on the command line.
                          The first specified target is *1, second is *2, and so on.
                          Example: diskspd -d60 -Xprof.xml first.bin second.bin (prof.xml using *1 and *2, 60s run)
  -z[seed]              set random seed [with no -z, seed=0; with plain -z, seed is based on system run time]

Write buffers:
  -Z                    zero buffers used for write tests
  -Zr                   per IO random buffers used for write tests - this incurrs additional run-time
                         overhead to create random content and shouln't be compared to results run
                         without -Zr
  -Z<size>              use a <size> buffer filled with random data as a source for write operations.
  -Z<size>,<file>       use a <size> buffer filled with data from <file> as a source for write operations.

  By default, write source buffers are filled with a repeating pattern (0, 1, 2, ..., 255, 0, 1, ...)

Synchronization:
  -ys<eventname>     signals event <eventname> before starting the actual run (no warmup)
                       (creates a notification event if <eventname> does not exist)
  -yf<eventname>     signals event <eventname> after the actual run finishes (no cooldown)
                       (creates a notification event if <eventname> does not exist)
  -yr<eventname>     waits on event <eventname> before starting the run (including warmup)
                       (creates a notification event if <eventname> does not exist)
  -yp<eventname>     stops the run when event <eventname> is set; CTRL+C is bound to this event
                       (creates a notification event if <eventname> does not exist)
  -ye<eventname>     sets event <eventname> and quits

Event Tracing:
  -e<q|c|s>             Use query perf timer (qpc), cycle count, or system timer respectively.
                          [default = q, query perf timer (qpc)]
  -ep                   use paged memory for the NT Kernel Logger [default=non-paged memory]
  -ePROCESS             process start & end
  -eTHREAD              thread start & end
  -eIMAGE_LOAD          image load
  -eDISK_IO             physical disk IO
  -eMEMORY_PAGE_FAULTS  all page faults
  -eMEMORY_HARD_FAULTS  hard faults only
  -eNETWORK             TCP/IP, UDP/IP send & receive
  -eREGISTRY            registry calls


Examples:

Create 8192KB file and run read test on it for 1 second:

  diskspd -c8192K -d1 testfile.dat

Set block size to 4KB, create 2 threads per file, 32 overlapped (outstanding)
I/O operations per thread, disable all caching mechanisms and run block-aligned random
access read test lasting 10 seconds:

  diskspd -b4K -t2 -r -o32 -d10 -Sh testfile.dat

Create two 1GB files, set block size to 4KB, create 2 threads per file, affinitize threads
to CPUs 0 and 1 (each file will have threads affinitized to both CPUs) and run read test
lasting 10 seconds:

  diskspd -c1G -b4K -t2 -d10 -a0,1 testfile1.dat testfile2.dat
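The size grammar described at the top of the help text (integer bytes, an optional K/M/G/T suffix in binary units, or b for multiples of the block size chosen with -b) can be sketched as a small parser. A Python sketch, assuming the help's own examples as the specification:

```python
# Parse DiskSpd-style sizes: plain bytes, binary K/M/G/T suffixes,
# or 'b' for multiples of the block size chosen with -b.
def parse_size(spec, block_size=64 * 1024):   # DiskSpd's default -b is 64K
    spec = spec.strip().lower()
    units = {"k": 1024, "m": 1024**2, "g": 1024**3,
             "t": 1024**4, "b": block_size}
    if spec[-1] in units:
        return int(spec[:-1]) * units[spec[-1]]
    return int(spec)

print(parse_size("4k"))             # 4096
print(parse_size("8b", 4 * 1024))   # 32768, as in the help's -b4k example
```

Note that b is resolved against whatever block size is in effect, which is why 8b means 32768 only when -b4k is also specified.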

 
 
posted @ 2024-12-29 22:52  suv789