In TiDB v8.x, if the goal is to troubleshoot TiKV write amplification, focus on the following dashboards:

| Dashboard | Panel / metric | Purpose | Value for diagnosing write amplification |
|---|---|---|---|
| online-TiKV-Details | RocksDB KV → Compaction Flow | See how much data compaction writes | ★★★★★ |
| online-TiKV-Details | RocksDB KV → Pending Compaction Bytes | Tell whether compaction is backlogged | ★★★★★ |
| online-TiKV-Details | RocksDB KV → Write Stall | Tell whether compaction is stalling writes | ★★★★★ |
| online-TiKV-Details | RocksDB KV → L0 Files Count | Tell whether the LSM tree is out of balance | ★★★★☆ |
| online-TiKV-Details | RocksDB Raft → Write Flow | Tell whether Raft logs add extra write amplification | ★★★★☆ |
| online-Disk-Performance | Disk Write Throughput | See the actual disk write traffic | ★★★★★ |
| online-Disk-Performance | Disk Utilization | Tell whether compaction is saturating the disk | ★★★★☆ |
| online-Disk-Performance | Disk IOPS | Gauge random-write pressure | ★★★★☆ |
| online-TiKV-Trouble-Shooting | RocksDB Health | Spot compaction anomalies quickly | ★★★★☆ |
| online-TiKV-Trouble-Shooting | Write Stall Analysis | Tell whether write amplification is degrading performance | ★★★★★ |
| online-Performance-Write | Prewrite Duration | Gauge application write pressure | ★★★☆☆ |
| online-Performance-Write | Commit Duration | Tell whether transaction commits are affected | ★★★☆☆ |
| online-PD | Region Split | Tell whether frequent splits increase SST generation | ★★★☆☆ |
| online-PD | Region Count Change | Tell whether Region count fluctuation is abnormal | ★★☆☆☆ |
| online-TiKV-Summary | Store Write Ops/QPS | Overall write pressure only | ★★☆☆☆ |
| online-TiKV-Summary | Store Latency | Tell whether the application is already affected | ★★☆☆☆ |

Recommended troubleshooting order

| Step | Dashboard | What to check |
|---|---|---|
| 1 | online-Disk-Performance | Is disk write bandwidth abnormally high? |
| 2 | online-TiKV-Details | Is Compaction Flow far above the normal write volume? |
| 3 | online-TiKV-Details | Do Pending Compaction Bytes keep growing? |
| 4 | online-TiKV-Details | Has a Write Stall occurred? |
| 5 | online-TiKV-Trouble-Shooting | Is a RocksDB/Compaction problem flagged? |
| 6 | online-PD | Are Region Splits frequent? |
| 7 | online-Performance-Write | Is transaction latency already affected? |
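
To keep the order above consistent across incidents, the checklist can be written down as data. The following is only a minimal sketch: the `Step` type, the `triage` function, and the observation keys such as `disk_write_high` are hypothetical names made up for illustration, and the yes/no answers are assumed to be filled in by a person reading the dashboards, not fetched from any TiDB or Grafana API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Step:
    order: int
    dashboard: str
    question: str
    key: str  # hypothetical key into the observations dict


# The seven steps, in the same order as the table.
STEPS: List[Step] = [
    Step(1, "online-Disk-Performance", "Is disk write bandwidth abnormally high?", "disk_write_high"),
    Step(2, "online-TiKV-Details", "Is Compaction Flow far above normal write volume?", "compaction_flow_high"),
    Step(3, "online-TiKV-Details", "Do Pending Compaction Bytes keep growing?", "pending_growing"),
    Step(4, "online-TiKV-Details", "Has a Write Stall occurred?", "write_stall"),
    Step(5, "online-TiKV-Trouble-Shooting", "Is a RocksDB/Compaction problem flagged?", "rocksdb_flagged"),
    Step(6, "online-PD", "Are Region Splits frequent?", "frequent_split"),
    Step(7, "online-Performance-Write", "Is transaction latency already affected?", "txn_latency_affected"),
]


def triage(observations: Dict[str, bool]) -> List[str]:
    """Render the checklist in order, marking each step YES, no, or not checked."""
    report = []
    for step in STEPS:
        answer = observations.get(step.key)
        if answer is None:
            mark = "not checked"
        else:
            mark = "YES" if answer else "no"
        report.append(f"{step.order}. [{step.dashboard}] {step.question} -> {mark}")
    return report


if __name__ == "__main__":
    # Manually recorded answers for the steps checked so far.
    for line in triage({"disk_write_high": True, "write_stall": False}):
        print(line)
```

Steps left out of the dictionary are reported as not checked, which makes it obvious when an investigation skipped ahead.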
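
Rules of thumb from practice

| Symptom | Conclusion |
|---|---|
| Compaction Bytes stay far above Flush Bytes | Write amplification is highly likely |
| Pending Compaction Bytes keep growing | Compaction cannot keep up |
| Write Stall > 0 | Write amplification is already affecting the workload |
| L0 Files stay high for a long time | The LSM tree is under heavy pressure |
| Disk write bandwidth is high but application TPS is not | Internal write amplification is significant |
| Region Splits are very frequent | May indirectly amplify compaction |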
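
The first two rules can be turned into quick arithmetic on numbers read off the panels. The sketch below assumes a common approximation, write amplification ≈ (flush bytes + compaction write bytes) / flush bytes over the same time window; it ignores WAL and Raft log writes, and the function names are hypothetical helpers, not part of TiKV or RocksDB.

```python
from typing import Sequence


def write_amplification(flush_bytes: float, compaction_write_bytes: float) -> float:
    """Approximate LSM write amplification over one time window.

    Assumption: flush bytes stand in for the data entering the LSM tree,
    and compaction write bytes are the extra rewrites. WAL and Raft log
    writes are not counted.
    """
    if flush_bytes <= 0:
        raise ValueError("flush_bytes must be positive")
    return (flush_bytes + compaction_write_bytes) / flush_bytes


def is_steadily_growing(samples: Sequence[float]) -> bool:
    """True if every sample is strictly larger than the one before it."""
    return len(samples) >= 2 and all(b > a for a, b in zip(samples, samples[1:]))


if __name__ == "__main__":
    # Example: 10 GiB flushed and 140 GiB written by compaction in the same window.
    print(write_amplification(10, 140))
    # Pending Compaction Bytes sampled at regular intervals (made-up values).
    print(is_steadily_growing([20, 35, 50, 80]))
```

What counts as "far above" depends on the workload and the RocksDB level configuration, so the ratio is best compared against the same cluster's own baseline and not against a fixed threshold.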

On TiDB v8.5+, also check whether TiKV-Details → RocksDB KV offers panels such as Bytes Read/Written by Compaction, Compaction Reasons, and Compaction Pending Bytes. These are the core evidence for pinpointing write amplification.