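[Ceph] krbd image deletion is too slow
I. Background
1. Problem description
Deleting a 4 TiB krbd image that has never been written to takes more than twenty minutes.
2. Troubleshooting
2.1 Approach
Default rbd features: layering, exclusive-lock, object-map, fast-diff, deep-flatten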
[root@node117 ~]# rbd info rbd/lun02
rbd image 'lun02':
size 4TiB in 1048576 objects
order 22 (4MiB objects)
block_name_prefix: rbd_data.b5cd36b8b4567
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
flags:
create_timestamp: Tue Dec 22 10:31:06 2020
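krbd features: layering, exclusive-lock
The running kernel is too old to support the deep-flatten, fast-diff, and object-map features. Running rbd map fails with the error: RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable lun02 object-map fast-diff deep-flatten". These three features therefore have to be disabled before the image can be mapped through krbd.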
[root@node117 ~]# uname -a
Linux node117 4.14.113-1.el7.x86_64 #1 SMP Wed Sep 9 17:22:41 CST 2020 x86_64 x86_64 x86_64 GNU/Linux
[root@node117 ~]# rbd info rbd/lun03
rbd image 'lun03':
size 4TiB in 1048576 objects
order 22 (4MiB objects)
block_name_prefix: rbd_data.b759e6b8b4567
format: 2
features: layering, exclusive-lock
flags:
create_timestamp: Tue Dec 22 10:31:22 2020
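The mismatch reported by rbd map is simply the difference between the image's feature set and the set the kernel client accepts. A minimal sketch of that comparison follows. It assumes the kernel-side set is exactly what was observed on this 4.14 kernel (layering, exclusive-lock); the helper name `unsupported_features` is hypothetical and not part of any Ceph tool.

```python
# Feature names and bit values as listed in the feature table at the end of this article.
FEATURE_BITS = {
    "layering": 1,
    "striping": 2,
    "exclusive-lock": 4,
    "object-map": 8,
    "fast-diff": 16,
    "deep-flatten": 32,
    "journaling": 64,
}


def unsupported_features(image_features, kernel_features):
    """Return the image features missing from the kernel set, ordered by bit value."""
    missing = set(image_features) - set(kernel_features)
    return sorted(missing, key=FEATURE_BITS.__getitem__)


# Assumption: the feature set krbd accepted on the 4.14 kernel used in this article.
KERNEL_4_14 = ["layering", "exclusive-lock"]
image = ["layering", "exclusive-lock", "object-map", "fast-diff", "deep-flatten"]

to_disable = unsupported_features(image, KERNEL_4_14)
print("rbd feature disable lun02 " + " ".join(to_disable))
```

The printed command matches the one suggested in the error message above.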
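A 4 TiB image with the default features is deleted in under 1 s. Disabling deep-flatten, fast-diff, and object-map one after another and re-testing shows that the deletion time jumps to more than twenty minutes once object-map is disabled.
- With deep-flatten disabled, deletion takes 0.616s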
[root@node117 ~]# rbd feature disable rbd/lun003 deep-flatten
[root@node117 ~]# rbd info rbd/lun003
rbd image 'lun003':
size 4TiB in 1048576 objects
order 22 (4MiB objects)
block_name_prefix: rbd_data.4006d6b8b4567
format: 2
features: layering, exclusive-lock, object-map, fast-diff
flags:
create_timestamp: Mon Dec 21 18:20:15 2020
[root@node117 ~]# time rbd rm rbd/lun003
Removing image: 100% complete...done.
real 0m0.616s
user 0m0.393s
sys 0m0.013s
- With deep-flatten and fast-diff disabled, deletion takes 0.659s
[root@node117 ~]# rbd feature disable rbd/lun004 deep-flatten,fast-diff
[root@node117 ~]# rbd info rbd/lun004
rbd image 'lun004':
size 4TiB in 1048576 objects
order 22 (4MiB objects)
block_name_prefix: rbd_data.416d66b8b4567
format: 2
features: layering, exclusive-lock, object-map
flags:
create_timestamp: Mon Dec 21 18:26:41 2020
[root@node117 ~]# time rbd rm rbd/lun004
Removing image: 100% complete...done.
real 0m0.659s
user 0m0.398s
sys 0m0.017s
- With deep-flatten, fast-diff, and object-map disabled, deletion takes 21m36.884s
[root@node117 ~]# rbd feature disable rbd/lun005 deep-flatten,fast-diff,object-map
[root@node117 ~]# rbd info rbd/lun005
rbd image 'lun005':
size 4TiB in 1048576 objects
order 22 (4MiB objects)
block_name_prefix: rbd_data.41f676b8b4567
format: 2
features: layering, exclusive-lock
flags:
create_timestamp: Mon Dec 21 18:31:12 2020
[root@node117 ~]# time rbd rm rbd/lun005
Removing image: 100% complete...done.
real 21m36.884s
user 1m20.757s
sys 0m48.640s
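To put the 21m36.884s figure in perspective, it can be divided by the number of objects the image could hold. The sketch below only does that arithmetic with the sizes and timing shown above. The result is an average of wall-clock time per object, not the latency of a single request, since removals may be issued concurrently.

```python
OBJECT_SIZE = 4 * 1024**2   # 4 MiB objects (order 22)
IMAGE_SIZE = 4 * 1024**4    # 4 TiB image

# Number of objects the image can hold; matches "1048576 objects" in rbd info.
object_count = IMAGE_SIZE // OBJECT_SIZE

# Measured wall-clock time of "rbd rm" with object-map disabled: 21m36.884s.
elapsed_seconds = 21 * 60 + 36.884

per_object_ms = elapsed_seconds / object_count * 1000
print(object_count)
print(f"{per_object_ms:.2f} ms of wall-clock time per object")
```

This works out to roughly 1.24 ms per object, which is small per request but adds up across more than a million objects.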
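2.2 Root cause analysis
RBD images are thin-provisioned: creating an image does not consume real capacity, and objects are allocated only as data is written (one object per 4 MiB by default).
- With object-map enabled, a bitmap over all objects of the image records whether each object actually exists. An image with no data written contains no objects, so it is reclaimed directly and deletion is fast.
- With object-map disabled, Ceph cannot tell which objects of the image actually exist (at 4 MiB per object, a 4 TiB image has 1048576 objects). The delete operation therefore sends a remove request to RADOS for every possible object, one by one, even when none of them exists, so deletion is slow.
For example, a newly created image has no data objects in the pool: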
[root@node117 ~]# rbd create rbd/pp --size 40000MB
[root@node117 ~]# rbd info rbd/pp
rbd image 'pp':
size 39.1GiB in 10000 objects
order 22 (4MiB objects)
block_name_prefix: rbd_data.be1136b8b4567
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
flags:
create_timestamp: Tue Dec 22 11:22:49 2020
[root@node117 ~]# rados -p rbd ls | grep rbd_data.be1136b8b4567 | wc -l
0
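The difference between the two cases can be illustrated with a toy model. This is a simplified sketch of the behaviour described above, not librbd's actual implementation; the function name `objects_to_remove` and the list-of-booleans representation of the object map are assumptions made for illustration.

```python
def objects_to_remove(object_count, object_map=None):
    """Return the object indexes a delete must send remove requests for.

    object_map is a list of booleans (True = object exists), or None
    when the object-map feature is disabled.
    """
    if object_map is None:
        # No bitmap: every possible object has to be tried.
        return list(range(object_count))
    # Bitmap available: only objects marked as existing are touched.
    return [index for index, exists in enumerate(object_map) if exists]


# The 'pp' image above: 10000 possible objects, nothing written yet.
OBJECT_COUNT = 10000
empty_map = [False] * OBJECT_COUNT

with_map = objects_to_remove(OBJECT_COUNT, empty_map)
without_map = objects_to_remove(OBJECT_COUNT)

print(len(with_map))     # requests needed with object-map enabled
print(len(without_map))  # requests needed with object-map disabled
```

With the bitmap, the empty image needs no remove requests at all; without it, the request count equals the image's full object count regardless of how much data was written.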
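II. Solution
Per the root cause analysis in Section I, the kernel needs to be upgraded so that krbd supports the object-map, fast-diff, and deep-flatten features. This example upgrades the kernel to 5.10.2.
1. Kernel upgrade
- Download the 5.10.2 kernel packages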
kernel-ml-devel-5.10.2-1.el7.elrepo.x86_64.rpm
kernel-ml-5.10.2-1.el7.elrepo.x86_64.rpm
- Install the 5.10.2 kernel packages
[root@node147 kernel5.10.2]# rpm -ivh kernel-ml-5.10.2-1.el7.elrepo.x86_64.rpm kernel-ml-devel-5.10.2-1.el7.elrepo.x86_64.rpm
- Set the default boot entry (the new kernel takes effect after a reboot)
[root@node147 kernel5.10.2]# awk -F\' '$1=="menuentry " {print i++ " : " $2}' /etc/grub2.cfg
0 : CentOS Linux (5.10.2-1.el7.elrepo.x86_64) 7 (Core)
1 : CentOS Linux (4.14.113-1.el7.x86_64) 7 (Core)
2 : CentOS Linux (4.4.248-1.el7.elrepo.x86_64) 7 (Core)
3 : CentOS Linux (3.10.0-957.el7.x86_64) 7 (Core)
4 : CentOS Linux (0-rescue-ea5fd2d87aff411a84e213a00b214bf2) 7 (Core)
[root@node147 kernel5.10.2]# grub2-set-default "CentOS Linux (5.10.2-1.el7.elrepo.x86_64) 7 (Core)"
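The awk one-liner above splits each line on single quotes and prints the second field of lines whose first field is exactly "menuentry ". The same logic is sketched below in Python for readers less familiar with awk. The sample configuration text is a made-up, abbreviated stand-in for /etc/grub2.cfg, and `list_menu_entries` is a hypothetical helper name.

```python
# Hypothetical, abbreviated stand-in for the contents of /etc/grub2.cfg.
SAMPLE_GRUB_CFG = """\
menuentry 'CentOS Linux (5.10.2-1.el7.elrepo.x86_64) 7 (Core)' --class centos {
    placeholder-body
}
menuentry 'CentOS Linux (4.14.113-1.el7.x86_64) 7 (Core)' --class centos {
    placeholder-body
}
"""


def list_menu_entries(cfg_text):
    """Return menu entry titles in file order, like the awk command."""
    entries = []
    for line in cfg_text.splitlines():
        fields = line.split("'")  # same as awk -F\'
        if len(fields) > 1 and fields[0] == "menuentry ":
            entries.append(fields[1])
    return entries


entries = list_menu_entries(SAMPLE_GRUB_CFG)
for index, title in enumerate(entries):
    print(f"{index} : {title}")
```

The title printed for the desired kernel is the string passed to grub2-set-default.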
2. Verification
- Before the kernel upgrade, rbd map fails, reporting that the kernel does not support the object-map, fast-diff, and deep-flatten features
[root@node147 ~]# rbd create rbd/lun001 --size 4096G
[root@node147 ~]# rbd info rbd/lun001
rbd image 'lun001':
size 4TiB in 1048576 objects
order 22 (4MiB objects)
block_name_prefix: rbd_data.40dcf76b8b4567
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
flags:
create_timestamp: Tue Dec 22 07:51:02 2020
[root@node147 ~]# rbd map rbd/lun001
rbd: sysfs write failed
RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable lun001 object-map fast-diff deep-flatten".
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (6) No such device or address
[root@node147 ~]# uname -a
Linux node147 4.14.113-1.el7.x86_64 #1 SMP Wed Sep 9 17:22:41 CST 2020 x86_64 x86_64 x86_64 GNU/Linux
- After the kernel upgrade, rbd map succeeds and the image is deleted quickly
[root@node147 ~]# rbd info rbd/lun001
rbd image 'lun001':
size 4TiB in 1048576 objects
order 22 (4MiB objects)
block_name_prefix: rbd_data.40dcf76b8b4567
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
flags:
create_timestamp: Tue Dec 22 07:51:02 2020
[root@node147 ~]# rbd map rbd/lun001
/dev/rbd0
[root@node147 ~]# uname -a
Linux node147 5.10.2-1.el7.elrepo.x86_64 #1 SMP Sun Dec 20 09:53:23 EST 2020 x86_64 x86_64 x86_64 GNU/Linux
[root@node147 ~]# time rbd rm rbd/lun001
Removing image: 100% complete...done.
real 0m12.197s
user 0m0.599s
sys 0m0.028s
III. Additional notes
1. rbd feature reference
By default rbd enables the layering, exclusive-lock, object-map, fast-diff, and deep-flatten features (1 + 4 + 8 + 16 + 32 = 61)
[root@node117 ~]# ceph --show-config | grep rbd_default_feature
rbd_default_features = 61
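| Feature | Function | Bit value | Notes |
|---|---|---|---|
| layering | Layering support | 1 | Image cloning. You can snapshot an image, protect the snapshot, and clone a new image from it; parent and child images share object data via COW |
| striping | Striping v2 support | 2 | Stripes object data, similar to RAID 0; can improve performance for workloads dominated by sequential reads and writes |
| exclusive-lock | Exclusive lock support | 4 | Protects image data consistency: a client must hold the lock to modify the image. It can be viewed as a distributed lock. When it is enabled, make sure only one client is accessing the image, otherwise lock contention causes I/O to drop sharply. The main use case is QEMU live migration |
| object-map | Object map support (depends on exclusive-lock) | 8 | Depends on exclusive-lock. Because image objects are thin-provisioned, enabling this feature keeps a bitmap over all objects of the image that records whether each object actually exists; this can speed up I/O in some scenarios |
| fast-diff | Fast diff calculation (depends on object-map) | 16 | Depends on object-map and exclusive-lock. Quickly computes the differences between snapshots of an image |
| deep-flatten | Snapshot flattening support | 32 | With layering, a cloned child image shares objects with its parent via COW, so the two depend on each other. flatten removes the dependency between parent and child images, but the child's snapshots stay dependent; deep-flatten removes the snapshots' dependency as well |
| journaling | I/O journaling support (depends on exclusive-lock) | 64 | Depends on exclusive-lock. Journals all modifications to the image and can replicate them to another cluster (mirroring), enabling off-site disaster recovery for block storage. Deploying it requires an additional daemon process. At the time of writing it was still experimental, but it is an important feature for cross-cluster and cross-datacenter disaster recovery |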
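The rbd_default_features value is the sum of the bit values of the enabled features. The sketch below decodes and encodes such a mask using the bit values from the table above; the function names `decode_features` and `encode_features` are hypothetical helpers, not part of any Ceph API.

```python
# Bit values taken from the feature table above.
FEATURE_BITS = {
    "layering": 1,
    "striping": 2,
    "exclusive-lock": 4,
    "object-map": 8,
    "fast-diff": 16,
    "deep-flatten": 32,
    "journaling": 64,
}


def decode_features(mask):
    """Return the names of the features whose bits are set in mask."""
    return [name for name, bit in FEATURE_BITS.items() if mask & bit]


def encode_features(names):
    """Return the integer mask for a list of feature names."""
    return sum(FEATURE_BITS[name] for name in names)


print(decode_features(61))
print(encode_features(["layering", "exclusive-lock"]))
```

Decoding 61 yields the five default features, and the two features usable by the old 4.14 kernel encode to 5.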