Common Ceph problems: requests are blocked
How to resolve the Ceph "requests are blocked" warning
Background:
The following warning shows up frequently in Ceph environments:
[root@xxx ~]# ceph -s
cluster dc4f91c1-8792-4948-b68f-2fcea75f53b9
health HEALTH_WARN 1 requests are blocked > 32 sec
monmap e3: 5 mons at {xxx-cinder015-128055=240.30.128.55:6789/0,xxx-ceph-cinder017-128057=240.30.128.57:6789/0,xxx-ceph-cinder024-128074=240.30.128.74:6789/0,xxx-ceph-cinder025-128075=240.30.128.75:6789/0,xxx-ceph-cinder026-128076=240.30.128.76:6789/0}, election epoch 216, quorum 0,1,2,3,4 xxx-ceph-cinder015-128055,hh-yun-ceph-cinder017-128057,xxx-ceph-cinder024-128074,xxx-ceph-cinder025-128075,xxx-ceph-cinder026-128076
osdmap e97975: 190 osds: 190 up, 190 in
pgmap v13666786: 20544 pgs, 2 pools, 77479 GB data, 19508 kobjects
228 TB used, 426 TB / 654 TB avail
20542 active+clean
2 active+clean+scrubbing+deep
client io 47657 kB/s rd, 164 MB/s wr, 5406 op/s
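The warning "1 requests are blocked > 32 sec" can appear during data migration: a client is accessing a piece of data, and before the access completes that data is migrated to another OSD. The request is then blocked, which also affects users.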
Solution:
1. Find the blocked requests
(ceph-mon)[root@control01 /]# ceph health detail
HEALTH_WARN 2 requests are blocked > 32 sec; 1 osds have slow requests
2 ops are blocked > 4194.3 sec on osd.5
1 osds have slow requests
The output shows that osd.5 has 2 blocked ops.
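When several OSDs report slow requests, picking the IDs out by eye gets tedious. The following is a minimal sketch that extracts them from the text of `ceph health detail`. It assumes the line format shown in the output above ("N ops are blocked > T sec on osd.X"); other Ceph releases may word this differently, and the function name `blocked_osds` is only illustrative.

```python
import re
from typing import Dict

# Matches lines such as "2 ops are blocked > 4194.3 sec on osd.5".
# Assumption: this is the format printed by the Ceph release shown above.
BLOCKED_RE = re.compile(r"(\d+) ops are blocked > ([\d.]+) sec on osd\.(\d+)")


def blocked_osds(health_detail: str) -> Dict[int, int]:
    """Return {osd_id: blocked_op_count} parsed from `ceph health detail` text."""
    counts: Dict[int, int] = {}
    for match in BLOCKED_RE.finditer(health_detail):
        ops, _seconds, osd_id = match.groups()
        # One OSD can appear on several lines (different age buckets), so sum them.
        counts[int(osd_id)] = counts.get(int(osd_id), 0) + int(ops)
    return counts


# Sample text copied from the output in step 1.
HEALTH_SAMPLE = """HEALTH_WARN 2 requests are blocked > 32 sec; 1 osds have slow requests
2 ops are blocked > 4194.3 sec on osd.5
1 osds have slow requests"""

if __name__ == "__main__":
    print(blocked_osds(HEALTH_SAMPLE))
```

For the sample above, the result maps OSD 5 to 2 blocked ops, which matches what the raw output says.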
2. Find the host that the OSD runs on
For example:
[root@TX-LNSGF-MANAGE-01 ~]# ceph osd find 5
{
"osd": 5,
"ip": "10.64.251.105:6809\/988225",
"crush_location": {
"host": "TX-LNSGF-STORAGE-05",
"root": "default"
}
}
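Because `ceph osd find` prints JSON, the host can also be read programmatically. This is a small sketch that assumes the JSON layout shown above (a `crush_location` object with a `host` key); the function name `osd_location` is hypothetical.

```python
import json
from typing import Tuple


def osd_location(osd_find_json: str) -> Tuple[str, str]:
    """Return (host, ip_address) from the JSON printed by `ceph osd find <id>`."""
    info = json.loads(osd_find_json)
    host = info["crush_location"]["host"]
    # The "ip" field looks like "10.64.251.105:6809/988225"; keep only the address.
    address = info["ip"].split(":")[0]
    return host, address


# Sample copied from the output above. The raw string keeps the "\/" escape,
# which is valid JSON for a forward slash.
FIND_SAMPLE = r'''{
    "osd": 5,
    "ip": "10.64.251.105:6809\/988225",
    "crush_location": {
        "host": "TX-LNSGF-STORAGE-05",
        "root": "default"
    }
}'''

if __name__ == "__main__":
    print(osd_location(FIND_SAMPLE))
```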
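3. Restart the OSD service (on the host found in step 2)
systemctl restart ceph-osd@5
Ceph then runs recovery for that OSD. During recovery the blocked request is disconnected, so the client sends its request to the mon nodes again, obtains the new PG map with the latest data location, and the problem above is resolved.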
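If the restart has to be issued from an admin node, the command can be assembled from the OSD ID and host found in the earlier steps. The sketch below is only an illustration: running it over `ssh` is an assumption about how the storage host is reached, and `restart_command` and `restart_osd` are hypothetical helper names. It defaults to a dry run so that nothing is restarted by accident.

```python
import subprocess
from typing import List


def restart_command(osd_id: int, host: str = "") -> List[str]:
    """Build the systemctl command that restarts one OSD daemon."""
    cmd = ["systemctl", "restart", f"ceph-osd@{osd_id}"]
    if host:
        # Assumption: the storage host is reachable with plain ssh.
        cmd = ["ssh", host] + cmd
    return cmd


def restart_osd(osd_id: int, host: str = "", dry_run: bool = True) -> List[str]:
    """Print the command (dry run) or execute it and raise on failure."""
    cmd = restart_command(osd_id, host)
    if dry_run:
        print(" ".join(cmd))
    else:
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    restart_osd(5, "TX-LNSGF-STORAGE-05")  # dry run: only prints the command
```

Restarting an OSD triggers recovery traffic, so it is worth restarting one OSD at a time and checking the cluster state in between.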
4. Check the cluster status
(ceph-mon)[root@control01 /]# ceph -s
cluster b233a0b7-4e21-4375-bca8-e215c056cc25
health HEALTH_OK
monmap e1: 3 mons at
{10.254.253.1=10.254.253.1:6789/0,10.254.253.2=10.254.253.2:6789/0,10.254.253.3=10.254.253.3:6789/
election epoch 26, quorum 0,1,2 10.254.253.1,10.254.253.2,10.254.253.3
osdmap e387: 90 osds: 90 up, 90 in
flags sortbitwise,require_jewel_osds
pgmap v1730238: 1008 pgs, 11 pools, 3498 GB data, 886 kobjects
10453 GB used, 235 TB / 245 TB avail
1006 active+clean
2 active+clean+scrubbing+deep
client io 1090 kB/s rd, 92507 kB/s wr, 778 op/s rd, 904 op/s wr
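Recovery can take a while, so the status check may need repeating. The following sketch polls `ceph health` until it reports HEALTH_OK. The retry count and delay are arbitrary assumptions, and `wait_for_health_ok` is a hypothetical helper; the health source is passed in as a function so the loop can be tried without a cluster.

```python
import subprocess
import time
from typing import Callable


def ceph_health() -> str:
    """Run `ceph health` and return its first word, e.g. HEALTH_OK or HEALTH_WARN."""
    out = subprocess.run(
        ["ceph", "health"], capture_output=True, text=True, check=True
    ).stdout
    words = out.split()
    return words[0] if words else ""


def wait_for_health_ok(
    get_health: Callable[[], str] = ceph_health,
    attempts: int = 30,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the cluster reports HEALTH_OK; return False if it never does."""
    for attempt in range(attempts):
        if get_health() == "HEALTH_OK":
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next check
    return False


if __name__ == "__main__":
    # Simulated sequence of states, so this runs without a Ceph cluster.
    states = iter(["HEALTH_WARN", "HEALTH_WARN", "HEALTH_OK"])
    print(wait_for_health_ok(lambda: next(states), attempts=5, delay=0))
```

On a real cluster, calling `wait_for_health_ok()` with its defaults checks every 10 seconds for up to 30 attempts.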
