Handling High Load in TiDB

Environment: TiDB v4.0.4

The problem: a large SQL statement was executed a while ago, the TiKV nodes hung, and the SQL sessions were never released.

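I. Check whether any node is down, and restart it manually if so
tiup cluster stop tidb -N 172.21.210.36:20160    # stop a single node (note: "tidb" here is the cluster name, not a component name such as tidb, tikv, or pd)
tiup cluster start tidb -N 172.21.210.36:20160   # start a single node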
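
To spot the node that needs restarting, the display output can be filtered for instances whose status is not Up. The following is a minimal sketch, not part of tiup: the function name list_down_nodes is hypothetical, and it assumes the column layout printed by `tiup cluster display` in this post (ID, Role, Host, Ports, OS/Arch, Status, ...).

```shell
# Read `tiup cluster display <cluster>` output on stdin and print the
# ID, role, and status of every instance whose Status does not start with "Up".
# Assumption: Status is the 6th whitespace-separated column, as in this post.
list_down_nodes() {
  awk '
    # Instance rows are the only rows whose first field ends in :<port>.
    $1 ~ /:[0-9]+$/ && $6 !~ /^Up/ { print $1, $2, $6 }
  '
}

# Usage (the cluster in this post is named "tidb"):
#   tiup cluster display tidb | list_down_nodes
```

Statuses such as Up|L (the PD leader) still start with "Up" and are therefore not reported.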
II. Run ANALYZE on the large tables
ANALYZE TABLE rkw_prod.oauth_access_token;
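
When several large tables are involved, the statements can be generated rather than typed by hand. The sketch below is only an illustration: gen_analyze_sql is a hypothetical helper, and it pairs each ANALYZE TABLE with TiDB's SHOW STATS_HEALTHY statement so the statistics health can be inspected first. It only prints SQL; nothing is executed until the output is piped to a client.

```shell
# Print a statistics health check followed by ANALYZE TABLE
# for each argument given in db.table form.
gen_analyze_sql() {
  for t in "$@"; do
    db=${t%%.*}    # part before the first dot
    tbl=${t#*.}    # part after the first dot
    printf "SHOW STATS_HEALTHY WHERE Db_name = '%s' AND Table_name = '%s';\n" "$db" "$tbl"
    printf "ANALYZE TABLE %s.%s;\n" "$db" "$tbl"
  done
}

# Usage (host, port, and user are assumptions; adjust to your cluster):
#   gen_analyze_sql rkw_prod.oauth_access_token | mysql -h 172.21.210.32 -P 4000 -u root
```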
III. Log in to each TiDB node and manually kill the long-running zombie SQL
1. First list the TiDB nodes in the cluster
[root@host-172-21-210-32 ~]# tiup cluster display tidb
Starting component `cluster`:  display tidb
TiDB Cluster: tidb
TiDB Version: v4.0.4
ID                   Role          Host           Ports        OS/Arch       Status  Data Dir                            Deploy Dir
--                   ----          ----           -----        -------       ------  --------                            ----------
172.21.210.32:9093   alertmanager  172.21.210.32  9093/9094    linux/x86_64  Up      /data1/tidb-data/alertmanager-9093  /data1/tidb-deploy/alertmanager-9093
172.21.210.32:3000   grafana       172.21.210.32  3000         linux/x86_64  Up      -                                   /data1/tidb-deploy/grafana-3000
172.21.210.32:2379   pd            172.21.210.32  2379/2380    linux/x86_64  Up      /data1/tidb-data/pd-2379            /data1/tidb-deploy/pd-2379
172.21.210.33:2379   pd            172.21.210.33  2379/2380    linux/x86_64  Up|L    /data1/tidb-data/pd-2379            /data1/tidb-deploy/pd-2379
172.21.210.37:2379   pd            172.21.210.37  2379/2380    linux/x86_64  Up      /data1/tidb-data/pd-2379            /data1/tidb-deploy/pd-2379
172.21.210.32:9090   prometheus    172.21.210.32  9090         linux/x86_64  Up      /data1/tidb-data/prometheus-9090    /data1/tidb-deploy/prometheus-9090
172.21.210.32:4000   tidb          172.21.210.32  4000/10080   linux/x86_64  Up      -                                   /data1/tidb-deploy/tidb-4000
172.21.210.33:4000   tidb          172.21.210.33  4000/10080   linux/x86_64  Up      -                                   /data1/tidb-deploy/tidb-4000
172.21.210.39:4000   tidb          172.21.210.39  4000/10080   linux/x86_64  Up      -                                   /data1/tidb-deploy/tidb-4000
172.21.210.26:20160  tikv          172.21.210.26  20160/20180  linux/x86_64  Up      /data1/tidb-data/tikv-20160         /data1/tidb-deploy/tikv-20160
172.21.210.27:20160  tikv          172.21.210.27  20160/20180  linux/x86_64  Up      /data1/tidb-data/tikv-20160         /data1/tidb-deploy/tikv-20160
172.21.210.35:20160  tikv          172.21.210.35  20160/20180  linux/x86_64  Up      /data1/tidb-data/tikv-20160         /data1/tidb-deploy/tikv-20160
172.21.210.36:20160  tikv          172.21.210.36  20160/20180  linux/x86_64  Up      /data1/tidb-data/tikv-20160         /data1/tidb-deploy/tikv-20160
2. Log in to each TiDB node and generate the kill commands for long-running SQL
mysql> select concat('kill tidb ',id,';') from INFORMATION_SCHEMA.processlist where info is not NULL and time > 1000;
+-----------------------------+
| concat('kill tidb ',id,';') |
+-----------------------------+
| kill tidb 6795433;          |
| kill tidb 6954589;          |
| kill tidb 6753361;          |
| kill tidb 6436120;          |
+-----------------------------+
12 rows in set (0.00 sec)
3. Run the kill commands
kill tidb 6795433;
4. Wait a while, then confirm in the monitoring dashboards that the load has recovered
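
Steps 2 and 3 above have to be repeated on every TiDB node, because INFORMATION_SCHEMA.PROCESSLIST only lists sessions on the node you are connected to. The loop can be sketched roughly as follows. The function names are hypothetical, and the sketch assumes the mysql client can log in as root on port 4000 without prompting (for example through ~/.my.cnf); adjust both to your setup.

```shell
# Print KILL TIDB statements for sessions on one TiDB node that have been
# running a statement for more than the given number of seconds.
#   $1 = TiDB host, $2 = threshold in seconds
gen_kill_sql() {
  mysql -h "$1" -P 4000 -u root -N -B -e \
    "SELECT CONCAT('KILL TIDB ', id, ';') FROM INFORMATION_SCHEMA.PROCESSLIST WHERE info IS NOT NULL AND time > $2;"
}

# Print the statements for every TiDB node so they can be reviewed first.
#   $1 = threshold in seconds, remaining arguments = TiDB hosts
review_all() {
  threshold=$1
  shift
  for host in "$@"; do
    echo "-- $host"
    gen_kill_sql "$host" "$threshold"
  done
}

# Usage, with the TiDB nodes from the display output in this post:
#   review_all 1000 172.21.210.32 172.21.210.33 172.21.210.39
# After reviewing, run the statements on the node they came from:
#   gen_kill_sql 172.21.210.32 1000 | mysql -h 172.21.210.32 -P 4000 -u root
```

Printing the statements before executing them keeps a manual review step in place, which matters because the filter is based only on execution time and may match legitimate long-running jobs.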

posted @ 2021-04-16 16:37  苍茫宇宙