MySQL CPU使用彪高,如何快速找到源头

通常在MySQL的日常运维中,会突然发现cpu使用彪高,如何快速找到罪魁祸首呢?通常我们只需要执行show processlist 进行查看,一般执行时间最长的SQL八九不离十就是罪魁祸首,但当show processlist的输出有近千条,那么很难第一眼就发现有问题的SQL,那么如何快速找到呢?其实也非常简单。我们知道mysqld是单进程多线程。那么我们可以使用top获取mysqld进程的各个线程的cpu使用情况。

top -H -p mysqld_pid

我们这里跑一条耗费cpu的SQL来测试一下效果

select * from cpu_h order by rand() limit 1;

 可以看到是线程22682占用cpu比较高,接着通过查看performance_schema.threads表,找到相关SQL:

select * from performance_schema.threads where thread_os_id=22682

 我们再通过show processlist看看

 可以看见找到的没错,我们找到processlist id以后,就可以直接使用命令kill了。

我这里把相关的命令封装成了一个python脚本,代码如下(同时也添加了统计活跃线程SQL执行的次数):

#!/usr/bin/python
# -*- coding:utf-8 -*-
import argparse
import MySQLdb
import argparse
import commands
import sys
import MySQLdb.cursors
from warnings import filterwarnings
from warnings import resetwarnings
filterwarnings('ignore', category = MySQLdb.Warning)
reload(sys)
sys.setdefaultencoding('utf8')

def init_parse():
    parser = argparse.ArgumentParser(
        epilog='by yayun @2022',
    )
    parser.add_argument('-n','--num',required=False,default=1,help='获取多少条最耗费cpu的记录,默认1条')
    parser.add_argument('-a','--active',action='store_true',default=False,help='统计活跃线程各类SQL的条目数')
    return parser

def mysql_exec(sql):
    try:
        conn=MySQLdb.connect(host='127.0.0.1',user='root',passwd='xx',port=3306,connect_timeout=15,charset='utf8')
        curs = conn.cursor()
        curs.execute(sql)
        conn.commit()
        curs.close()
        conn.close()
    except Exception,e:
       print "mysql execute: " + str(e)

def mysql_query(sql):
    conn=MySQLdb.connect(host='127.0.0.1',user='root',passwd='xx',port=3306,connect_timeout=15,charset='utf8',cursorclass = MySQLdb.cursors.DictCursor)
    cursor = conn.cursor()
    count=cursor.execute(sql)
    if count == 0 :
        result=0
    else:
        result=cursor.fetchall()
    return result
    cursor.close()
    conn.close()

if __name__ == '__main__':
    parser = init_parse()
    args = parser.parse_args()
    slow_sql_numbers=args.num
    active_thread_status=args.active
    mysqld_pid = commands.getoutput("cat /data/mysql/3306/pid_mysql.pid")
    get_mysql_thead_cmd="top -H -p %s  -n 1 | grep mysqld | head -n %s | awk '{print $1,$2}' | sed 's/mysql//g'" % (mysqld_pid,slow_sql_numbers)
    tmp_mysqld_thread=commands.getoutput(get_mysql_thead_cmd).split()
    mysqld_thread=[]
    for i in tmp_mysqld_thread:
        try:
             a=i.replace('\x1b[0;10m\x1b[0;10m','')
             a=i.replace('\x1b[0;10m','')
             mysqld_thread.append(int(a))
        except Exception,e:
            pass
    active_thread_sql="select  * from ( select max(user) as user,max(db) as db , max(info) as runningsql,count(*) as rungingsql_cnt from information_schema.processlist where db is not null  and info is not null and info like '%SELECT%' group by user, db, md5(info)   union all     select max(user) as user,max(db) as db ,max(info) as runningsql,count(*) as rungingsql_cnt from information_schema.processlist where db is not null  and info is not null and info like 'UPDATE%' group by user, db,md5(info)  ) a order by 4 desc limit 10"

    if active_thread_status:
        resut=mysql_query(active_thread_sql)
        if resut:
            for i in resut:
                print "运行条目统计: %s | 运行SQL: %s  | 运行用户:%s " % (i['rungingsql_cnt'],i['runningsql'],i['user'])
        print "============================================="    
    processlist_id=[]
    if len(mysqld_thread) >= 2:
        new_mysqld_thread=tuple(mysqld_thread)
        get_slow_sql="select PROCESSLIST_ID,PROCESSLIST_USER,PROCESSLIST_HOST,\
                      PROCESSLIST_DB,PROCESSLIST_TIME,PROCESSLIST_INFO from performance_schema.threads where thread_os_id in %s" % (new_mysqld_thread,)
    else:
        new_mysqld_thread=mysqld_thread
        get_slow_sql="select PROCESSLIST_ID,PROCESSLIST_USER,PROCESSLIST_HOST,\
                      PROCESSLIST_DB,PROCESSLIST_TIME,PROCESSLIST_INFO from performance_schema.threads where thread_os_id=%s" % (new_mysqld_thread[0])
    new_resut=mysql_query(get_slow_sql)
    for b in new_resut:
        if b['PROCESSLIST_ID']:
            processlist_id.append(b['PROCESSLIST_ID'])
        if b['PROCESSLIST_INFO'] != None:
            print "SQL已运行时间: %s秒|执行SQL用户: %s |执行SQL来源ip: %s |操作的库: %s |SQL详情: %s |PROCESSLIST ID: %s" % (b['PROCESSLIST_TIME'],b['PROCESSLIST_USER'],b['PROCESSLIST_HOST'],b['PROCESSLIST_DB'],b['PROCESSLIST_INFO'],b['PROCESSLIST_ID'])

    if processlist_id:
        print "============================================="
        print "以上耗费CPU资源的SQL是否需要杀掉? 请输入Y/N"
        while True:
            in_content = raw_input("请输入:")
            if in_content == "Y":
                for p in processlist_id:
                    sql="kill %s" % (p) 
                    mysql_exec(sql)
                exit(0)
            elif in_content == "N":
                print("退出程序!")
                exit(0)
            else:
                print("输入有误,请重输入!")

运行效果如下:

 同时结合活跃线程各类SQL运行的次数,可以非常快即可定位到问题。

 

posted @ 2022-03-11 08:34  yayun  阅读(256)  评论(0编辑  收藏  举报