day12-03-Mysql优化-锁
锁的监控及处理
1 锁等待模拟
概念:
Record Lock 记录锁、行级锁、持有哪行的锁
Next Lock 下键锁 (索引键的锁)
GAP Lock 间隙锁 (索引键的锁)
X 排他锁,(增删改操作)
S 共享锁(读锁),不会阻塞其他锁 select过程中才会有
mysql 的select 使用MVCC 的快照读,很少存在加锁的情况
select * from city where countrycode='USA' lock in shard mode;(单独加锁)
意向锁,是表级锁
分类:
意向共享锁(IS):事务想要获取一张表中某几行的共享锁
意向排它锁(IX):事务想要获取一张表中的某几行的排它锁
一旦有大量的锁等待(默认60秒),mysql就会通过回滚事务释放资源。回滚意味着会产生大量IO
tx1:
USE oldboy
UPDATE t_100w SET k1='av' WHERE id=10;
## tx2:
USE oldboy
UPDATE t_100w SET k1='az' WHERE id=10;
锁等待会出现CPU繁忙,CPU占比可能会比较高,
比如%Cpu(s): 3.1 us, 72.3 sy, 0.0 ni, 0.0 id, 23.6 wa
这就要怀疑锁等待情况了
监控锁状态
-
查看锁等待******
mysql> show status like 'innodb_row_lock%';
+-------------------------------+-----------+
| Variable_name | Value |
+-------------------------------+-----------+
| Innodb_row_lock_current_waits | 30 |
| Innodb_row_lock_time | 301845756 |
| Innodb_row_lock_time_avg | 141 |
| Innodb_row_lock_time_max | 121822 |
| Innodb_row_lock_waits | 2127701 |
+-------------------------------+-----------+
5 rows in set (0.01 sec)
# 主要关注
Innodb_row_lock_current_waits #当前有多少锁等待
Innodb_row_lock_waits #一共发生过多少个
-
查看哪个事务在等待(被阻塞了)
use information_schema;
select trx_id,trx_state,trx_mysql_thread_id,trx_query from information_schema.INNODB_TRX where trx_state='LOCK WAIT';
mysql> select trx_id,trx_state,trx_mysql_thread_id,trx_query from information_schema.INNODB_TRX where trx_state='LOCK WAIT';
+--------+-----------+---------------------+-------------------------------------+
| trx_id | trx_state | trx_mysql_thread_id | trx_query |
+--------+-----------+---------------------+-------------------------------------+
| 2908 | LOCK WAIT | 10785 | update t3 set name='aaa' where id=1 |
+--------+-----------+---------------------+-------------------------------------+
1 row in set (0.00 sec)
mysql>
trx_id: 被阻塞的事务ID号(引擎层)
trx_state: 当前事务状态
trx_mysql_thread_id: 连接层的,连接线程ID(show processlist --> id 或者 trx_id) ***重点关注
trx_query: 当前被阻塞的操作(一般是要丢给开发的)
-
查看锁源(谁锁的我)******
select * from sys.innodb_lock_waits; ## ====>被锁的和锁定它的之间关系
locked_table:哪张表出现的锁等待
locked_index:在哪个索引的锁
locked_type:锁的类型 (RECORD,GAP,NEXT LOCK)
waiting_trx_id: 等待的事务ID(与上个视图trx_id对应)
waiting_pid:等待事务的链接的线程ID (连接层的,连接线程ID(show processlist ---> id 或者 trx_id) ****重点关注
waiting_query:等待事务语句
waiting_lock_mode:等待锁的类型(X,S)
blocking_trx_id:锁源的事务ID
blocking_pid:锁源的事务链接线程ID
sql_kill_blocking_connection:处理建议
mysql> select locked_table,locked_index,locked_type,waiting_trx_id,waiting_pid,waiting_query,waiting_lock_mode,blocking_trx_id,blocking_pid,sql_kill_blocking_connection from sys.innodb_lock_waits;
+---------------+--------------+-------------+----------------+-------------+-------------------------------------+-------------------+-----------------+--------------+------------------------------+
| locked_table | locked_index | locked_type | waiting_trx_id | waiting_pid | waiting_query | waiting_lock_mode | blocking_trx_id | blocking_pid | sql_kill_blocking_connection |
+---------------+--------------+-------------+----------------+-------------+-------------------------------------+-------------------+-----------------+--------------+------------------------------+
| `taobao`.`t3` | PRIMARY | RECORD | 2909 | 10785 | update t3 set name='aaa' where id=1 | X | 2907 | 10756 | KILL 10756 |
+---------------+--------------+-------------+----------------+-------------+-------------------------------------+-------------------+-----------------+--------------+------------------------------+
1 row in set, 3 warnings (0.00 sec)
mysql>
-
根据锁源的pid,找到锁源的线程ID(thread_id)******
select * from performance_schema.threads
where processlist_id = 13825;
===> 41
performance_schema.threads 下需要关注的字段:
processlist_id # 锁源的事务链接PID (show processlist)
thread_id # PID分配负责执行SQL语句的线程ID,与PID是父子关系
NAME # 线程名称
processlist_id = sys.innodb_lock_waits.blocking_pid
mysql> select thread_id,NAME,processlist_id from performance_schema.threads where processlist_id=13825;
+-----------+---------------------------+----------------+
| thread_id | NAME | processlist_id |
+-----------+---------------------------+----------------+
| 13850 | thread/sql/one_connection | 13825 |
+-----------+---------------------------+----------------+
1 row in set (0.00 sec)
mysql>
# 查到 负责执行SQL的线程ID(thread_id)是 13850
-
找到锁源的SQL语句******
performance_schema.events_statements_current 下需要关注的字段: thread_id #执行SQL的线程ID event_name #事件名称 lock_time #锁定时间 sql_text #线程执行的SQL语句内容-
当前在执行的语句
select thread_id,event_name, lock_time,sql_text from performance_schema.events_statements_current where thread_id=13850; mysql> select -> thread_id,event_name, -> lock_time,sql_text -> from performance_schema.events_statements_current -> where thread_id=13850; +-----------+----------------------+-----------+------------------------------------+ | thread_id | event_name | lock_time | sql_text | +-----------+----------------------+-----------+------------------------------------+ | 13850 | statement/sql/update | 137000000 | update t3 set name='aa' where id=1 | +-----------+----------------------+-----------+------------------------------------+ 1 row in set (0.00 sec) mysql> -
执行语句的历史
select thread_id,event_name, lock_time,sql_text from performance_schema.events_statements_history where thread_id=13850; mysql> select -> thread_id,event_name, -> lock_time,sql_text -> from performance_schema.events_statements_history -> where thread_id=13850; +-----------+------------------------------+-----------+------------------------------------+ | thread_id | event_name | lock_time | sql_text | +-----------+------------------------------+-----------+------------------------------------+ | 13850 | statement/sql/update | 137000000 | update t3 set name='aa' where id=1 | | 13850 | statement/sql/select | 0 | SELECT DATABASE() | | 13850 | statement/com/Init DB | 0 | NULL | | 13850 | statement/sql/show_databases | 100000000 | show databases | | 13850 | statement/sql/show_tables | 55000000 | show tables | | 13850 | statement/com/Field List | 0 | NULL | | 13850 | statement/com/Field List | 0 | NULL | | 13850 | statement/com/Field List | 0 | NULL | | 13850 | statement/com/Field List | 0 | NULL | | 13850 | statement/sql/begin | 0 | begin | +-----------+------------------------------+-----------+------------------------------------+ 10 rows in set (0.00 sec) mysql> mysql>得出结果,丢给开发 表信息 被阻塞的 锁源SQL 练习: 一键获得以上信息,请写出具体的SQL语句
-
-
优化项目:锁的监控及处理
1. 背景: 硬件环境: DELL R720,E系列16核,48G MEM,SAS*900G*6,RAID10 在例行巡检时,发现9-11点时间段的CPU压力非常高(80-90%) 2. 项目的职责 2.1 通过top详细排查,发现mysqld进程占比达到了700-800% 2.2 其中有量的CPU是被用作的SYS和WAIT,us处于正常 2.3 怀疑是MySQL 锁 或者SQL语句出了问题 2.4 经过排查slowlog及锁等待情况,发现有大量锁等待及少量慢语句 (1) pt-query-diagest 查看慢日志 (2) 锁等待有没有? db03 [(none)]>show status like 'innodb_row_lock%'; +-------------------------------+-------+ | Variable_name | Value | +-------------------------------+-------+ | Innodb_row_lock_current_waits | 0 | | Innodb_row_lock_time | 0 | | Innodb_row_lock_time_avg | 0 | | Innodb_row_lock_time_max | 0 | | Innodb_row_lock_waits | 0 | +-------------------------------+-------+ 情况一: 有100多个current_waits,说明当前很多锁等待情况 情况二: 1000多个lock_waits,说明历史上发生过的锁等待很多 2.5 查看那个事务在等待(被阻塞了) 2.6 查看锁源事务信息(谁锁的我) 2.7 找到锁源的thread_id 2.8 找到锁源的SQL语句 3. 找到语句之后,和应用开发人员进行协商 (1) 开发人员描述,此语句是事务挂起导致 我们提出建议是临时kill 会话,最终解决问题 (2) 开发人员查看后,发现是业务逻辑问题导致的死锁,产生了大量锁等待 临时解决方案,将阻塞事务的会话kill掉. 最终解决方案,修改代码中的业务逻辑 项目结果: 经过排查处理,锁等待的个数减少80%.解决了CPU持续峰值的问题. 锁监控设计到的命令: show status like 'innodb_rows_lock%' select * from information_schema.innodb_trx; select * from sys.innodb_lock_waits; select * from performance_schema.threads; select * from performance_schema.events_statements_current; select * from performance_schema.events_statements_history; -
死锁监控
# 查看死锁监控 show engine innodb status\G # 查看死锁配置参数 mysql> show variables like '%deadlock%'; +----------------------------+-------+ | Variable_name | Value | +----------------------------+-------+ | innodb_deadlock_detect | ON | | innodb_print_all_deadlocks | OFF | +----------------------------+-------+ 2 rows in set (0.00 sec) mysql> # 还可以改配置文件,把死锁信息输入到错误日志里。 vim /etc/my.cnf innodb_print_all_deadlocks = 1 -
主从优化
## 5.7 从库多线程复制 Multi-threaded slave MTS 基本要求: 5.7以上的版本(忘记小版本) 必须开启GTID binlog必须是row模式 # 开启GTID gtid_mode=ON enforce_gtid_consistency=ON log_slave_updates=ON # Slave Muliti SQL_Thread # LOGICAL_CLOCK 本地时钟,基于事务级别的SQL并发回放,必须开启GTID mysql> show variables like 'slave_parallel_type'; +---------------------+----------+ | Variable_name | Value | +---------------------+----------+ | slave_parallel_type | DATABASE | +---------------------+----------+ 1 row in set (0.00 sec) mysql> select @@slave_parallel_type; +-----------------------+ | @@slave_parallel_type | +-----------------------+ | DATABASE | +-----------------------+ 1 row in set (0.00 sec) mysql> # slave_parallel_type 默认是 DATABASE,必须改成LOGICAL_CLOCK slave-parallel-type=LOGICAL_CLOCK slave-parallel-workers=16 #并发回放的线程数,一般是CPU核心个数的50% # master_info 和 relay_log_info 改成以表的方式存储,提高性能。默认是文件形式存储 master_info_repository=TABLE relay_log_info_repository=TABLE # 开启relay_log 恢复功能(binlog出现问题,可以通过relog做一些问题恢复) relay_log_recovery=ON 5.7 : slave-parallel-type=LOGICAL_CLOCK slave-parallel-workers=8 cpu核心数作为标准 CHANGE MASTER TO MASTER_HOST='10.0.0.128', MASTER_USER='repl', MASTER_PASSWORD='123', MASTER_PORT=3307, MASTER_AUTO_POSITION=1; start slave;

浙公网安备 33010602011771号