步骤1:查看写库的隔离级别

# 查看隔离级别
show variables like '%tx_isolation%'

或者

select @@global.tx_isolation
select @@session.tx_isolation

如果隔离级别为RC,则只有行锁,没有间隙锁。死锁概率会降低很多。

步骤2:查看最近一次的死锁

show engine innodb status

这个命令可以查看很多信息,包括最近一次的死锁日志。

步骤3:死锁日志解读

下面结合具体示例解读死锁日志。

LATEST DETECTED DEADLOCK
------------------------
2023-07-06 01:17:36 140215730644736  
*** (1) TRANSACTION:  // 事务1有关的信息
TRANSACTION 1860, ACTIVE 48 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1128, 2 row lock(s), undo log entries 1
MySQL thread id 8, OS thread handle 140216059787008, query id 61 localhost root updating
update user_account set money = 110 where id = 3 // 事务1正在执行的语句

*** (1) HOLDS THE LOCK(S):  // 事务1持有的锁
RECORD LOCKS space id 3 page no 4 n bits 72 index PRIMARY of table `chris`.`user_account` trx id 1860 lock_mode X locks rec but not gap  
// PRIMARY 主键索引,lock_mode X locks rec but not gap,X锁,行锁中的记录锁(不是间隙锁)
Record lock, heap no 5 PHYSICAL RECORD: n_fields 7; compact format; info bits 64
 0: len 4; hex 80000001; asc     ;;
 1: len 6; hex 000000000744; asc      D;;
 2: len 7; hex 0200000113044d; asc       M;;
 3: len 4; hex 8000006f; asc    o;;
 4: len 4; hex 80000001; asc     ;;
 5: len 4; hex 80000000; asc     ;;
 6: len 4; hex 8000005a; asc    Z;;


*** (1) WAITING FOR THIS LOCK TO BE GRANTED:  // 事务1等待的锁
RECORD LOCKS space id 3 page no 4 n bits 72 index PRIMARY of table `chris`.`user_account` trx id 1860 lock_mode X locks rec but not gap waiting
// PRIMARY 主键锁,lock_mode X locks rec but not gap 代表 记录排他锁,不是间隙锁
Record lock, heap no 6 PHYSICAL RECORD: n_fields 7; compact format; info bits 64
 0: len 4; hex 80000003; asc     ;;
 1: len 6; hex 000000000745; asc      E;;
 2: len 7; hex 010000011503e9; asc        ;;
 3: len 4; hex 8000014d; asc    M;;
 4: len 4; hex 80000002; asc     ;;
 5: len 4; hex 80000000; asc     ;;
 6: len 4; hex 8000006e; asc    n;;


*** (2) TRANSACTION: // 事务2有关的信息
TRANSACTION 1861, ACTIVE 34 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1128, 2 row lock(s), undo log entries 1
MySQL thread id 9, OS thread handle 140216058730240, query id 62 localhost root updating
update user_account set money = 90 where id = 1 // 事务2正在执行的语句

*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 3 page no 4 n bits 72 index PRIMARY of table `chris`.`user_account` trx id 1861 lock_mode X locks rec but not gap
// PRIMARY 主键锁,lock_mode X locks rec but not gap 代表 记录排他锁,不是间隙锁
Record lock, heap no 6 PHYSICAL RECORD: n_fields 7; compact format; info bits 64
 0: len 4; hex 80000003; asc     ;;
 1: len 6; hex 000000000745; asc      E;;
 2: len 7; hex 010000011503e9; asc        ;;
 3: len 4; hex 8000014d; asc    M;;
 4: len 4; hex 80000002; asc     ;;
 5: len 4; hex 80000000; asc     ;;
 6: len 4; hex 8000006e; asc    n;;


*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 3 page no 4 n bits 72 index PRIMARY of table `chris`.`user_account` trx id 1861 lock_mode X locks rec but not gap waiting
// PRIMARY 主键锁,lock_mode X locks rec but not gap 代表 记录排他锁,不是间隙锁

Record lock, heap no 5 PHYSICAL RECORD: n_fields 7; compact format; info bits 64
 0: len 4; hex 80000001; asc     ;;
 1: len 6; hex 000000000744; asc      D;;
 2: len 7; hex 0200000113044d; asc       M;;
 3: len 4; hex 8000006f; asc    o;;
 4: len 4; hex 80000001; asc     ;;
 5: len 4; hex 80000000; asc     ;;
 6: len 4; hex 8000005a; asc    Z;;

*** WE ROLL BACK TRANSACTION (2)
持有 等待
事务1 id=3主键索引的锁
事务2 id=1主键索引的锁

从日志可以看出,应该是有两个事务相互等待,造成了死锁。最终,回滚了事务2。那这两个事务为什么会相互等待?得结合业务代码来看了。

步骤4:梳理业务代码,梳理SQL执行顺序

根据造成死锁的SQL,我们得梳理代码,看看是哪个业务逻辑。

步骤5:结合应用日志打印,分析原因

对于一些修改操作,我们一般记录有详细的日志。结合日志,可以知道修改的时间和顺序,从而定位到是什么原因导致了死锁。

步骤6:降低死锁概率

在RC隔离级别下,降低死锁的措施有:

  • 尽量不使用大事务,减少锁定时间,降低死锁概率
  • 采用固定的顺序来加锁和释放锁
  • 一个大事务中,将涉及加锁的操作放到最后