MySQL Transaction--MySQL锁升级引发的死锁问题

测试环境

Server version:         5.7.26-29-log Percona Server (GPL)
transaction_isolation:	REPEATABLE-READ

测试数据

/* 1. 表结构 */
CREATE TABLE t  (
    id BIGINT UNSIGNED NOT NULL PRIMARY KEY COMMENT 'id, 无实际意义',
    account_id VARCHAR (64) NOT NULL COMMENT '用户id,不同app下的account_id可能重复',
    type TINYINT UNSIGNED NOT NULL COMMENT '余额类型 1:可用余额',
    balance BIGINT UNSIGNED NOT NULL DEFAULT 0 COMMENT '余额',
    state INT UNSIGNED NOT NULL DEFAULT 1 COMMENT '账户状态 1:NORMAL; 2:FROZE',
    UNIQUE KEY uk_account (account_id, type)
)ENGINE = INNODB DEFAULT CHARSET utf8mb4
COMMENT '测试';

/* 2. 其中 UNIQUE INDEX 为 uk_account(account_id, type) */

/* 3. 插入数据 */
insert into t values(1,'1',1,100,1);
insert into t values(2,'2',1,100,1);
insert into t values(3,'3',1,100,1);
insert into t values(4,'4',1,100,1);
insert into t values(5,'5',1,100,1);

/* 4. 查询所有数据. */
select * from t;

+----+------------+------+---------+-------+
| id | account_id | type | balance | state |
+----+------------+------+---------+-------+
|  1 | 1          |    1 |     100 |     1 |
|  2 | 2          |    1 |     100 |     1 |
|  3 | 3          |    1 |     100 |     1 |
|  4 | 4          |    1 |     100 |     1 |
|  5 | 5          |    1 |     100 |     1 |
+----+------------+------+---------+-------+

测试场景1

会话1开启事务并执行(成功执行):

BEGIN;
SELECT * FROM t 
WHERE account_id = '1' 
AND TYPE =1 
FOR UPDATE;

会话2开启事务并执行(事务被阻塞):

BEGIN;
SELECT * FROM t 
WHERE account_id = '1' 
AND TYPE =1 
FOR UPDATE;

会话3开启事务并执行(事务被阻塞):

BEGIN;
SELECT * FROM t 
WHERE account_id = '1' 
AND TYPE =1 
FOR UPDATE;

查看阻塞事务信息:

## 查看阻塞事务信息
SELECT
p2.`ID` blocked_process_id,
p2.`HOST` blocked_host,
p2.`USER` blocked_user,
r.trx_id bloecked_trx_id,
r.trx_state as blocked_trx_state,
r.trx_started as blocked_trx_started,
TIMESTAMPDIFF(SECOND,r.trx_wait_started,CURRENT_TIMESTAMP) blocked_wait_seconds,
r.trx_query blocked_query,
concat('index: ',l.lock_table,'.',m.`lock_index`,', lock_mode:',m.`lock_type`,',lock_mode:',m.`lock_mode`) as blocked_lock_info,
m.lock_data blocked_lock_data,
p.`ID` blocking_process_id,
p.`HOST` blocking_host,
p.`USER` blocking_user,
b.trx_id blocking_trx_id,
b.trx_state as blocking_trx_state,
b.trx_started as blocking_trx_started,
b.trx_query blocking_query,
IF (p.COMMAND = 'Sleep', CONCAT(p.TIME,' seconds'), 0) blocking_thread_idle_seconds,
CONCAT('kill ',p.`ID`,';') kill_sql
FROM information_schema.INNODB_LOCK_WAITS w
INNER JOIN information_schema.INNODB_TRX b
ON b.trx_id = w.blocking_trx_id
INNER JOIN information_schema.INNODB_TRX r
ON r.trx_id = w.requesting_trx_id
INNER JOIN information_schema.INNODB_LOCKS l
ON w.blocking_lock_id = l.lock_id
AND l.`lock_trx_id`=b.`trx_id`
INNER JOIN information_schema.INNODB_LOCKS m
ON m.`lock_id`=w.`requested_lock_id`
AND m.`lock_trx_id`=r.`trx_id`
INNER JOIN information_schema.PROCESSLIST p
ON p.ID = b.trx_mysql_thread_id
INNER JOIN information_schema.PROCESSLIST p2
ON p2.ID = r.trx_mysql_thread_id
ORDER BY blocked_wait_seconds DESC ;


*************************** 1. row ***************************
          blocked_process_id: 155692
                blocked_host: 127.0.0.1:52486
                blocked_user: wenjiag.gao
             bloecked_trx_id: 157794
           blocked_trx_state: LOCK WAIT
         blocked_trx_started: 2022-04-07 15:32:38
        blocked_wait_seconds: 40
               blocked_query: SELECT * FROM t  WHERE account_id = '1'  AND TYPE =1  FOR UPDATE
           blocked_lock_info: index: `test_db`.`t`.uk_account, lock_mode:RECORD,lock_mode:X
           blocked_lock_data: '1', 1
         blocking_process_id: 155690
               blocking_host: 127.0.0.1:52384
               blocking_user: wenjiag.gao
             blocking_trx_id: 157793
          blocking_trx_state: RUNNING
        blocking_trx_started: 2022-04-07 15:32:34
              blocking_query: NULL
blocking_thread_idle_seconds: 102 seconds
                    kill_sql: kill 155690;
*************************** 2. row ***************************
          blocked_process_id: 155691
                blocked_host: 127.0.0.1:52480
                blocked_user: wenjiag.gao
             bloecked_trx_id: 157795
           blocked_trx_state: LOCK WAIT
         blocked_trx_started: 2022-04-07 15:32:48
        blocked_wait_seconds: 35
               blocked_query: SELECT * FROM t  WHERE account_id = '1'  AND TYPE =1  FOR UPDATE
           blocked_lock_info: index: `test_db`.`t`.uk_account, lock_mode:RECORD,lock_mode:X
           blocked_lock_data: '1', 1
         blocking_process_id: 155692
               blocking_host: 127.0.0.1:52486
               blocking_user: wenjiag.gao
             blocking_trx_id: 157794
          blocking_trx_state: LOCK WAIT
        blocking_trx_started: 2022-04-07 15:32:38
              blocking_query: SELECT * FROM t  WHERE account_id = '1'  AND TYPE =1  FOR UPDATE
blocking_thread_idle_seconds: 0
                    kill_sql: kill 155692;
*************************** 3. row ***************************
          blocked_process_id: 155691
                blocked_host: 127.0.0.1:52480
                blocked_user: wenjiag.gao
             bloecked_trx_id: 157795
           blocked_trx_state: LOCK WAIT
         blocked_trx_started: 2022-04-07 15:32:48
        blocked_wait_seconds: 35
               blocked_query: SELECT * FROM t  WHERE account_id = '1'  AND TYPE =1  FOR UPDATE
           blocked_lock_info: index: `test_db`.`t`.uk_account, lock_mode:RECORD,lock_mode:X
           blocked_lock_data: '1', 1
         blocking_process_id: 155690
               blocking_host: 127.0.0.1:52384
               blocking_user: wenjiag.gao
             blocking_trx_id: 157793
          blocking_trx_state: RUNNING
        blocking_trx_started: 2022-04-07 15:32:34
              blocking_query: NULL
blocking_thread_idle_seconds: 102 seconds
                    kill_sql: kill 155690;
3 rows in set, 3 warnings (0.00 sec)

查看阻塞锁信息:

## 输出锁信息
SET GLOBAL innodb_status_output_locks = ON;
## 查看事务信息
SHOW ENGINE INNODB STATUS \G

---TRANSACTION 157795, ACTIVE 57 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 155691, OS thread handle 139785549563648, query id 622725 127.0.0.1 wenjiag.gao statistics
SELECT * FROM t  WHERE account_id = '1'  AND TYPE =1  FOR UPDATE
------- TRX HAS BEEN WAITING 4 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 30 page no 4 n bits 72 index uk_account of table `test_db`.`t` trx id 157795 lock_mode X locks rec but not gap waiting
------------------
TABLE LOCK table `test_db`.`t` trx id 157795 lock mode IX
RECORD LOCKS space id 30 page no 4 n bits 72 index uk_account of table `test_db`.`t` trx id 157795 lock_mode X locks rec but not gap waiting


---TRANSACTION 157794, ACTIVE 67 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 155692, OS thread handle 139785549838080, query id 622724 127.0.0.1 wenjiag.gao statistics
SELECT * FROM t  WHERE account_id = '1'  AND TYPE =1  FOR UPDATE
------- TRX HAS BEEN WAITING 9 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 30 page no 4 n bits 72 index uk_account of table `test_db`.`t` trx id 157794 lock_mode X locks rec but not gap waiting
------------------
TABLE LOCK table `test_db`.`t` trx id 157794 lock mode IX
RECORD LOCKS space id 30 page no 4 n bits 72 index uk_account of table `test_db`.`t` trx id 157794 lock_mode X locks rec but not gap waiting


---TRANSACTION 157793, ACTIVE 71 sec
3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 155690, OS thread handle 139785550108416, query id 622719 127.0.0.1 wenjiag.gao
TABLE LOCK table `test_db`.`t` trx id 157793 lock mode IX
RECORD LOCKS space id 30 page no 4 n bits 72 index uk_account of table `test_db`.`t` trx id 157793 lock_mode X locks rec but not gap
RECORD LOCKS space id 30 page no 3 n bits 72 index PRIMARY of table `test_db`.`t` trx id 157793 lock_mode X locks rec but not gap

从上面的事务阻塞信息和事务锁信息可得到:

  • 会话1执行请求,获取到唯一索引uk_account上(account_id,TYPE)为('1',1)的行锁,获取到主键索引上(id)为(1)的行锁。
  • 会话2执行请求,被会话1阻塞,等待唯一索引uk_account上(account_id,TYPE)为('1',1)的行锁。
  • 会话3执行请求,被会话1和会话2阻塞,等待唯一索引uk_account上(account_id,TYPE)为('1',1)的行锁。

虽然会话2并未获得唯一索引uk_account上(account_id,TYPE)为('1',1)的行锁(处于等待锁的状态),由于会话3同样申请会话2等待的锁资源且会话3在会话2后执行,因此MySQL判断会话3被会话2阻塞。

测试场景2

会话1开启事务并执行(成功执行):

BEGIN;
SELECT * FROM t 
WHERE account_id = '1' 
AND TYPE =1 
FOR UPDATE;

会话2开始事务执行(事务被阻塞):

BEGIN;
SELECT * FROM t 
WHERE account_id = '1' 
AND TYPE =1 
FOR UPDATE;

会话1继续执行(执行成功):

UPDATE t 
SET state = 2 
WHERE account_id = '1';

会话2出现死锁被回滚:

ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transactio

根据上个测试案例的测试结果分析:

  • 会话1执行请求(第一次),获取到唯一索引uk_account上(account_id,TYPE)为('1',1)的行锁,获取到主键索引上(id)为(1)的行锁。
  • 会话2执行请求,被会话1阻塞,等待唯一索引uk_account上(account_id,TYPE)为('1',1)的行锁。
  • 会话1执行请求(第二次),由于条件为account_id = '1',只能使用唯一索引uk_account上(account_id,TYPE)的第一列account_id,且在可重复读事务隔离级别下执行,需要申请唯一索引uk_account上:
    • (account_id,TYPE)为('1',1)记录之前的间隙锁,防止其他事务在该记录前插入account_id = '1'的记录如('1',0)。
    • (account_id,TYPE)为('1',1)记录的行锁,防止其他事务修改该记录。
    • (account_id,TYPE)为('1',1)记录之后的间隙锁,防止其他事务在该记录前插入account_id = '1'的记录如('1',2)。
    • (account_id,TYPE)为('1',1)记录的行锁和该记录后的间隙锁合为Next-Key锁。
  • 会话1的第二次执行申请唯一索引uk_account上(account_id,TYPE)为('1',1)的行锁,而会话2同样申请该行锁且在会话1的第二次操作前执行,因此会话1的第二次执行会被会话2阻塞,触发MySQL死锁检测机制,发现死锁环路:
    • 会话2等待会话1释放唯一索引uk_account上(account_id,TYPE)为('1',1)的行锁。
    • 会话1等待会话2释放唯一索引uk_account上(account_id,TYPE)为('1',1)的行锁。
  • 死锁检测机制挑选会话2作为死锁牺牲者,将其回滚。
  • 会话2回滚后,不再阻塞会话1的第二次执行,会话1申请到锁资源并成功执行。

在MySQL早期版本中,即使事务之前操作已经获取到相应的锁资源,在后续操作如需"更大锁资源"时会尝试申请锁资源而不是立即获得该锁资源,在MySQL 8.0.18修改该问题:

InnoDB: A deadlock was possible when a transaction tries to upgrade a record lock to a next key lock. (Bug #23755664, Bug #82127)

PS: 如果会话1执行的SQL仍是SELECT * FROM t WHERE account_id = '1' AND TYPE =1 FOR UPDATE;,无需锁升级也不会被会话2阻塞。

参考学习

posted @ 2022-04-07 16:28  TeyGao  阅读(213)  评论(0编辑  收藏  举报