代码改变世界

并发update导致的死锁一案例及优化建议

2018-05-29 11:35  Kevin.hhl  阅读(5195)  评论(0)    收藏  举报

一、死锁信息

 1.1  数据库死锁信息

------------------------
LATEST DETECTED DEADLOCK
------------------------
2018-05-26 09:14:53 2b3617c5a700
*** (1) TRANSACTION:
TRANSACTION 2615948256, ACTIVE 0.015 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 181 lock struct(s), heap size 30248, 2 row lock(s)
LOCK BLOCKING MySQL thread id: 34075661 block 50571115
MySQL thread id 50571115, OS thread handle 0x2b3658df6700, query id 14020145220 172.24.17.194 app_redcliffc Searching rows for update
UPDATE `order` SET rider_id = 489271, status = 15, dispatch_tm = '2018-05-26 09:14:53.343', arrive_tm = '2018-05-26 09:14:53.343', accept_tm = '2018-05-26 09:14:53.343', leave_tm = '2018-05-26 09:14:53.343', arriveable_tm = '2018-05-26 09:14:53.343', real_rider_type = 0, rider_region_id = 11, feature = 'PDRTm:1527297293319-1527357233319|PDSTm:1527297293319-1527899033319|directDistance:79|firstMode:5|mixMc:5|walkDistance:75' WHERE status = 0 AND feature = 'DRTm:1527296793065-1527297993065|DSTm:1527296793065-1533296733065|FDTm:1527296793065|ISRR:0|ISRS:0|WT:1|cncpType:2|firstMode:2|gDisMax:99999|ptdpfTm:1533296733065'
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 136 page no 107202 n bits 88 index `PRIMARY` of table `redcliff`.`order` trx id 2615948256 lock_mode X locks rec but not gap waiting
Record lock, heap no 18 PHYSICAL RECORD: n_fields 76; compact format; info bits 0
 0: len 8; hex 00000000235af282; asc     #Z  ;;
 1: len 6; hex 00009bec33e4; asc     3 ;;
 2: len 7; hex 2100003ff3214a; asc !  ? !J;;
 3: len 4; hex 8000002f; asc    /;;
 4: len 9; hex 353933313632383832; asc 593162882;;
 5: len 4; hex 801e4085; asc   @ ;;
 6: len 26; hex e69dade5b79ee5b08fe7acbce58c8528e9a5bfe4ba86e4b98829; asc                (         );;
 7: len 4; hex 80000012; asc     ;;
 8: len 4; hex 80078e46; asc    F;;
 9: len 1; hex 80; asc  ;;
 10: len 1; hex 85; asc  ;;
 11: len 30; hex e8bebde5ae81e79c81e5a4a7e8bf9ee7bb8fe6b58ee68a80e69cafe5bc80; asc                               ; (total 80 bytes);
 12: len 4; hex 8741cd1c; asc  A  ;;
 13: len 4; hex 8253c580; asc  S  ;;
 14: len 11; hex 3133303635373838343838; asc 13065788488;;
 15: len 30; hex e696b0e5b88ce69c9be4b990e59f8ee6b389e8b7af393639e58fb7203131; asc                      969    11; (total 40 bytes);
 16: len 4; hex 874199c8; asc  A  ;;
 17: len 4; hex 8253f3bc; asc  S  ;;
 18: len 11; hex 3133353931383030333531; asc 13591800351;;
 19: len 17; hex e99988e685a7e5ad9028e5a5b3e5a3ab29; asc          (      );;
 20: len 4; hex 8000099b; asc     ;;
 21: len 5; hex 999ff493b2; asc      ;;
 22: len 5; hex 999ff493b2; asc      ;;
 23: len 5; hex 999ff493b5; asc      ;;
 24: SQL NULL;
 25: len 5; hex 999ff493b5; asc      ;;
 26: SQL NULL;
 27: SQL NULL;
 28: SQL NULL;
 29: len 5; hex 999ff496f5; asc      ;;
 30: SQL NULL;
 31: len 5; hex 999ff493b2; asc      ;;
 32: len 4; hex 80000000; asc     ;;
 33: SQL NULL;
 34: len 1; hex 81; asc  ;;
 35: len 1; hex 80; asc  ;;
 36: len 1; hex 80; asc  ;;
 37: len 1; hex 03; asc  ;;
 38: SQL NULL;
 39: SQL NULL;
 40: len 1; hex 81; asc  ;;
 41: len 19; hex 31323030303230323631363537363434333232; asc 1200020261657644322;;
 42: len 0; hex ; asc ;;
 43: len 2; hex 3233; asc 23;;
 44: len 19; hex 31323230333530353934313231343638393835; asc 1220350594121468985;;
 45: SQL NULL;
 46: len 4; hex 800006d5; asc     ;;
 47: len 4; hex 800006d5; asc     ;;
 48: len 1; hex 80; asc  ;;
 49: len 30; hex e6b8a0e98193e8aea2e58d95e58fb7efbc9a313230303032303236313635; asc                   120002026165; (total 37 bytes);
 50: len 1; hex 80; asc  ;;
 51: len 1; hex 80; asc  ;;
 52: len 0; hex ; asc ;;
 53: len 5; hex 8000000000; asc      ;;
 54: len 1; hex 80; asc  ;;
 55: len 1; hex da; asc  ;;
 56: SQL NULL;
 57: len 1; hex 86; asc  ;;
 58: len 30; hex 434453546d3a313532373239373238393631332d31353237323939393839; asc CDSTm:1527297289613-1527299989; (total 554 bytes);
 59: len 4; hex 00000318; asc     ;;
 60: SQL NULL;
 61: SQL NULL;
 62: SQL NULL;
 63: len 1; hex 80; asc  ;;
 64: len 2; hex 3030; asc 00;;
 65: len 4; hex 80000003; asc     ;;
 66: SQL NULL;
 67: len 0; hex ; asc ;;
 68: len 0; hex ; asc ;;
 69: len 15; hex 302c352c31302c31352c32302c3430; asc 0,5,10,15,20,40;;
 70: len 0; hex ; asc ;;
 71: SQL NULL;
 72: SQL NULL;
 73: SQL NULL;
 74: len 4; hex 8000fd1b; asc     ;;
 75: SQL NULL;
*** (2) TRANSACTION:
TRANSACTION 2615948260, ACTIVE 0.003 sec updating or deleting
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1184, 2 row lock(s), undo log entries 1
MySQL thread id 34075661, OS thread handle 0x2b3617c5a700, query id 14020145268 172.24.16.68 app_redcliffc updating
UPDATE `order` SET rider_id = 495174, status = 5, dispatch_tm = '2018-05-26 09:14:53.363', accept_tm = '2018-05-26 09:14:53.363', arriveable_tm = '2018-05-26 09:27:53.316', real_rider_type = 0, rider_region_id = 1749, over_time = 90, feature = 'CDSTm:1527297289613-1527299989000|DRTm:1527297235000-1527298489625|DSTm:1527297289625-1527299989000|EGC:7398|FDTm:1527297289625|IPDSTm:1527297293283-1527299731925|ISRR:0|ISRS:0|PDRTm:1527297293283-1527298073316|PDSTm:1527297293283-1527299731925|UOCT:1527296682000|WT:1|directDistance:22|dispatchMc:3|dynamicTime:1|eleSw:0|elmDis:2255|elmSc:1|elmTdpf:1527299989000|elmWT:CLEAR_DAY|fft:1527299608610|firstMode:6|gDisMax:2500|isEleDemotion:0|iscomp:0|mixMc:1|pft:1527297235000|poct:1527297289000|pp:0|roadDegree:1.0|ron:1|upTm:1527296689000|walkDistance:193' WHERE status = 0 AND id = 593162882 AND feature = 'CDSTm:1527297289613-1527299989000|DRTm:1527297235000-15
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 136 page no 107202 n bits 88 index `PRIMARY` of table `redcliff`.`order` trx id 2615948260 lock_mode X locks rec but not gap
Record lock, heap no 18 PHYSICAL RECORD: n_fields 76; compact format; info bits 0
 0: len 8; hex 00000000235af282; asc     #Z  ;;
 1: len 6; hex 00009bec33e4; asc     3 ;;
 2: len 7; hex 2100003ff3214a; asc !  ? !J;;
 3: len 4; hex 8000002f; asc    /;;
 4: len 9; hex 353933313632383832; asc 593162882;;
 5: len 4; hex 801e4085; asc   @ ;;
 6: len 26; hex e69dade5b79ee5b08fe7acbce58c8528e9a5bfe4ba86e4b98829; asc                (         );;
 7: len 4; hex 80000012; asc     ;;
 8: len 4; hex 80078e46; asc    F;;
 9: len 1; hex 80; asc  ;;
 10: len 1; hex 85; asc  ;;
 11: len 30; hex e8bebde5ae81e79c81e5a4a7e8bf9ee7bb8fe6b58ee68a80e69cafe5bc80; asc                               ; (total 80 bytes);
 12: len 4; hex 8741cd1c; asc  A  ;;
 13: len 4; hex 8253c580; asc  S  ;;
 14: len 11; hex 3133303635373838343838; asc 13065788488;;
 15: len 30; hex e696b0e5b88ce69c9be4b990e59f8ee6b389e8b7af393639e58fb7203131; asc                      969    11; (total 40 bytes);
 16: len 4; hex 874199c8; asc  A  ;;
 17: len 4; hex 8253f3bc; asc  S  ;;
 18: len 11; hex 3133353931383030333531; asc 13591800351;;
 19: len 17; hex e99988e685a7e5ad9028e5a5b3e5a3ab29; asc          (      );;
 20: len 4; hex 8000099b; asc     ;;
 21: len 5; hex 999ff493b2; asc      ;;
 22: len 5; hex 999ff493b2; asc      ;;
 23: len 5; hex 999ff493b5; asc      ;;
 24: SQL NULL;
 25: len 5; hex 999ff493b5; asc      ;;
 26: SQL NULL;
 27: SQL NULL;
 28: SQL NULL;
 29: len 5; hex 999ff496f5; asc      ;;
 30: SQL NULL;
 31: len 5; hex 999ff493b2; asc      ;;
 32: len 4; hex 80000000; asc     ;;
 33: SQL NULL;
 34: len 1; hex 81; asc  ;;
 35: len 1; hex 80; asc  ;;
 36: len 1; hex 80; asc  ;;
 37: len 1; hex 03; asc  ;;
 38: SQL NULL;
 39: SQL NULL;
 40: len 1; hex 81; asc  ;;
 41: len 19; hex 31323030303230323631363537363434333232; asc 1200020261657644322;;
 42: len 0; hex ; asc ;;
 43: len 2; hex 3233; asc 23;;
 44: len 19; hex 31323230333530353934313231343638393835; asc 1220350594121468985;;
 45: SQL NULL;
 46: len 4; hex 800006d5; asc     ;;
 47: len 4; hex 800006d5; asc     ;;
 48: len 1; hex 80; asc  ;;
 49: len 30; hex e6b8a0e98193e8aea2e58d95e58fb7efbc9a313230303032303236313635; asc                   120002026165; (total 37 bytes);
 50: len 1; hex 80; asc  ;;
 51: len 1; hex 80; asc  ;;
 52: len 0; hex ; asc ;;
 53: len 5; hex 8000000000; asc      ;;
 54: len 1; hex 80; asc  ;;
 55: len 1; hex da; asc  ;;
 56: SQL NULL;
 57: len 1; hex 86; asc  ;;
 58: len 30; hex 434453546d3a313532373239373238393631332d31353237323939393839; asc CDSTm:1527297289613-1527299989; (total 554 bytes);
 59: len 4; hex 00000318; asc     ;;
 60: SQL NULL;
 61: SQL NULL;
 62: SQL NULL;
 63: len 1; hex 80; asc  ;;
 64: len 2; hex 3030; asc 00;;
 65: len 4; hex 80000003; asc     ;;
 66: SQL NULL;
 67: len 0; hex ; asc ;;
 68: len 0; hex ; asc ;;
 69: len 15; hex 302c352c31302c31352c32302c3430; asc 0,5,10,15,20,40;;
 70: len 0; hex ; asc ;;
 71: SQL NULL;
 72: SQL NULL;
 73: SQL NULL;
 74: len 4; hex 8000fd1b; asc     ;;
 75: SQL NULL;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 136 page no 105816 n bits 632 index `idx_status_cityid_requiretm_dispatchabletm` of table `redcliff`.`order` trx id 2615948260 lock_mode X locks rec but not gap waiting
Record lock, heap no 527 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
 0: len 1; hex 80; asc  ;;
 1: len 4; hex 8000002f; asc    /;;
 2: SQL NULL;
 3: len 5; hex 999ff493b2; asc      ;;
 4: len 8; hex 00000000235af282; asc     #Z  ;;
*** WE ROLL BACK TRANSACTION (2)

1.2 业务死锁报的信息

看到数据库层面看到的死锁只是产生死锁的sql,具体的事务包含的sql需要找业务获取。而且通过show engine innodb status 看到的sql可能是不完整的,事务2的sql看不到shardx=? 

二、死锁分析

2.1 信息提供

事务隔离级别:RC

order表的索引:

PRIMARY KEY (`id`),
UNIQUE KEY `uniq_platform_shopid_channel_id` (`platform_shopid`,`channel_id`),
KEY `idx_channel_id` (`channel_id`),
KEY `idx_finish_tm` (`finish_tm`),
KEY `idx_place_tm` (`place_tm`),
KEY `idx_cityid_shardx_placetm` (`city_id`,`shardx`,`place_tm`),
KEY `idx_city_dispatchtm` (`city_id`,`dispatch_tm`),
KEY `idx_shopid_cityid` (`shop_id`,`city_id`),
KEY `idx_shardx_shoptitle` (`shardx`,`shop_title`),
KEY `outter_group` (`outter_group`),
KEY `idx_cityid_shardx_shopregionid` (`city_id`,`shardx`,`shop_region_id`),
KEY `idx_cityid_riderid` (`city_id`,`rider_id`,`status`),
KEY `idx_cityid_status_dispatchable_tm` (`city_id`,`status`,`dispatchable_tm`,`platform_shopid`),
KEY `idx_riderid_status_finishtm` (`rider_id`,`status`,`finish_tm`),
KEY `idx_status_dispatchtm` (`status`,`dispatch_tm`,`platform_shopid`),
KEY `idx_status_leavetm` (`status`,`leave_tm`,`platform_shopid`),
KEY `idx_status_arrivetm` (`status`,`arrive_tm`,`platform_shopid`),
KEY `idx_cityid_platformshopid_placetm` (`city_id`,`platform_shopid`,`place_tm`),
KEY `idx_tradeid_placetm` (`trade_id`(20),`place_tm`),
KEY `idx_fromtel_placetm` (`from_tel`(20),`place_tm`),
KEY `idx_cityid_status_placetm` (`city_id`,`status`,`place_tm`),
KEY `idx_dispatch_mode_status_cityid_dispatchable_tm` (`dispatch_mode`,`status`,`city_id`,`dispatchable_tm`),
KEY `idx_totel_placetm` (`to_tel`(20),`place_tm`),
KEY `idx_status_cityid_requiretm_dispatchabletm` (`status`,`city_id`,`require_tm`,`dispatchable_tm`),
KEY `idx_riderid_shardx_status` (`rider_id`,`shardx`,`status`),
KEY `idx_cityid_shardx_waybill_no_placetm` (`city_id`,`shardx`,`waybill_no`,`place_tm`)

2.2 死锁原因分析

时间序列 事务1 事务2
1  

执行到事务2的update;

走了主键id,锁住一行记录,获得此主键记录的LOCK_X锁

2

执行事务1的update;

走了二级索引:idx_status_cityid_requiretm_dispatchabletm,获得此二级索引where条件

对应的记录的LOCK_X锁,等待此where条件的记录对应主键id的LOCK_X锁;

 
3  

此update语句set了列status,等待二级索idx_status_cityid_requiretm_dispatchabletm对应记录的LOCK_X锁

 

4 至此,产生了相互等待,形成死锁

三、优化方案

3.1 下面是常见多可行优化方案

1. 控制并发顺序

控制这2个update并发,不产生并发,此方案对于我们不可行。

2. 优化sql 

2.1 在update条件中加上主键id

我们采用此中方法,把第一个sql的update的where 后加入id=?

2.2 把update 分解,先根据查询出主键id,然后根据主键id 去update

====>此方案推荐使用。

3. 优化表索引

目前业务需求用到很多索引,此方案在当前不宜采用.

关键思路:二级索引的更新都有对对应记录的主键行加锁。

四、收获是什么

1.无论前台后台的程序,建议都不应该存在仅根据非主键的几个字段就要update/delete的场景。即使有,尽量应该改为先把要更新的记录先根据主键id查出来然后逐条按主键id更新。

2.为以后的db的设计和业务开发提供清晰的思路。