死锁问题

问题

线上每天会报十几个死锁报警,记录如下

Deadlock found when trying to get lock; try restarting transaction

分析

查看报警,定位到死锁发生地代码:

@Override
    public ActionResultDTO<Boolean> updateSuiteCity(int spuId) {
        if (spuId <= 0) {
            log.warn("invalid spuId:{}", spuId);
            return ActionResultDTO.buildInvalidArgumentResult("商品ID不合法");
        }
        try {
            // 获取最新计算出来的适用城市
            List<Integer> newCityIds = spuLocalService.computeCurrentSuiteCity(spuId);
            // 获取数据库中的适用城市
            List<Integer> currentCityIds = spuLocalService.querySuiteCity(spuId);

            //需要删除的适用城市
            List<Integer> removeSuitCityId = ListUtils.removeAll(currentCityIds, newCityIds);
            if (CollectionUtils.isNotEmpty(removeSuitCityId)) {
                preSpuCityDao.batchDeletePreSpuCities(spuId, SpuCityEnum.SuitCity.getValue(), removeSuitCityId);
            }

            //需要新增的适用城市
            List<Integer> addNewCityIds = ListUtils.removeAll(newCityIds, currentCityIds);
            if (CollectionUtils.isNotEmpty(addNewCityIds)) {
                List<PreSpuCity> preSpuCities = generatePreSpuCities(spuId, addNewCityIds);
                preSpuCityDao.batchSavePreSpuCities(preSpuCities);
            }
        } catch (Exception e) {
            log.error("failed to update suite city, spuId:{}", spuId, e);
            return ActionResultDTO.buildExceptionResult(e, "更新适用城市失败");
        }

        return ActionResultDTO.buildSuccessResult();
    }

程序逻辑:

  • 计算spu的适用城市
  • 查询spu数据库里面的使用城市
  • 计算出需要软删的适用城市,批量update
  • 计算出需要新增的适用城市,批量insert

Dao层sql语句:

 <update id="batchDeletePreSpuCities" parameterType="map">
        update
        <include refid="TBL_NAME"/>
        set
        Status = 0
        where SpuID = #{spuId} and Type = #{type} and CityID IN
        <foreach collection="cities" item="cityId" open="(" close=")" separator=",">
            #{cityId}
        </foreach>
    </update>
<insert id="batchSavePreSpuCities" parameterType="map">
        <if test="preSpuCities != null and preSpuCities.size() > 0">
            INSERT INTO
            <include refid="TBL_NAME"/>
            (SpuID, CityID, Type, Status, AddTime, UpdateTime)
            VALUES
            <foreach collection="preSpuCities" item="preSpuCity" open="(" separator="),(" close=")">
                #{preSpuCity.spuId}, #{preSpuCity.cityId}, #{preSpuCity.type}, #{preSpuCity.status}, NOW(), NOW()
            </foreach>
            ON DUPLICATE KEY UPDATE
            Status = VALUES(Status),
            UpdateTime=NOW()
        </if>
    </insert>

表结构:

CREATE TABLE `OP_PRE_SpuCity` (
  `ID` int(11) unsigned NOT NULL AUTO_INCREMENT COMMENT '自增ID',
  `SpuID` int(11) NOT NULL COMMENT 'Spu ID',
  `CityID` int(11) NOT NULL COMMENT 'City ID',
  `Type` tinyint(4) NOT NULL COMMENT '类型,1适用城市,2销售城市,3出发地,4目的地,5取还城市',
  `Status` tinyint(11) NOT NULL COMMENT '0删除1正常',
  `AddTime` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '添加时间',
  `UpdateTime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '最近修改时间',
  PRIMARY KEY (`ID`),
  UNIQUE KEY `UQ_Spu_City_Type` (`SpuID`,`CityID`,`Type`),
  KEY `IX_UpdateTime` (`UpdateTime`),
  KEY `IX_Spu_Type` (`SpuID`,`Type`)
) ENGINE=InnoDB AUTO_INCREMENT=13499 DEFAULT CHARSET=utf8 COMMENT='SPU城市映射制作表';

对于第一步删除适用城市语句,我们通过执行sql,看一下结果:

从上面的执行结果可以看出,删除适用城市这一步用到了表的二级索引。 查看系统的隔离级别:

可以看出目前数据库隔离级别为可重复读。

对于隔离级别为可重复读的数据库写操作,数据库加锁示例:

可以看出不仅加了写锁,同时还加了3个gap锁。

回到上面的逻辑层,删除操作时锁住了锁住了spuId=1219503, type=1的几行记录,同时加了gap锁,逻辑层之后想插入数据时,就可能造成锁等待导致失败从而重启事务。

通过更改逻辑,将两步合为一步,代码如下:

@Override
    public ActionResultDTO<Boolean> updateSuiteCity(int spuId) {
        if (spuId <= 0) {
            log.warn("invalid spuId:{}", spuId);
            return ActionResultDTO.buildInvalidArgumentResult("商品ID不合法");
        }
        try {
            log.info("update suite City, spuId:{}", spuId);
            // 获取最新计算出来的适用城市
            List<Integer> newCityIds = spuLocalService.computeCurrentSuiteCity(spuId);
            // 获取数据库中的适用城市
            List<Integer> currentCityIds = spuLocalService.querySuiteCity(spuId);

            List<PreSpuCity> preSpuCities = Lists.newArrayList();
            //需要删除的适用城市
            List<Integer> removeSuitCityId = ListUtils.removeAll(currentCityIds, newCityIds);
            if (CollectionUtils.isNotEmpty(removeSuitCityId)) {
                // 生成需要删除的适用城市(status=0, 软删)
                List<PreSpuCity> deletePreSpuCities = generatePreSpuCities(spuId, removeSuitCityId, 0);
                if (CollectionUtils.isNotEmpty(deletePreSpuCities)) {
                    preSpuCities.addAll(deletePreSpuCities);
                }
            }

            //需要新增的适用城市
            List<Integer> addNewCityIds = ListUtils.removeAll(newCityIds, currentCityIds);
            if (CollectionUtils.isNotEmpty(addNewCityIds)) {
                List<PreSpuCity> newSpuCities = generatePreSpuCities(spuId, addNewCityIds);
                if (CollectionUtils.isNotEmpty(newSpuCities)) {
                    preSpuCities.addAll(newSpuCities);
                }
            }
            log.info("save preSpuCities, spuId:{}", spuId);
            preSpuCityDao.batchSavePreSpuCities(preSpuCities);
        } catch (Exception e) {
            log.error("failed to update suite city, spuId:{}", spuId, e);
            return ActionResultDTO.buildExceptionResult(e, "更新适用城市失败");
        }

        return ActionResultDTO.buildSuccessResult();
    }

之后就没有报死锁的问题了。

参考文档:

http://www.cnblogs.com/LBSer/p/5183300.html http://hedengcheng.com/?p=771#_Toc374698322