Redisson Distributed Lock Study Notes: Source Analysis of Lock Acquisition in the Fair Lock, RedissonFairLock#lock


I. RedissonFairLock#lock source analysis

public class RedissonFairLockDemo {

    public static void main(String[] args) {
        RedissonClient client = RedissonClientUtil.getClient("");
        RLock fairLock = client.getFairLock("myLock");
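        // The most common usage: acquire the lock before the try block, so that
        // unlock() in finally only runs when the lock is actually held
        fairLock.lock();
        try {
            // business logic protected by the lock goes here
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            fairLock.unlock();
        }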
    }
}

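1. Compute the slot from the lock key; each slot maps to one node of the Redis cluster

RedissonFairLock is a subclass of RedissonLock. It extends RedissonLock mainly in how the lock is acquired and released; everything else, such as computing the slot before locking and the watchdog mechanism, is reused directly from RedissonLock.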
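To make the slot step concrete, here is a minimal sketch of how Redis Cluster maps a key to a slot: CRC16 of the key modulo 16384, where only the part inside a non-empty {hash tag} is hashed. This is an illustrative model written for this article, not Redisson's own code; the class and method names are made up.

```java
import java.nio.charset.StandardCharsets;

// Illustrative model of Redis Cluster key-to-slot mapping (hypothetical class name).
final class SlotSketch {

    // CRC-16/XMODEM: polynomial 0x1021, initial value 0, no bit reflection.
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = (crc & 0x8000) != 0 ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    // If the key contains a non-empty {hash tag}, only the tag is hashed.
    static int slot(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }
}
```

Under this model, SlotSketch.slot("myLock") and SlotSketch.slot("redisson_lock_queue:{myLock}") give the same slot, which is what allows one Lua script to touch the lock key and its helper keys on the same node.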

2. Locking with the Lua script in RedissonFairLock

RedissonFairLock#tryLockInnerAsync contains two Lua scripts; for now we only need to look at the second one.


if (command == RedisCommands.EVAL_LONG) {
    return evalWriteAsync(getName(), LongCodec.INSTANCE, command,
            // remove stale threads
            "while true do " +
                "local firstThreadId2 = redis.call('lindex', KEYS[2], 0);" +
                "if firstThreadId2 == false then " +
                    "break;" +
                "end;" +

                "local timeout = tonumber(redis.call('zscore', KEYS[3], firstThreadId2));" +
                "if timeout <= tonumber(ARGV[4]) then " +
                    // remove the item from the queue and timeout set
                    // NOTE we do not alter any other timeout
                    "redis.call('zrem', KEYS[3], firstThreadId2);" +
                    "redis.call('lpop', KEYS[2]);" +
                "else " +
                    "break;" +
                "end;" +
            "end;" +

            // check if the lock can be acquired now
            "if (redis.call('exists', KEYS[1]) == 0) " +
                "and ((redis.call('exists', KEYS[2]) == 0) " +
                    "or (redis.call('lindex', KEYS[2], 0) == ARGV[2])) then " +

                // remove this thread from the queue and timeout set
                "redis.call('lpop', KEYS[2]);" +
                "redis.call('zrem', KEYS[3], ARGV[2]);" +

                // decrease timeouts for all waiting in the queue
                "local keys = redis.call('zrange', KEYS[3], 0, -1);" +
                "for i = 1, #keys, 1 do " +
                    "redis.call('zincrby', KEYS[3], -tonumber(ARGV[3]), keys[i]);" +
                "end;" +

                // acquire the lock and set the TTL for the lease
                "redis.call('hset', KEYS[1], ARGV[2], 1);" +
                "redis.call('pexpire', KEYS[1], ARGV[1]);" +
                "return nil;" +
            "end;" +

            // check if the lock is already held, and this is a re-entry
            "if redis.call('hexists', KEYS[1], ARGV[2]) == 1 then " +
                "redis.call('hincrby', KEYS[1], ARGV[2],1);" +
                "redis.call('pexpire', KEYS[1], ARGV[1]);" +
                "return nil;" +
            "end;" +

            // the lock cannot be acquired
            // check if the thread is already in the queue
            "local timeout = redis.call('zscore', KEYS[3], ARGV[2]);" +
            "if timeout ~= false then " +
                // the real timeout is the timeout of the prior thread
                // in the queue, but this is approximately correct, and
                // avoids having to traverse the queue
                "return timeout - tonumber(ARGV[3]) - tonumber(ARGV[4]);" +
            "end;" +

            // add the thread to the queue at the end, and set its timeout in the timeout set to the timeout of
            // the prior thread in the queue (or the timeout of the lock if the queue is empty) plus the
            // threadWaitTime
            "local lastThreadId = redis.call('lindex', KEYS[2], -1);" +
            "local ttl;" +
            "if lastThreadId ~= false and lastThreadId ~= ARGV[2] then " +
                "ttl = tonumber(redis.call('zscore', KEYS[3], lastThreadId)) - tonumber(ARGV[4]);" +
            "else " +
                "ttl = redis.call('pttl', KEYS[1]);" +
            "end;" +
            "local timeout = ttl + tonumber(ARGV[3]) + tonumber(ARGV[4]);" +
            "if redis.call('zadd', KEYS[3], timeout, ARGV[2]) == 1 then " +
                "redis.call('rpush', KEYS[2], ARGV[2]);" +
            "end;" +
            "return ttl;",
            Arrays.asList(getName(), threadsQueueName, timeoutSetName),
            internalLockLeaseTime, getLockName(threadId), wait, currentTime);
}

The Lua script is long, but the author's comments are clear and tell us what each step means, so below I walk through which Redis commands each branch uses and what they do.

2.1 KEYS

Arrays.asList(getName(), threadsQueueName, timeoutSetName):

getName(): the lock key
threadsQueueName: prefixName("redisson_lock_queue", name), the list in which waiting threads queue for the lock
timeoutSetName: prefixName("redisson_lock_timeout", name), the sorted set that records the wait timeout of each client thread in the queue

KEYS: ["myLock","redisson_lock_queue:{myLock}","redisson_lock_timeout:{myLock}"]
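The key names above can be reproduced with a small sketch. The prefixName below is a simplified stand-in: the result for "myLock" matches the KEYS shown above, but the branch for names that already contain a hash tag is an assumption about Redisson's behaviour, and the class name is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for how the three KEYS are built (hypothetical class name).
final class FairLockKeysSketch {

    // Wraps the lock name in {} so that all helper keys share the lock's hash tag.
    // Assumption: a name that already contains '{' is left unwrapped.
    static String prefixName(String prefix, String name) {
        if (name.contains("{")) {
            return prefix + ":" + name;
        }
        return prefix + ":{" + name + "}";
    }

    // KEYS[1] = lock key, KEYS[2] = wait queue, KEYS[3] = timeout set.
    static List<String> keys(String lockName) {
        return Arrays.asList(
                lockName,
                prefixName("redisson_lock_queue", lockName),
                prefixName("redisson_lock_timeout", lockName));
    }
}
```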

2.2 ARGV

internalLockLeaseTime, getLockName(threadId), wait, currentTime:

internalLockLeaseTime: the watchdog timeout, 30000 ms by default; see Config#lockWatchdogTimeout.
```
private long lockWatchdogTimeout = 30 * 1000;
```
getLockName(threadId): return id + ":" + threadId, i.e. client ID (UUID):thread ID (threadId)

wait: the threadWaitTime, 300000 ms by default (60000*5, i.e. 5 minutes)

public RedissonFairLock(CommandAsyncExecutor commandExecutor, String name) {
    this(commandExecutor, name, 60000*5);
}

public RedissonFairLock(CommandAsyncExecutor commandExecutor, String name, long threadWaitTime) {
    super(commandExecutor, name);
    this.commandExecutor = commandExecutor;
    this.threadWaitTime = threadWaitTime;
    threadsQueueName = prefixName("redisson_lock_queue", name);
    timeoutSetName = prefixName("redisson_lock_timeout", name);
}

currentTime: the current timestamp in milliseconds

ARGV: [30000 ms, "UUID:threadId", 300000 ms, current timestamp]
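As a quick check of the default values, the four script arguments can be assembled like this. It is only a sketch: the constants mirror the defaults quoted above, while the class name, method name and parameters are made up for illustration.

```java
// Sketch of the four ARGV values with their default settings (hypothetical names).
final class FairLockArgvSketch {
    static final long LOCK_WATCHDOG_TIMEOUT = 30 * 1000; // default of Config#lockWatchdogTimeout
    static final long THREAD_WAIT_TIME = 60000 * 5;      // default passed by the RedissonFairLock constructor

    // ARGV[1] = lease time, ARGV[2] = lock name, ARGV[3] = wait time, ARGV[4] = current time.
    static Object[] argv(String clientId, long threadId, long nowMillis) {
        return new Object[] {
            LOCK_WATCHDOG_TIMEOUT,
            clientId + ":" + threadId,
            THREAD_WAIT_TIME,
            nowMillis
        };
    }
}
```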

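2.3 Lua script analysis

1. Branch 1: remove expired waiting threads

Scenario:

This loop removes waiting threads whose timeout has expired, so that dead clients do not keep occupying the wait queue. It guards against cases such as:

A client fails to acquire the lock and joins the wait queue, but then hits a network problem and most likely can no longer continue trying to acquire the lock.
A client fails to acquire the lock and joins the wait queue, but then the server it runs on goes down.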

"while true do " +
    "local firstThreadId2 = redis.call('lindex', KEYS[2], 0);" +
    "if firstThreadId2 == false then " +
        "break;" +
    "end;" +

    "local timeout = tonumber(redis.call('zscore', KEYS[3], firstThreadId2));" +
    "if timeout <= tonumber(ARGV[4]) then " +
        // remove the item from the queue and timeout set
        // NOTE we do not alter any other timeout
        "redis.call('zrem', KEYS[3], firstThreadId2);" +
        "redis.call('lpop', KEYS[2]);" +
    "else " +
        "break;" +
    "end;" +
"end;" +

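Start an infinite loop.
Use lindex to read the first element of the wait queue; if there is none (lindex returns false because the queue is empty), break out of the loop.
```
lindex redisson_lock_queue:{myLock} 0
```
If the first element exists (for example a LockName, i.e. the client UUID joined with the thread ID), use zscore to look up its timeout in the timeout set (a sorted set).
```
zscore redisson_lock_timeout:{myLock} UUID:threadId
```
If that timeout is less than or equal to the current time, first remove the element from the timeout set, then pop the first element from the wait queue.
```
zrem redisson_lock_timeout:{myLock} UUID:threadId
lpop redisson_lock_queue:{myLock}
```
If the first element of the wait queue has not timed out yet, break out of the loop.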
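The cleanup loop can be modelled in memory with plain Java collections. This is a sketch of the logic only: a Deque stands in for the Redis list, a Map stands in for the sorted set, and all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// In-memory model of branch 1 (hypothetical class; no Redis involved).
final class StaleWaiterCleanupSketch {
    final Deque<String> queue = new ArrayDeque<>();     // models the list KEYS[2]
    final Map<String, Long> timeouts = new HashMap<>(); // models the sorted set KEYS[3]

    void removeStale(long now) {
        while (true) {
            String head = queue.peekFirst();   // lindex KEYS[2] 0
            if (head == null) {
                break;                         // queue is empty
            }
            long timeout = timeouts.get(head); // zscore KEYS[3] head
            if (timeout <= now) {
                timeouts.remove(head);         // zrem KEYS[3] head
                queue.pollFirst();             // lpop KEYS[2]
            } else {
                break;                         // head is still valid, stop here
            }
        }
    }
}
```

Note that only the head of the queue is ever inspected: an expired waiter further back stays in place until everything ahead of it has been removed.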

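2. Branch 2: check whether the lock can be acquired now

Scenarios:

Another client has just released the lock, and the wait queue is empty.
Another client has just released the lock, and the first element of the wait queue is the current thread of the current client.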

// check if the lock can be acquired now
"if (redis.call('exists', KEYS[1]) == 0) " +
    "and ((redis.call('exists', KEYS[2]) == 0) " +
        "or (redis.call('lindex', KEYS[2], 0) == ARGV[2])) then " +

    // remove this thread from the queue and timeout set
    "redis.call('lpop', KEYS[2]);" +
    "redis.call('zrem', KEYS[3], ARGV[2]);" +

    // decrease timeouts for all waiting in the queue
    "local keys = redis.call('zrange', KEYS[3], 0, -1);" +
    "for i = 1, #keys, 1 do " +
        "redis.call('zincrby', KEYS[3], -tonumber(ARGV[3]), keys[i]);" +
    "end;" +

    // acquire the lock and set the TTL for the lease
    "redis.call('hset', KEYS[1], ARGV[2], 1);" +
    "redis.call('pexpire', KEYS[1], ARGV[1]);" +
    "return nil;" +
"end;" +

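Condition: the lock is not currently held and (the wait queue does not exist or the first element of the wait queue is the current thread of the current client)

exists myLock: checks whether the lock key exists

exists redisson_lock_queue:{myLock}: checks whether the wait queue exists (Redis deletes a list once it becomes empty, so a missing key means an empty queue)

lindex redisson_lock_queue:{myLock} 0: reads the first element of the wait queue, to compare it with the current thread of the current client

If the condition holds, remove the current thread from the wait queue and the timeout set

lpop redisson_lock_queue:{myLock}: pops the first element of the wait queue, i.e. the current thread (a no-op if the queue is empty)

zrem redisson_lock_timeout:{myLock} UUID:threadId: removes the current thread of the current client from the timeout set

Adjust the timeouts of the other elements in the timeout set, i.e. update their scores
```
zrange redisson_lock_timeout:{myLock} 0 -1: fetches all elements of the timeout set
```
Iterate over them and run the following command to update each score, i.e. each timeout:
```
zincrby redisson_lock_timeout:{myLock} -300000 keys[i]
```
Note that the increment is negative: every remaining timeout is reduced by one threadWaitTime. Each waiter's timeout was computed from the timeout of the thread queued ahead of it plus threadWaitTime (see branch 5), so once the thread at the head of the queue takes the lock and everyone else moves up one position, one threadWaitTime is subtracted from each remaining score to keep the timeouts in line with the new queue positions.
Finally, add the current thread of the current client to the lock hash (map) myLock with a lock count of 1, refresh the expiration of myLock, and return nil
```
hset myLock UUID:threadId 1: adds the current thread to the lock record.
pexpire myLock 30000: resets the expiration of the lock.
```
After this entry is added, the map looks like this:
```
myLock:{
    "UUID:threadId":1
}
```
This map records the lock count, mainly to support reentrant locking.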
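Branch 2 can be sketched the same way with in-memory collections. This is a model of the logic under simplifying assumptions: key expiry (pexpire) is not modelled, and the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// In-memory model of branch 2 (hypothetical class; lock expiry is not modelled).
final class FairAcquireSketch {
    final Map<String, Integer> lock = new HashMap<>();  // models the hash KEYS[1]: holder -> count
    final Deque<String> queue = new ArrayDeque<>();     // models the list KEYS[2]
    final Map<String, Long> timeouts = new HashMap<>(); // models the sorted set KEYS[3]

    boolean tryAcquire(String me, long waitMillis) {
        boolean free = lock.isEmpty();                                    // exists KEYS[1] == 0
        boolean myTurn = queue.isEmpty() || me.equals(queue.peekFirst()); // queue empty or I am first
        if (!(free && myTurn)) {
            return false;
        }
        queue.pollFirst();   // lpop KEYS[2]
        timeouts.remove(me); // zrem KEYS[3] me
        for (Map.Entry<String, Long> e : timeouts.entrySet()) {
            e.setValue(e.getValue() - waitMillis); // zincrby KEYS[3] -wait member
        }
        lock.put(me, 1);     // hset KEYS[1] me 1
        return true;
    }
}
```

With a queue of [A, B], a call for B is rejected even though the lock is free, which is exactly the fairness guarantee: only the thread at the head of the queue may take the lock.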

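3. Branch 3: the current thread already holds the lock and acquires it again

Scenario:

The current thread has already acquired the lock and now acquires it again.
In other words, Redisson's fair lock is reentrant.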

"if redis.call('hexists', KEYS[1], ARGV[2]) == 1 then " +
    "redis.call('hincrby', KEYS[1], ARGV[2],1);" +
    "redis.call('pexpire', KEYS[1], ARGV[1]);" +
    "return nil;" +
"end;" +

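Use hexists to check whether the lock hash contains the current thread of the current client
```
hexists myLock UUID:threadId
```

If it does, increment the lock count and refresh the lock's expiration

hincrby myLock UUID:threadId 1: increments the lock count

pexpire myLock 30000: refreshes the expiration of the lock key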
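The reentrancy branch boils down to a counter per holder. A minimal in-memory sketch, with a hypothetical class name and a plain field standing in for the key's expiry:

```java
import java.util.HashMap;
import java.util.Map;

// In-memory model of branch 3 (hypothetical class; expiry kept as a plain field).
final class ReentrantCountSketch {
    final Map<String, Integer> lock = new HashMap<>(); // models the hash KEYS[1]
    long expiresAt;                                     // models the TTL of KEYS[1]

    // Returns true if 'me' already held the lock and the count was incremented.
    boolean reenter(String me, long leaseMillis, long now) {
        if (!lock.containsKey(me)) {     // hexists KEYS[1] me
            return false;
        }
        lock.merge(me, 1, Integer::sum); // hincrby KEYS[1] me 1
        expiresAt = now + leaseMillis;   // pexpire KEYS[1] lease
        return true;
    }
}
```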

4. Branch 4: the current thread is already in the wait queue; return its wait time

"local timeout = redis.call('zscore', KEYS[3], ARGV[2]);" +
"if timeout ~= false then " +
    // the real timeout is the timeout of the prior thread
    // in the queue, but this is approximately correct, and
    // avoids having to traverse the queue
    "return timeout - tonumber(ARGV[3]) - tonumber(ARGV[4]);" +
"end;" +

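Use zscore to read the current thread's timeout from the timeout set
```
zscore redisson_lock_timeout:{myLock} UUID:threadId
```
The wait time actually returned is: timestamp stored in the timeout set - 300000 ms - current timestamp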
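A short worked example of this arithmetic, using made-up numbers: the score was stored as ttl + threadWaitTime + the time of enqueueing, so subtracting threadWaitTime and the current time leaves roughly the time still to wait. The class and method names are hypothetical.

```java
// Worked arithmetic for branch 4 (hypothetical class; sample values only).
final class WaitEstimateSketch {

    // Mirrors: return timeout - tonumber(ARGV[3]) - tonumber(ARGV[4])
    static long remainingWait(long timeoutScore, long waitMillis, long now) {
        return timeoutScore - waitMillis - now;
    }

    // Mirrors how the score is written when a thread is first enqueued.
    static long scoreAtEnqueue(long ttl, long waitMillis, long now) {
        return ttl + waitMillis + now;
    }
}
```

For example, a thread enqueued at time 1000000 with ttl 25000 gets the score 1325000; asking again 5000 ms later yields 1325000 - 300000 - 1005000 = 20000 ms.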

5. Branch 5: the current thread tries to acquire the lock for the first time; add it to the timeout set and to the wait queue

"local lastThreadId = redis.call('lindex', KEYS[2], -1);" +
"local ttl;" +
"if lastThreadId ~= false and lastThreadId ~= ARGV[2] then " +
    "ttl = tonumber(redis.call('zscore', KEYS[3], lastThreadId)) - tonumber(ARGV[4]);" +
"else " +
    "ttl = redis.call('pttl', KEYS[1]);" +
"end;" +
"local timeout = ttl + tonumber(ARGV[3]) + tonumber(ARGV[4]);" +
"if redis.call('zadd', KEYS[3], timeout, ARGV[2]) == 1 then " +
    "redis.call('rpush', KEYS[2], ARGV[2]);" +
"end;" +
"return ttl;",

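Use lindex to get the last thread in the wait queue
```
lindex redisson_lock_queue:{myLock} -1
```
Compute ttl
- If the last thread in the wait queue exists and is not the current thread, compute ttl from that thread
```
zscore redisson_lock_timeout:{myLock} lastThreadId: reads the timeout of the last thread in the wait queue

ttl = timeout - current timestamp
```
- If there is no other waiting thread in the queue, use the remaining time to live of the lock key
```
ttl = pttl myLock
```

Compute timeout, then add the current thread to the timeout set and the wait queue

timeout = ttl + 300000 ms + current timestamp

zadd redisson_lock_timeout:{myLock} timeout UUID:threadId: adds the thread to the timeout set

rpush redisson_lock_queue:{myLock} UUID:threadId: if the thread was newly added to the timeout set (zadd returned 1), also appends it to the wait queue

Finally, return ttl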
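The enqueue step can be modelled in memory as follows. This is a sketch under simplifying assumptions: the lock's expiry is a plain field instead of a Redis TTL, and every name is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// In-memory model of branch 5 (hypothetical class; no Redis involved).
final class EnqueueSketch {
    final Deque<String> queue = new ArrayDeque<>();     // models the list KEYS[2]
    final Map<String, Long> timeouts = new HashMap<>(); // models the sorted set KEYS[3]
    long lockExpiresAt;                                  // models the expiry of KEYS[1]

    long enqueue(String me, long waitMillis, long now) {
        String last = queue.peekLast();        // lindex KEYS[2] -1
        long ttl;
        if (last != null && !last.equals(me)) {
            ttl = timeouts.get(last) - now;    // zscore KEYS[3] last, minus current time
        } else {
            ttl = lockExpiresAt - now;         // pttl KEYS[1]
        }
        long timeout = ttl + waitMillis + now;
        boolean added = !timeouts.containsKey(me);
        timeouts.put(me, timeout);             // zadd KEYS[3] timeout me
        if (added) {
            queue.addLast(me);                 // rpush only when zadd created a new member
        }
        return ttl;
    }
}
```

With a lock expiring at 31000 and the current time 1000, the first waiter gets ttl 30000 and score 331000; a second waiter enqueued at the same moment gets ttl 330000 and score 631000, so timeouts stack up by one threadWaitTime per queue position.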

3. The watchdog keeps renewing the lock

Because RedissonFairLock is built on RedissonLock, the watchdog works exactly as it does in RedissonLock.
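As a reminder of what that mechanism does, here is a simulated-time sketch. It assumes the renewal interval is one third of the lease time and that each renewal resets the key's TTL to the full lease; it uses no timers or Redis calls, and the class and method names are made up.

```java
// Simulated-time model of watchdog renewal (hypothetical class; no real timers).
final class WatchdogSketch {

    // Returns the time at which the lock key would expire after the holder
    // has kept the lock for holdMillis, with or without the watchdog.
    static long expiryAfterHolding(long leaseMillis, long holdMillis, boolean watchdog) {
        long expiresAt = leaseMillis;    // TTL set when the lock is acquired at t = 0
        if (!watchdog) {
            return expiresAt;            // nothing extends the lease
        }
        long interval = leaseMillis / 3; // assumed renewal period
        for (long t = interval; t <= holdMillis; t += interval) {
            expiresAt = t + leaseMillis; // each renewal resets the TTL to the full lease
        }
        return expiresAt;
    }
}
```

With the default 30000 ms lease, a holder that works for 95000 ms is renewed at 10000, 20000, ..., 90000 ms, so the key is still valid at 95000 ms; without the watchdog it would have expired at 30000 ms.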

4. Looping until the lock is acquired

Because RedissonFairLock is built on RedissonLock, the retry loop for acquiring the lock is also the same as in RedissonLock.
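The shape of that loop can be sketched as follows. This is a simplified model: the attempt function stands in for running the Lua script (null means acquired, a number is the wait hint it returned), and a short sleep stands in for waiting on a release notification. All names are hypothetical.

```java
import java.util.function.Supplier;

// Simplified model of the lock() retry loop (hypothetical class).
final class LockLoopSketch {

    // Keeps calling 'attempt' until it returns null; returns the number of attempts made.
    static int lockLoop(Supplier<Long> attempt) {
        int attempts = 0;
        while (true) {
            attempts++;
            Long ttl = attempt.get();
            if (ttl == null) {
                return attempts; // lock acquired
            }
            try {
                // Stand-in for waiting up to ttl for a release signal; capped at 10 ms for the demo.
                Thread.sleep(Math.max(0, Math.min(ttl, 10)));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for the lock", e);
            }
        }
    }
}
```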

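5. Other ways to lock

To specify how long the lock is held after it has been acquired, call the following method with a leaseTime

lock.lock(10, TimeUnit.SECONDS);

If a leaseTime is specified, the watchdog is not started.

To specify the hold time and also avoid looping forever when the lock cannot be acquired, pass both a waitTime and a leaseTime (here waitTime is 100 seconds and leaseTime is 10 seconds)

boolean res = lock.tryLock(100, 10, TimeUnit.SECONDS);

With a waitTime, the client only keeps retrying within waitTime; if the lock still has not been acquired when waitTime elapses, tryLock returns false.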
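The effect of waitTime can be illustrated with a bounded version of the retry loop. This is a model only: the attempt function stands in for the Lua script (null means acquired), the clock is injected so the example is deterministic, and the waiting between attempts is left out. The names are hypothetical.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Simplified model of tryLock with a waitTime budget (hypothetical class).
final class BoundedWaitSketch {

    // Retries until 'attempt' returns null or the wait budget is used up.
    static boolean tryLock(long waitMillis, Supplier<Long> attempt, LongSupplier clock) {
        long deadline = clock.getAsLong() + waitMillis;
        while (true) {
            if (attempt.get() == null) {
                return true;  // lock acquired within the budget
            }
            if (clock.getAsLong() >= deadline) {
                return false; // waitTime elapsed without acquiring the lock
            }
        }
    }
}
```

When the caller gets false back, it does not hold the lock and must not call unlock().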

posted @ 2022-01-01 16:02 不送花的程序猿
