Dubbo - Load Balancing

This post walks through the source code to see how Dubbo handles load balancing.

First, a look at how the load-balancing code is organized:

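LoadBalance: the interface; it declares a single selection method.
AbstractLoadBalance: the abstract base class of all load-balancing implementations. It implements the LoadBalance interface, provides the methods for obtaining and calculating weights, and declares the selection method (doSelect) that subclasses must implement.
RandomLoadBalance: random; the selection probability is set according to weight.
RoundRobinLoadBalance: round robin; the rotation ratio is set according to the normalized weights.
LeastActiveLoadBalance: least active calls; random among providers with the same active count. The active count is the difference between the counters before and after a call.
ConsistentHashLoadBalance: consistent hash; requests with the same arguments always go to the same provider.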

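By default, calls are distributed with the Random strategy.
If none of the four algorithms above fits the business scenario, you can implement the LoadBalance interface and plug in your own selection algorithm. This is also a good showcase of Dubbo's SPI mechanism: very flexible, yet still solid.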

I. AbstractLoadBalance

Start with the source of the abstract class:

public abstract class AbstractLoadBalance implements LoadBalance {

    static int calculateWarmupWeight(int uptime, int warmup, int weight) {
        int ww = (int) ((float) uptime / ((float) warmup / (float) weight));
        return ww < 1 ? 1 : (ww > weight ? weight : ww);
    }

    @Override
    public <T> Invoker<T> select(List<Invoker<T>> invokers, URL url, Invocation invocation) {
        if (invokers == null || invokers.isEmpty()) {
            return null;
        }
        if (invokers.size() == 1) {
            return invokers.get(0);
        }
        return doSelect(invokers, url, invocation);
    }

    protected abstract <T> Invoker<T> doSelect(List<Invoker<T>> invokers, URL url, Invocation invocation);

    protected int getWeight(Invoker<?> invoker, Invocation invocation) {
        int weight = invoker.getUrl().getMethodParameter(invocation.getMethodName(), Constants.WEIGHT_KEY, Constants.DEFAULT_WEIGHT);
        if (weight > 0) {
            long timestamp = invoker.getUrl().getParameter(Constants.REMOTE_TIMESTAMP_KEY, 0L);
            if (timestamp > 0L) {
                int uptime = (int) (System.currentTimeMillis() - timestamp);
                int warmup = invoker.getUrl().getParameter(Constants.WARMUP_KEY, Constants.DEFAULT_WARMUP);
                if (uptime > 0 && uptime < warmup) {
                    weight = calculateWarmupWeight(uptime, warmup, weight);
                }
            }
        }
        return weight;
    }

}

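The first thing in the class is a static method, calculateWarmupWeight ("calculate warm-up weight"). The name puzzled me at first. On reflection, the idea is that the longer a service has been running, the more stable it is, so the higher the weight it is given.

Looking at the calculateWarmupWeight method itself makes this clearer: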

int ww = (int) ((float) uptime / ((float) warmup / (float) weight));

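uptime: how long the provider has been running
warmup: the warm-up period
weight: the configured weight
The expression is equivalent to uptime / warmup * weight, so the longer the provider has been running, the larger the resulting weight. The return statement then clamps the result to the range from 1 to weight.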
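
To make the formula concrete, here is a small standalone sketch that copies calculateWarmupWeight from the listing above and evaluates it for a few uptimes. The 10-minute warm-up and the weight of 100 are assumed sample values chosen for illustration, and the class name WarmupWeightDemo is made up for this example.

```java
class WarmupWeightDemo {

    // Same formula as calculateWarmupWeight in the AbstractLoadBalance listing.
    static int calculateWarmupWeight(int uptime, int warmup, int weight) {
        int ww = (int) ((float) uptime / ((float) warmup / (float) weight));
        return ww < 1 ? 1 : (ww > weight ? weight : ww);
    }

    public static void main(String[] args) {
        int warmup = 600_000; // assumed warm-up period: 10 minutes in milliseconds
        int weight = 100;     // assumed configured weight
        int[] uptimes = {1_000, 60_000, 300_000, 540_000};
        for (int uptime : uptimes) {
            // Prints 1, 10, 50 and 90 for the four sample uptimes.
            System.out.println(uptime + " ms -> " + calculateWarmupWeight(uptime, warmup, weight));
        }
    }
}
```

With these values a provider that has been up for one second gets the minimum weight of 1, and the weight grows linearly until it reaches the configured 100 at the end of the warm-up period.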

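The getWeight method also deserves a close look. It first reads the configured weight and only continues with the warm-up calculation if that value is greater than 0. A weight of 0 or less is already as low as it gets, so there is nothing left to scale down.

The following line reads the provider's start time; the value comes from the URL.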

long timestamp = invoker.getUrl().getParameter(Constants.REMOTE_TIMESTAMP_KEY, 0L);

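With this value, the uptime can be computed as the current time minus the start time. The warm-up weight is calculated only when the uptime is greater than 0 (the provider has started) and less than the warm-up period.

After thinking about the code above and reading a few analyses by more experienced developers, my understanding is that this is an optimization Dubbo makes for how a system settles into its environment. A system needs some time to go from startup to running at full capacity, and that time is the warm-up period. Requests that arrive during the warm-up period therefore cannot be balanced purely by the configured weight; the degree of warm-up has to be taken into account. This ensures that a provider does not receive too many requests before it is actually running at full capacity.

AbstractLoadBalance also declares an abstract method, doSelect, which each subclass implements according to its own algorithm. The different implementations are covered below.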

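II. RandomLoadBalance

The official explanation: within a short window the chance of collisions is high, but the larger the call volume, the more even the distribution. Applying weights by probability also gives a fairly even spread, which helps when provider weights are adjusted dynamically.

The source first: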

public class RandomLoadBalance extends AbstractLoadBalance {

    public static final String NAME = "random";

    @Override
    protected <T> Invoker<T> doSelect(List<Invoker<T>> invokers, URL url, Invocation invocation) {
        int length = invokers.size(); // Number of invokers
        boolean sameWeight = true; // Every invoker has the same weight?
        int firstWeight = getWeight(invokers.get(0), invocation);
        int totalWeight = firstWeight; // The sum of weights
        for (int i = 1; i < length; i++) {
            int weight = getWeight(invokers.get(i), invocation);
            totalWeight += weight; // Sum
            if (sameWeight && weight != firstWeight) {
                sameWeight = false;
            }
        }
        if (totalWeight > 0 && !sameWeight) {
            // If (not every invoker has the same weight & at least one invoker's weight>0), select randomly based on totalWeight.
            int offset = ThreadLocalRandom.current().nextInt(totalWeight);
            // Return a invoker based on the random value.
            for (int i = 0; i < length; i++) {
                offset -= getWeight(invokers.get(i), invocation);
                if (offset < 0) {
                    return invokers.get(i);
                }
            }
        }
        // If all invokers have the same weight value or totalWeight=0, return evenly.
        return invokers.get(ThreadLocalRandom.current().nextInt(length));
    }

}

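The code is fairly clear. It breaks down into these steps:
1. Count the service providers and sum up their weights.
2. If every provider has the same weight, pick one at random.
3. If the weights differ, draw a random offset from the range [0, totalWeight), then subtract each provider's weight from the offset in turn. The first provider that makes the offset drop below 0 is returned.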
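
Step 3 can be sketched in isolation. The snippet below is a simplified illustration, not Dubbo code: the provider list is reduced to a plain int array of weights, and the class and method names (WeightedRandomDemo, pick) are invented for this example.

```java
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

class WeightedRandomDemo {

    // Maps an offset in [0, totalWeight) to an index by subtracting the weights in turn.
    static int pick(int[] weights, int offset) {
        for (int i = 0; i < weights.length; i++) {
            offset -= weights[i];
            if (offset < 0) {
                return i;
            }
        }
        throw new IllegalArgumentException("offset must be smaller than the total weight");
    }

    public static void main(String[] args) {
        int[] weights = {5, 3, 2}; // sample weights, total 10
        int totalWeight = 10;
        int[] hits = new int[weights.length];
        for (int n = 0; n < 100_000; n++) {
            hits[pick(weights, ThreadLocalRandom.current().nextInt(totalWeight))]++;
        }
        // The hit counts should be roughly in the ratio 5 : 3 : 2.
        System.out.println(Arrays.toString(hits));
    }
}
```

Offsets 0 to 4 land on index 0, offsets 5 to 7 on index 1, and offsets 8 and 9 on index 2, so each index is chosen with a probability proportional to its weight.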

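III. RoundRobinLoadBalance

The official explanation: slow providers accumulate requests. For example, if the second machine is very slow but not down, requests routed to it get stuck there, and over time all requests end up stuck on the second machine.

The latest source may differ from earlier implementations, so what follows describes the version I have.

It first defines a round-robin object, WeightedRoundRobin: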

 protected static class WeightedRoundRobin {
        private int weight;
        private AtomicLong current = new AtomicLong(0);
        private long lastUpdate;
        public int getWeight() {
            return weight;
        }
        public void setWeight(int weight) {
            this.weight = weight;
            current.set(0);
        }
        public long increaseCurrent() {
            return current.addAndGet(weight);
        }
        public void sel(int total) {
            current.addAndGet(-1 * total);
        }
        public long getLastUpdate() {
            return lastUpdate;
        }
        public void setLastUpdate(long lastUpdate) {
            this.lastUpdate = lastUpdate;
        }
    }

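The whole round-robin process is computed with this object. It uses an AtomicLong as the running counter (current), stores the weight, and records the time of the last update (lastUpdate).

Three member fields deserve a closer look: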

private static int RECYCLE_PERIOD = 60000;
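- The recycle period. If a round-robin object was last updated more than 60000 milliseconds ago, it is considered stale and is dropped during cleanup, so a provider that comes back later starts counting from scratch.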
private ConcurrentMap<String, ConcurrentMap<String, WeightedRoundRobin>> methodWeightMap = new ConcurrentHashMap<String, ConcurrentMap<String, WeightedRoundRobin>>();
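- A nested ConcurrentHashMap that ties interface methods to the round-robin objects of their providers. The outer key identifies a method (service key plus method name). Each method can have several providers, and each provider, keyed by its identity string, has its own WeightedRoundRobin object.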
private AtomicBoolean updateLock = new AtomicBoolean();
- A lock built on an atomic boolean, used while the mapping between providers and round-robin objects is being updated.

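The main act is still the doSelect method. This version changed quite a lot compared with earlier ones, so if anything below is inaccurate, please point it out: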

protected <T> Invoker<T> doSelect(List<Invoker<T>> invokers, URL url, Invocation invocation) {
        String key = invokers.get(0).getUrl().getServiceKey() + "." + invocation.getMethodName();
        ConcurrentMap<String, WeightedRoundRobin> map = methodWeightMap.get(key);
        if (map == null) {
            methodWeightMap.putIfAbsent(key, new ConcurrentHashMap<String, WeightedRoundRobin>());
            map = methodWeightMap.get(key);
        }
        int totalWeight = 0;
        long maxCurrent = Long.MIN_VALUE;
        long now = System.currentTimeMillis();
        Invoker<T> selectedInvoker = null;
        WeightedRoundRobin selectedWRR = null;
        for (Invoker<T> invoker : invokers) {
            String identifyString = invoker.getUrl().toIdentityString();
            WeightedRoundRobin weightedRoundRobin = map.get(identifyString);
            int weight = getWeight(invoker, invocation);
            if (weight < 0) {
                weight = 0;
            }
            if (weightedRoundRobin == null) {
                weightedRoundRobin = new WeightedRoundRobin();
                weightedRoundRobin.setWeight(weight);
                map.putIfAbsent(identifyString, weightedRoundRobin);
                weightedRoundRobin = map.get(identifyString);
            }
            if (weight != weightedRoundRobin.getWeight()) {
                //weight changed
                weightedRoundRobin.setWeight(weight);
            }
            long cur = weightedRoundRobin.increaseCurrent();
            weightedRoundRobin.setLastUpdate(now);
            if (cur > maxCurrent) {
                maxCurrent = cur;
                selectedInvoker = invoker;
                selectedWRR = weightedRoundRobin;
            }
            totalWeight += weight;
        }
        if (!updateLock.get() && invokers.size() != map.size()) {
            if (updateLock.compareAndSet(false, true)) {
                try {
                    // copy -> modify -> update reference
                    ConcurrentMap<String, WeightedRoundRobin> newMap = new ConcurrentHashMap<String, WeightedRoundRobin>();
                    newMap.putAll(map);
                    Iterator<Entry<String, WeightedRoundRobin>> it = newMap.entrySet().iterator();
                    while (it.hasNext()) {
                        Entry<String, WeightedRoundRobin> item = it.next();
                        if (now - item.getValue().getLastUpdate() > RECYCLE_PERIOD) {
                            it.remove();
                        }
                    }
                    methodWeightMap.put(key, newMap);
                } finally {
                    updateLock.set(false);
                }
            }
        }
        if (selectedInvoker != null) {
            selectedWRR.sel(totalWeight);
            return selectedInvoker;
        }
        // should not happen here
        return invokers.get(0);
    }

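First, the map from providers to round-robin objects is looked up by service key and method name. If it does not exist yet, one is created.
The line long maxCurrent = Long.MIN_VALUE; is a nice touch: it serves as the floor for the comparison, so the first counter examined is always larger.
selectedInvoker is declared to point to the provider that is finally chosen, and selectedWRR to its round-robin object.
Next comes the loop over the whole provider list that was passed in. For each provider, its round-robin object is looked up in the map obtained above, and its weight is calculated.
Once the round-robin object is at hand, its counter is atomically increased by the weight and the update time is recorded. The counter is then compared with maxCurrent. If it is larger, this Invoker is the best candidate so far: maxCurrent is set to the counter value, and the Invoker and its round-robin object are assigned to selectedInvoker and selectedWRR.

　　Why assign cur to maxCurrent?

　　Because any later Invoker whose counter is not larger than this one will not be selected. Put the other way round, the loop finds the Invoker with the largest counter in the whole list.

　　In other words, every call to doSelect returns the Invoker with the largest counter.

　　Here is a question to think about: if the largest one is taken every time, how is this still round robin?

Let's pull out one piece of the code and look at it separately: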

if (!updateLock.get() && invokers.size() != map.size()) {
            if (updateLock.compareAndSet(false, true)) {
                try {
                    // copy -> modify -> update reference
                    ConcurrentMap<String, WeightedRoundRobin> newMap = new ConcurrentHashMap<String, WeightedRoundRobin>();
                    newMap.putAll(map);
                    Iterator<Entry<String, WeightedRoundRobin>> it = newMap.entrySet().iterator();
                    while (it.hasNext()) {
                        Entry<String, WeightedRoundRobin> item = it.next();
                        if (now - item.getValue().getLastUpdate() > RECYCLE_PERIOD) {
                            it.remove();
                        }
                    }
                    methodWeightMap.put(key, newMap);
                } finally {
                    updateLock.set(false);
                }
            }
        }

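Judging by GitHub, this code was reworked around October. When the number of Invokers differs from the number of entries in the map, the round-robin objects in the map are cleaned up: entries that have not been updated for longer than RECYCLE_PERIOD are removed from a copy of the map, and the copy then replaces the old map. So this logic removes Invokers that have gone offline or become unavailable from the rotation.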
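
The same "try the lock, copy, modify, swap the reference" pattern can be shown on a much smaller scale. The sketch below is an illustration under simplifying assumptions: the map stores only a last-update timestamp per provider key, and the class and field names (CleanupGuardDemo, lastUpdate, recycle) are invented for this example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicBoolean;

class CleanupGuardDemo {

    static final long RECYCLE_PERIOD = 60_000;

    final AtomicBoolean updateLock = new AtomicBoolean();
    // Provider key -> time of last update; a simplified stand-in for the WeightedRoundRobin map.
    volatile ConcurrentMap<String, Long> lastUpdate = new ConcurrentHashMap<>();

    void recycle(long now) {
        // Only the thread that flips the flag from false to true does the cleanup;
        // every other thread skips it instead of waiting.
        if (updateLock.compareAndSet(false, true)) {
            try {
                // copy -> modify -> update reference
                ConcurrentMap<String, Long> copy = new ConcurrentHashMap<>(lastUpdate);
                copy.values().removeIf(timestamp -> now - timestamp > RECYCLE_PERIOD);
                lastUpdate = copy;
            } finally {
                updateLock.set(false);
            }
        }
    }

    public static void main(String[] args) {
        CleanupGuardDemo demo = new CleanupGuardDemo();
        demo.lastUpdate.put("a", 0L);       // stale entry
        demo.lastUpdate.put("b", 100_000L); // fresh entry
        demo.recycle(100_001L);
        System.out.println(demo.lastUpdate.keySet()); // prints [b]
    }
}
```

Because the cleanup works on a copy, threads that are still iterating over the old map are not disturbed while stale entries are removed.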

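Finally, back to the earlier question: if the Invoker with the largest counter is taken every time, how is the rotation guaranteed?
The answer lies in this piece of code: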

if (selectedInvoker != null) {
       selectedWRR.sel(totalWeight);
       return selectedInvoker;
}

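If the loop over the Invokers found the one with the largest counter, sel is called on its round-robin object, which subtracts totalWeight from the counter and pushes it well below the others.

On every later call, each Invoker's counter grows by its weight, whether or not it is selected. An Invoker that was just selected therefore needs several calls to climb back to the top, and in the meantime the other Invokers get their turn. Over a full cycle each Invoker is selected in proportion to its weight, and the counters always stay far above Long.MIN_VALUE.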
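
The counter arithmetic is easier to follow with a small simulation. The sketch below strips the algorithm down to plain arrays: weights are fixed, there is no concurrency or recycling, and the names (SmoothRoundRobinDemo, simulate) are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

class SmoothRoundRobinDemo {

    // Runs the "add weight, take the max, subtract totalWeight" loop for the given number of calls.
    static List<String> simulate(String[] names, int[] weights, int calls) {
        long[] current = new long[weights.length];
        int totalWeight = 0;
        for (int weight : weights) {
            totalWeight += weight;
        }
        List<String> picked = new ArrayList<>();
        for (int call = 0; call < calls; call++) {
            int selected = -1;
            long maxCurrent = Long.MIN_VALUE;
            for (int i = 0; i < weights.length; i++) {
                current[i] += weights[i];         // corresponds to increaseCurrent()
                if (current[i] > maxCurrent) {
                    maxCurrent = current[i];
                    selected = i;
                }
            }
            current[selected] -= totalWeight;     // corresponds to sel(totalWeight)
            picked.add(names[selected]);
        }
        return picked;
    }

    public static void main(String[] args) {
        // Three providers with weights 5, 1 and 1.
        System.out.println(simulate(new String[] {"A", "B", "C"}, new int[] {5, 1, 1}, 7));
        // prints [A, A, B, A, C, A, A]
    }
}
```

Over seven calls A is chosen five times and B and C once each, and the choices of A are spread out rather than bunched together. After the seventh call all three counters are back at 0, so the cycle repeats.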

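IV. LeastActiveLoadBalance

Least active calls: random among providers with the same active count, where the active count is the difference between the counters before and after a call. Slower providers receive fewer requests, because the slower a provider is, the larger that difference becomes.

Again the source first. It contains plenty of comments that help with understanding: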

protected <T> Invoker<T> doSelect(List<Invoker<T>> invokers, URL url, Invocation invocation) {
        int length = invokers.size(); // Number of invokers
        int leastActive = -1; // The least active value of all invokers
        int leastCount = 0; // The number of invokers having the same least active value (leastActive)
        int[] leastIndexes = new int[length]; // The index of invokers having the same least active value (leastActive)
        int totalWeight = 0; // The sum of with warmup weights
        int firstWeight = 0; // Initial value, used for comparision
        boolean sameWeight = true; // Every invoker has the same weight value?
        for (int i = 0; i < length; i++) {
            Invoker<T> invoker = invokers.get(i);
            int active = RpcStatus.getStatus(invoker.getUrl(), invocation.getMethodName()).getActive(); // Active number
            int afterWarmup = getWeight(invoker, invocation);
            if (leastActive == -1 || active < leastActive) { // Restart, when find a invoker having smaller least active value.
                leastActive = active; // Record the current least active value
                leastCount = 1; // Reset leastCount, count again based on current leastCount
                leastIndexes[0] = i; // Reset
                totalWeight = afterWarmup; // Reset
                firstWeight = afterWarmup; // Record the weight the first invoker
                sameWeight = true; // Reset, every invoker has the same weight value?
            } else if (active == leastActive) { // If current invoker's active value equals with leaseActive, then accumulating.
                leastIndexes[leastCount++] = i; // Record index number of this invoker
                totalWeight += afterWarmup; // Add this invoker's with warmup weight to totalWeight.
                // If every invoker has the same weight?
                if (sameWeight && i > 0
                        && afterWarmup != firstWeight) {
                    sameWeight = false;
                }
            }
        }
        // assert(leastCount > 0)
        if (leastCount == 1) {
            // If we got exactly one invoker having the least active value, return this invoker directly.
            return invokers.get(leastIndexes[0]);
        }
        if (!sameWeight && totalWeight > 0) {
            // If (not every invoker has the same weight & at least one invoker's weight>0), select randomly based on totalWeight.
            int offsetWeight = ThreadLocalRandom.current().nextInt(totalWeight) + 1;
            // Return a invoker based on the random value.
            for (int i = 0; i < leastCount; i++) {
                int leastIndex = leastIndexes[i];
                offsetWeight -= getWeight(invokers.get(leastIndex), invocation);
                if (offsetWeight <= 0) {
                    return invokers.get(leastIndex);
                }
            }
        }
        // If all invokers have the same weight value or totalWeight=0, return evenly.
        return invokers.get(leastIndexes[ThreadLocalRandom.current().nextInt(leastCount)]);
    }

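This code is also fairly long. The parts with English comments need no further explanation, so here are the overall idea and the key points.
1. The first 7 variables are used while iterating over the Invokers.
2. The loop starts by doing two things: it reads the active call count of the current Invoker, and it calculates that Invoker's weight.
3. The key line to understand is if (leastActive == -1 || active < leastActive)
　　Keep in mind that the purpose of this method is to find the Invokers with the fewest active calls. This condition checks whether the current Invoker has fewer active calls than the value recorded in leastActive.

　　　If it does, that smaller value becomes the new baseline against which the remaining Invokers are compared.
4. else if (active == leastActive)

　　　Once the if branch is clear, the else branch follows naturally: an Invoker whose active count equals the current minimum is recorded in the array of least-active indexes, ready for the selection step that comes later.

That covers the main points of the loop over the Invokers. There is one question, though, that you may or may not have noticed while reading.

In int active = RpcStatus.getStatus(invoker.getUrl(), invocation.getMethodName()).getActive(); how is the active call count actually obtained?

With that question in mind, open the RpcStatus class. Its class comment contains the following: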

/**
 * URL statistics. (API, Cached, ThreadSafe)
 *
 * @see org.apache.dubbo.rpc.filter.ActiveLimitFilter
 * @see org.apache.dubbo.rpc.filter.ExecuteLimitFilter
 * @see org.apache.dubbo.rpc.cluster.loadbalance.LeastActiveLoadBalance
 */

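Notice ActiveLimitFilter and ExecuteLimitFilter. Put simply, Dubbo defines a number of Filters of its own, and these two are invoked in the course of a method call.

That makes them a natural place to maintain the atomic active counter, so when balancing by least active calls the value can simply be read from there.

If you want the details, have a look inside these two Filters; they are not covered further here.

Back to the doSelect method. The loop above has done a first round of filtering on the Invokers. What follows is the process of choosing one Invoker from the filtered result.
1. if (leastCount == 1): if only one Invoker has the least active count, return it directly.
2. if (!sameWeight && totalWeight > 0): if several Invokers share the least active count and their weights differ, choose one according to weight.
3. return invokers.get(leastIndexes[ThreadLocalRandom.current().nextInt(leastCount)]): if the weights are the same, choose one at random among those with the same active count.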
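
The two stages (filter by least active count, then choose by weight) can be sketched with plain arrays. This is a simplified illustration rather than Dubbo code: active counts and weights are passed in directly, the random offset is a parameter so the result is reproducible, and the names (LeastActiveDemo, leastActiveIndexes, pickByWeight) are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

class LeastActiveDemo {

    // Stage 1: collect the indexes of all providers that share the smallest active count.
    static List<Integer> leastActiveIndexes(int[] actives) {
        List<Integer> indexes = new ArrayList<>();
        int leastActive = -1;
        for (int i = 0; i < actives.length; i++) {
            if (leastActive == -1 || actives[i] < leastActive) {
                leastActive = actives[i]; // found a smaller value: start over
                indexes.clear();
                indexes.add(i);
            } else if (actives[i] == leastActive) {
                indexes.add(i);           // same value: add to the candidates
            }
        }
        return indexes;
    }

    // Stage 2: weighted choice among the candidates.
    // offsetWeight is expected to be in [1, total weight of the candidates].
    static int pickByWeight(List<Integer> candidates, int[] weights, int offsetWeight) {
        for (int index : candidates) {
            offsetWeight -= weights[index];
            if (offsetWeight <= 0) {
                return index;
            }
        }
        throw new IllegalArgumentException("offsetWeight exceeds the total candidate weight");
    }

    public static void main(String[] args) {
        int[] actives = {3, 1, 2, 1};         // sample active counts
        int[] weights = {100, 200, 100, 100}; // sample weights
        List<Integer> candidates = leastActiveIndexes(actives);
        System.out.println(candidates);                             // prints [1, 3]
        System.out.println(pickByWeight(candidates, weights, 200)); // prints 1
        System.out.println(pickByWeight(candidates, weights, 201)); // prints 3
    }
}
```

Providers 1 and 3 tie with one active call each, so only they take part in the weighted choice: offsets 1 to 200 select provider 1 and offsets 201 to 300 select provider 3.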

V. ConsistentHashLoadBalance

Consistent hash: requests with the same arguments always go to the same provider.

protected <T> Invoker<T> doSelect(List<Invoker<T>> invokers, URL url, Invocation invocation) {
        String methodName = RpcUtils.getMethodName(invocation);
        String key = invokers.get(0).getUrl().getServiceKey() + "." + methodName;
        int identityHashCode = System.identityHashCode(invokers);
        ConsistentHashSelector<T> selector = (ConsistentHashSelector<T>) selectors.get(key);
        if (selector == null || selector.identityHashCode != identityHashCode) {
            selectors.put(key, new ConsistentHashSelector<T>(invokers, methodName, identityHashCode));
            selector = (ConsistentHashSelector<T>) selectors.get(key);
        }
        return selector.select(invocation);
    }

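Again we start with the doSelect method. It uses a ConsistentHashSelector object, a consistent-hash selector, to choose the Invoker. The method itself is simple; the interesting part is the selector.

The theory behind consistent hashing is not repeated here, since there are plenty of articles about it online. Instead, let's look at how Dubbo implements it in code: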

private static final class ConsistentHashSelector<T> {

        private final TreeMap<Long, Invoker<T>> virtualInvokers;

        private final int replicaNumber;

        private final int identityHashCode;

        private final int[] argumentIndex;

        ConsistentHashSelector(List<Invoker<T>> invokers, String methodName, int identityHashCode) {
            this.virtualInvokers = new TreeMap<Long, Invoker<T>>();
            this.identityHashCode = identityHashCode;
            URL url = invokers.get(0).getUrl();
            this.replicaNumber = url.getMethodParameter(methodName, "hash.nodes", 160);
            String[] index = Constants.COMMA_SPLIT_PATTERN.split(url.getMethodParameter(methodName, "hash.arguments", "0"));
            argumentIndex = new int[index.length];
            for (int i = 0; i < index.length; i++) {
                argumentIndex[i] = Integer.parseInt(index[i]);
            }
            for (Invoker<T> invoker : invokers) {
                String address = invoker.getUrl().getAddress();
                for (int i = 0; i < replicaNumber / 4; i++) {
                    byte[] digest = md5(address + i);
                    for (int h = 0; h < 4; h++) {
                        long m = hash(digest, h);
                        virtualInvokers.put(m, invoker);
                    }
                }
            }
        }

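Two parameters matter most in this selector:

hash.nodes: the number of virtual nodes (160 by default in the code above)
hash.arguments: which arguments are used to generate the hash value (by default index 0, the first argument)

The constructor then iterates over every Invoker and generates its virtual nodes, which is easy to follow: for each address it computes replicaNumber / 4 MD5 digests and derives 4 hash values from each digest. At selection time, the Invoker that corresponds to the request's hash is returned.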
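
The ring construction can be tried out with a standalone sketch. Several parts here are assumptions, because the listing above does not show them: the bodies of md5 and hash, and the lookup in select (first virtual node at or after the key's hash, wrapping around to the first node). Providers are reduced to address strings, and the class name HashRingDemo is invented for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class HashRingDemo {

    private final TreeMap<Long, String> virtualNodes = new TreeMap<>();

    HashRingDemo(List<String> addresses, int replicaNumber) {
        for (String address : addresses) {
            for (int i = 0; i < replicaNumber / 4; i++) {
                byte[] digest = md5(address + i);
                for (int h = 0; h < 4; h++) {
                    virtualNodes.put(hash(digest, h), address);
                }
            }
        }
    }

    // Assumed lookup: first virtual node at or after the key's hash, wrapping around the ring.
    String select(String key) {
        Map.Entry<Long, String> entry = virtualNodes.ceilingEntry(hash(md5(key), 0));
        if (entry == null) {
            entry = virtualNodes.firstEntry();
        }
        return entry.getValue();
    }

    // Assumed helper: a plain 16-byte MD5 digest of the UTF-8 bytes.
    static byte[] md5(String value) {
        try {
            return MessageDigest.getInstance("MD5").digest(value.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Assumed helper: an unsigned 32-bit value built from 4 digest bytes starting at number * 4.
    static long hash(byte[] digest, int number) {
        return (((long) (digest[3 + number * 4] & 0xFF) << 24)
                | ((long) (digest[2 + number * 4] & 0xFF) << 16)
                | ((long) (digest[1 + number * 4] & 0xFF) << 8)
                | (digest[number * 4] & 0xFF)) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        List<String> addresses = Arrays.asList("10.0.0.1:20880", "10.0.0.2:20880", "10.0.0.3:20880");
        HashRingDemo ring = new HashRingDemo(addresses, 160);
        // The same key is always routed to the same address.
        System.out.println(ring.select("user-42").equals(ring.select("user-42"))); // prints true
    }
}
```

Spreading each address over many virtual nodes keeps the distribution reasonably even, and when a provider disappears only the keys that were mapped to its virtual nodes move to a different address.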

Posted @ 2020-03-13 16:16 by SyrupzZ
