Nacos 2.0 Source Code Analysis: Distro Protocol Overview

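Note:
This article summarizes my own reading of the Nacos 2.0.1 source code. My understanding may differ from the authors' intent, so I cannot guarantee that everything here is correct. Corrections are welcome. Please credit the source when reposting.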

What is the Distro protocol

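This article looks at Distro, a protocol used in Nacos. Distro was developed inside Alibaba to keep data consistent in a distributed environment. It defines the data format, the parsing rules, and the transport used for communication between nodes. It is a consistency protocol for ephemeral data: the data it manages is kept in memory only.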

What the Distro protocol is used for

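One of Nacos's main roles is that of a distributed service management platform. In a cluster, the service information held by each node can be in a different state. Changes such as a service becoming available or unavailable have to be propagated to the other nodes, and each node's service list has to be kept in sync. Distro is the protocol that synchronizes service data between nodes. Beyond synchronization in the literal sense, each node also uses it to report the state of its own services to the others, which can be seen as another form of synchronization.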

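This article does not cover how the protocol operates in detail; for that, see the article "Distro Protocol Explained" (《Distro协议详解》). Here we only walk through the Distro-related classes in Nacos to see what the protocol can do. Searching in IDEA for classes whose names start with Distro turns up 30 classes, spread across the nacos-core, nacos-naming, and nacos-config modules. This article covers only the nacos-core module, because it already contains the complete flow of the protocol.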

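Tip:
Keep one keyword in mind: synchronization. Synchronization simply means either pulling data from a remote node to the local node or pushing local data to a remote node. The data being synchronized here is service data; after all, the official documentation describes the service as a first-class citizen of the Nacos world.
Everything introduced in this article serves that one concept.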

Core components of the Distro protocol

The nacos-core module defines the components of the Distro protocol.

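distro
    component               Distro components, e.g. data storage, data processors, transport agents
    entity                  Entity objects
    exception               Exception handling
    task                    Task-related classes
        delay               Delayed-task components
        execute             Task executor components
        load                Load-task components
        verify              Verify-task components
    DistroConfig.java       Distro configuration
    DistroConstants.java    Distro constants
    DistroProtocol.java     Distro protocol entry point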

com.alibaba.nacos.core.distributed.distro.component

DistroCallback

The Distro callback interface, used wherever a callback is needed after asynchronous processing.

public interface DistroCallback {
    
    /**
     * Callback when distro task execute successfully.
     */
    void onSuccess();
    
    /**
     * Callback when distro task execute failed.
     *
     * @param throwable throwable if execute failed caused by exception
     */
    void onFailed(Throwable throwable);
}

DistroComponentHolder

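The Distro component holder. It defines several containers (HashMap) that store what the Distro protocol needs, keyed by data type, and acts as a central registry for the other classes.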

@Component
public class DistroComponentHolder {
	
    // Transport agents, keyed by DistroData type
    private final Map<String, DistroTransportAgent> transportAgentMap = new HashMap<>();
    // Data storages, keyed by DistroData type
    private final Map<String, DistroDataStorage> dataStorageMap = new HashMap<>();
    // Failed-task handlers, keyed by DistroData type
    private final Map<String, DistroFailedTaskHandler> failedTaskHandlerMap = new HashMap<>();
    // Data processors, keyed by DistroData type
    private final Map<String, DistroDataProcessor> dataProcessorMap = new HashMap<>();
    
    public DistroTransportAgent findTransportAgent(String type) {
        return transportAgentMap.get(type);
    }
    
    public void registerTransportAgent(String type, DistroTransportAgent transportAgent) {
        transportAgentMap.put(type, transportAgent);
    }
    
    public DistroDataStorage findDataStorage(String type) {
        return dataStorageMap.get(type);
    }
    
    public void registerDataStorage(String type, DistroDataStorage dataStorage) {
        dataStorageMap.put(type, dataStorage);
    }
    
    public Set<String> getDataStorageTypes() {
        return dataStorageMap.keySet();
    }
    
    public DistroFailedTaskHandler findFailedTaskHandler(String type) {
        return failedTaskHandlerMap.get(type);
    }
    
    public void registerFailedTaskHandler(String type, DistroFailedTaskHandler failedTaskHandler) {
        failedTaskHandlerMap.put(type, failedTaskHandler);
    }
    
    public void registerDataProcessor(DistroDataProcessor dataProcessor) {
        dataProcessorMap.putIfAbsent(dataProcessor.processType(), dataProcessor);
    }
    
    public DistroDataProcessor findDataProcessor(String processType) {
        return dataProcessorMap.get(processType);
    }
}
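
To make the registration behaviour concrete, here is a minimal sketch of the same pattern. HolderSketch and the type name demoType are made up for illustration and are not Nacos classes; plain strings stand in for the real component objects. One detail in the listing above is easy to miss: registerDataProcessor uses putIfAbsent, so the first processor registered for a type wins, while the other register methods use put, so the last registration wins.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, stripped-down stand-in for DistroComponentHolder.
final class HolderSketch {

    private final Map<String, String> transportAgents = new HashMap<>();
    private final Map<String, String> dataProcessors = new HashMap<>();

    // Mirrors registerTransportAgent: put() overwrites an earlier registration.
    void registerTransportAgent(String type, String agent) {
        transportAgents.put(type, agent);
    }

    // Mirrors registerDataProcessor: putIfAbsent() keeps the first registration.
    void registerDataProcessor(String type, String processor) {
        dataProcessors.putIfAbsent(type, processor);
    }

    // Returns null for an unknown type, so callers have to null-check.
    String findTransportAgent(String type) {
        return transportAgents.get(type);
    }

    String findDataProcessor(String type) {
        return dataProcessors.get(type);
    }

    public static void main(String[] args) {
        HolderSketch holder = new HolderSketch();
        holder.registerTransportAgent("demoType", "agentA");
        holder.registerTransportAgent("demoType", "agentB");
        holder.registerDataProcessor("demoType", "processorA");
        holder.registerDataProcessor("demoType", "processorB");
        System.out.println(holder.findTransportAgent("demoType")); // agentB
        System.out.println(holder.findDataProcessor("demoType"));  // processorA
        System.out.println(holder.findTransportAgent("unknown"));  // null
    }
}
```

Because a lookup for an unregistered type returns null, the callers that appear later in this article check the result before using it.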

DistroDataProcessor

Processes the data objects of the Distro protocol.

/**
 * Distro data processor.
 *
 * @author xiweng.yy
 */
public interface DistroDataProcessor {
    
    /**
     * Process type of this processor.
     * The data type this processor is able to handle.
     * @return type of this processor
     */
    String processType();
    
    /**
     * Process received data.
     * Handles data received from another node.
     * @param distroData received data
     * @return true if process data successfully, otherwise false
     */
    boolean processData(DistroData distroData);
    
    /**
     * Process received verify data.
     * Handles received data of the verify type.
     * @param distroData    verify data
     * @param sourceAddress source server address, might be get data from source server
     * @return true if the data is available, otherwise false
     */
    boolean processVerifyData(DistroData distroData, String sourceAddress);
    
    /**
     * Process snapshot data.
     * Handles the full snapshot pulled from another node.
     * @param distroData snapshot data
     * @return true if process data successfully, otherwise false
     */
    boolean processSnapshot(DistroData distroData);
}

DistroDataStorage

Storage for DistroData.

public interface DistroDataStorage {
    
    /**
     * Set this distro data storage has finished initial step.
     * Marks this storage as having finished initializing its DistroData.
     */
    void finishInitial();
    
    /**
     * Whether this distro data is finished initial.
     * Whether this storage has finished initializing its DistroData.
     * <p>If not finished, this data storage should not send verify data to other node.
     *
     * @return {@code true} if finished, otherwise false
     */
    boolean isFinishInitial();
    
    /**
     * Get distro datum.
     * Returns the DistroData stored under the given key.
     * @param distroKey key of distro datum
     * @return need to sync datum
     */
    DistroData getDistroData(DistroKey distroKey);
    
    /**
     * Get all distro datum snapshot.
     * Returns all DistroData held by this storage.
     * @return all datum
     */
    DistroData getDatumSnapshot();
    
    /**
     * Get verify datum.
     * Returns all DistroData used for verification.
     * @return verify datum
     */
    List<DistroData> getVerifyData();
}

DistroFailedTaskHandler

Handles retries for failed Distro tasks.

public interface DistroFailedTaskHandler {
    
    /**
     * Build retry task when distro task execute failed.
     * Lets the handler create a retry task when a Distro task fails.
     * @param distroKey distro key of failed task
     * @param action action of task
     */
    void retry(DistroKey distroKey, DataOperation action);
}

DistroTransportAgent

The transport agent for DistroData, used to send requests.

public interface DistroTransportAgent {
    
    /**
     * Whether support transport data with callback.
     * Whether callbacks are supported.
     * @return true if support, otherwise false
     */
    boolean supportCallbackTransport();
    
    /**
     * Sync data.
     * Synchronizes data to a target server.
     * @param data         data
     * @param targetServer target server
     * @return true is sync successfully, otherwise false
     */
    boolean syncData(DistroData data, String targetServer);
    
    /**
     * Sync data with callback.
     * Synchronizes data and reports the result through a callback.
     * @param data         data
     * @param targetServer target server
     * @param callback     callback
     * @throws UnsupportedOperationException if method supportCallbackTransport is false, should throw {@code
     *                                       UnsupportedOperationException}
     */
    void syncData(DistroData data, String targetServer, DistroCallback callback);
    
    /**
     * Sync verify data.
     * Synchronizes verify data to a target server.
     * @param verifyData   verify data
     * @param targetServer target server
     * @return true is verify successfully, otherwise false
     */
    boolean syncVerifyData(DistroData verifyData, String targetServer);
    
    /**
     * Sync verify data.
     * Synchronizes verify data and reports the result through a callback.
     * @param verifyData   verify data
     * @param targetServer target server
     * @param callback     callback
     * @throws UnsupportedOperationException if method supportCallbackTransport is false, should throw {@code
     *                                       UnsupportedOperationException}
     */
    void syncVerifyData(DistroData verifyData, String targetServer, DistroCallback callback);
    
    /**
     * get Data from target server.
     * Fetches the data for the given key from a remote node.
     * @param key          key of data
     * @param targetServer target server
     * @return distro data
     */
    DistroData getData(DistroKey key, String targetServer);
    
    /**
     * Get all datum snapshot from target server.
     * Fetches the full snapshot data from a remote node.
     * @param targetServer target server.
     * @return distro data
     */
    DistroData getDatumSnapshot(String targetServer);
}

com.alibaba.nacos.core.distributed.distro.entity

This package holds the data objects of the Distro protocol.

DistroData

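The core object of the Distro protocol. All data exchanged by the protocol is carried in this object. It can also be viewed as a container, and it will show up often later on.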

public class DistroData {
    // Key of the data
    private DistroKey distroKey;
    // Operation type: which operation produced this data, or which operation the data is meant for
    private DataOperation type;
    // Data content as a byte array
    private byte[] content;
    
    public DistroData() {
    }
    
    public DistroData(DistroKey distroKey, byte[] content) {
        this.distroKey = distroKey;
        this.content = content;
    }
    
    public DistroKey getDistroKey() {
        return distroKey;
    }
    
    public void setDistroKey(DistroKey distroKey) {
        this.distroKey = distroKey;
    }
    
    public DataOperation getType() {
        return type;
    }
    
    public void setType(DataOperation type) {
        this.type = type;
    }
    
    public byte[] getContent() {
        return content;
    }
    
    public void setContent(byte[] content) {
        this.content = content;
    }
}

DistroKey

The key object for DistroData; it can carry several attributes.

public class DistroKey {
    // Key of the data itself
    private String resourceKey;
    // Type of the data
    private String resourceType;
    // Target server the data is sent to
    private String targetServer;
    
    public DistroKey() {
    }
    
    public DistroKey(String resourceKey, String resourceType) {
        this.resourceKey = resourceKey;
        this.resourceType = resourceType;
    }
    
    public DistroKey(String resourceKey, String resourceType, String targetServer) {
        this.resourceKey = resourceKey;
        this.resourceType = resourceType;
        this.targetServer = targetServer;
    }
    
    public String getResourceKey() {
        return resourceKey;
    }
    
    public void setResourceKey(String resourceKey) {
        this.resourceKey = resourceKey;
    }
    
    public String getResourceType() {
        return resourceType;
    }
    
    public void setResourceType(String resourceType) {
        this.resourceType = resourceType;
    }
    
    public String getTargetServer() {
        return targetServer;
    }
    
    public void setTargetServer(String targetServer) {
        this.targetServer = targetServer;
    }
    
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        DistroKey distroKey = (DistroKey) o;
        return Objects.equals(resourceKey, distroKey.resourceKey) && Objects
                .equals(resourceType, distroKey.resourceType) && Objects.equals(targetServer, distroKey.targetServer);
    }
    
    @Override
    public int hashCode() {
        return Objects.hash(resourceKey, resourceType, targetServer);
    }
    
    @Override
    public String toString() {
        return "DistroKey{" + "resourceKey='" + resourceKey + '\'' + ", resourceType='" + resourceType + '\''
                + ", targetServer='" + targetServer + '\'' + '}';
    }
}
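
Because targetServer takes part in equals and hashCode, two keys for the same resource but different target servers are different map keys. The sketch below shows this with a cut-down copy of the class. KeySketch is a hypothetical name, and the resource key, type, and addresses are invented sample values.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical cut-down copy of DistroKey, keeping only the equality logic.
final class KeySketch {

    private final String resourceKey;
    private final String resourceType;
    private final String targetServer;

    KeySketch(String resourceKey, String resourceType, String targetServer) {
        this.resourceKey = resourceKey;
        this.resourceType = resourceType;
        this.targetServer = targetServer;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        KeySketch other = (KeySketch) o;
        return Objects.equals(resourceKey, other.resourceKey)
                && Objects.equals(resourceType, other.resourceType)
                && Objects.equals(targetServer, other.targetServer);
    }

    @Override
    public int hashCode() {
        return Objects.hash(resourceKey, resourceType, targetServer);
    }

    // Counts the distinct map entries produced by queuing the given keys.
    static int distinctCount(KeySketch... keys) {
        Map<KeySketch, String> pending = new HashMap<>();
        for (KeySketch key : keys) {
            pending.put(key, "task");
        }
        return pending.size();
    }

    public static void main(String[] args) {
        KeySketch toNodeA = new KeySketch("client-1", "demoType", "10.0.0.1:8848");
        KeySketch toNodeB = new KeySketch("client-1", "demoType", "10.0.0.2:8848");
        KeySketch toNodeAAgain = new KeySketch("client-1", "demoType", "10.0.0.1:8848");
        System.out.println(distinctCount(toNodeA, toNodeB, toNodeAAgain)); // 2
    }
}
```

In other words, a map keyed this way holds at most one entry per combination of resource, type, and target server.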

com.alibaba.nacos.core.distributed.distro.exception

com.alibaba.nacos.core.distributed.distro.task

DistroTaskEngineHolder

The holder for the Distro task engines; it manages the different kinds of task execution engines.

@Component
public class DistroTaskEngineHolder {
    // Delayed-task execution engine
    private final DistroDelayTaskExecuteEngine delayTaskExecuteEngine = new DistroDelayTaskExecuteEngine();
    // Immediate (non-delayed) task execution engine
    private final DistroExecuteTaskExecuteEngine executeWorkersManager = new DistroExecuteTaskExecuteEngine();
    
    public DistroTaskEngineHolder(DistroComponentHolder distroComponentHolder) {
        // Set the default task processor on the delayed-task engine
        DistroDelayTaskProcessor defaultDelayTaskProcessor = new DistroDelayTaskProcessor(this, distroComponentHolder);
        delayTaskExecuteEngine.setDefaultTaskProcessor(defaultDelayTaskProcessor);
    }
    
    public DistroDelayTaskExecuteEngine getDelayTaskExecuteEngine() {
        return delayTaskExecuteEngine;
    }
    
    public DistroExecuteTaskExecuteEngine getExecuteWorkersManager() {
        return executeWorkersManager;
    }
    
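    /**
     * Registers a task processor with the delayed-task engine under the given key.
     * @param key                key under which the processor is stored
     * @param nacosTaskProcessor the processor object
     */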
    public void registerNacosTaskProcessor(Object key, NacosTaskProcessor nacosTaskProcessor) {
        this.delayTaskExecuteEngine.addProcessor(key, nacosTaskProcessor);
    }
}

com.alibaba.nacos.core.distributed.distro.task.delay

DistroDelayTask

The Distro delayed task.

public class DistroDelayTask extends AbstractDelayTask {
    // Key of the data this task handles
    private final DistroKey distroKey;
    // Operation type of the data this task handles
    private DataOperation action;
    // Creation time of this task
    private long createTime;
    
    public DistroDelayTask(DistroKey distroKey, long delayTime) {
        this(distroKey, DataOperation.CHANGE, delayTime);
    }
    
    // Builds a delayed task
    public DistroDelayTask(DistroKey distroKey, DataOperation action, long delayTime) {
        this.distroKey = distroKey;
        this.action = action;
        this.createTime = System.currentTimeMillis();
        // Initialize the last-process time to the creation time
        setLastProcessTime(createTime);
        // Set how long to wait before the task runs
        setTaskInterval(delayTime);
    }
    
    public DistroKey getDistroKey() {
        return distroKey;
    }
    
    public DataOperation getAction() {
        return action;
    }
    
    public long getCreateTime() {
        return createTime;
    }
    
    /**
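     * Literally "merge tasks"; in practice it refreshes a task that has gone stale.
     * When a new task is added, the engine looks up the task list by the new task's key. A non-null result
     * means a task for that key already exists, and adding it again would duplicate it, so the two are merged here.
     * Why can the new task (this object) be stale? See
     * {@link com.alibaba.nacos.common.task.engine.NacosDelayTaskExecuteEngine#addTask(Object,
     *  AbstractDelayTask)}: adding a task takes a lock, so the order in which tasks are added is not
     * guaranteed to match the order in which they were created.
     * @param task the task that already exists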
     */
    @Override
    public void merge(AbstractDelayTask task) {
        if (!(task instanceof DistroDelayTask)) {
            return;
        }
        DistroDelayTask oldTask = (DistroDelayTask) task;
        // If the two tasks have different actions and the new task was created before the old one,
        // the new task is stale: adopt the old task's action and creation time instead.
        if (!action.equals(oldTask.getAction()) && createTime < oldTask.getCreateTime()) {
            action = oldTask.getAction();
            createTime = oldTask.getCreateTime();
        }
        setLastProcessTime(oldTask.getLastProcessTime());
    }
}
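
The merge rule is easier to follow with concrete numbers. The sketch below reproduces only the fields that merge() touches. MergeSketch and the timestamps are invented for illustration, and the enum is a stand-in for DataOperation.

```java
// Hypothetical miniature of DistroDelayTask, keeping only the fields merge() touches.
final class MergeSketch {

    enum Op { ADD, CHANGE, DELETE }

    Op action;
    long createTime;
    long lastProcessTime;

    MergeSketch(Op action, long createTime) {
        this.action = action;
        this.createTime = createTime;
        this.lastProcessTime = createTime;
    }

    // Same rule as DistroDelayTask.merge: "this" is the task being added,
    // "old" is the task already queued under the same key.
    void merge(MergeSketch old) {
        if (action != old.action && createTime < old.createTime) {
            action = old.action;
            createTime = old.createTime;
        }
        lastProcessTime = old.lastProcessTime;
    }

    public static void main(String[] args) {
        MergeSketch queued = new MergeSketch(Op.DELETE, 200L);

        // Created earlier than the queued task but added later: it is stale.
        MergeSketch lateArrival = new MergeSketch(Op.CHANGE, 100L);
        lateArrival.merge(queued);
        System.out.println(lateArrival.action + " " + lateArrival.createTime); // DELETE 200

        // Created after the queued task: it keeps its own action.
        MergeSketch newer = new MergeSketch(Op.CHANGE, 300L);
        newer.merge(queued);
        System.out.println(newer.action + " " + newer.createTime + " " + newer.lastProcessTime); // CHANGE 300 200
    }
}
```

In both cases the last-process time is inherited from the queued task, so replacing a task does not reset how long the key has already been waiting.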

DistroDelayTaskExecuteEngine

The delayed-task execution engine.

public class DistroDelayTaskExecuteEngine extends NacosDelayTaskExecuteEngine {
    
    public DistroDelayTaskExecuteEngine() {
        super(DistroDelayTaskExecuteEngine.class.getName(), Loggers.DISTRO);
    }
    
    @Override
    public void addProcessor(Object key, NacosTaskProcessor taskProcessor) {
        // Resolve the actual key under which the processor is stored
        Object actualKey = getActualKey(key);
        super.addProcessor(actualKey, taskProcessor);
    }
    
    @Override
    public NacosTaskProcessor getProcessor(Object key) {
        Object actualKey = getActualKey(key);
        return super.getProcessor(actualKey);
    }
    
    private Object getActualKey(Object key) {
        return key instanceof DistroKey ? ((DistroKey) key).getResourceType() : key;
    }
}
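
The effect of getActualKey is that processors are stored per resource type, while lookups can be made with a full DistroKey. Below is a minimal sketch of that idea. ProcessorLookupSketch, its nested Key class, and the type names are hypothetical, and strings stand in for processors. The fallback to a default processor is my reading of the parent engine and should be checked against the Nacos source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of type-based processor lookup; strings stand in for processors.
final class ProcessorLookupSketch {

    // Minimal stand-in for DistroKey: only the resource type matters here.
    static final class Key {
        final String resourceKey;
        final String resourceType;

        Key(String resourceKey, String resourceType) {
            this.resourceKey = resourceKey;
            this.resourceType = resourceType;
        }
    }

    private final Map<Object, String> processors = new HashMap<>();
    private final String defaultProcessor;

    ProcessorLookupSketch(String defaultProcessor) {
        this.defaultProcessor = defaultProcessor;
    }

    // Same idea as getActualKey: a Key collapses to its resource type.
    private static Object actualKey(Object key) {
        return key instanceof Key ? ((Key) key).resourceType : key;
    }

    void addProcessor(Object key, String processor) {
        processors.put(actualKey(key), processor);
    }

    // ASSUMPTION: the default-processor fallback mirrors the parent engine.
    String getProcessor(Object key) {
        return processors.getOrDefault(actualKey(key), defaultProcessor);
    }

    public static void main(String[] args) {
        ProcessorLookupSketch engine = new ProcessorLookupSketch("defaultProcessor");
        engine.addProcessor("typeA", "processorA");
        System.out.println(engine.getProcessor(new Key("res-1", "typeA"))); // processorA
        System.out.println(engine.getProcessor(new Key("res-2", "typeA"))); // processorA
        System.out.println(engine.getProcessor(new Key("res-3", "typeB"))); // defaultProcessor
    }
}
```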

DistroDelayTaskProcessor

The delayed-task processor.

/**
 * Distro delay task processor.
 *
 * @author xiweng.yy
 */
public class DistroDelayTaskProcessor implements NacosTaskProcessor {
    // Distro task engine holder
    private final DistroTaskEngineHolder distroTaskEngineHolder;
    // Distro component holder
    private final DistroComponentHolder distroComponentHolder;
    
    public DistroDelayTaskProcessor(DistroTaskEngineHolder distroTaskEngineHolder,
            DistroComponentHolder distroComponentHolder) {
        this.distroTaskEngineHolder = distroTaskEngineHolder;
        this.distroComponentHolder = distroComponentHolder;
    }
    
    @Override
    public boolean process(NacosTask task) {
        // Ignore tasks that are not a DistroDelayTask
        if (!(task instanceof DistroDelayTask)) {
            return true;
        }
        DistroDelayTask distroDelayTask = (DistroDelayTask) task;
        DistroKey distroKey = distroDelayTask.getDistroKey();
        // Create the concrete task that matches the operation type
        switch (distroDelayTask.getAction()) {
            case DELETE:
                DistroSyncDeleteTask syncDeleteTask = new DistroSyncDeleteTask(distroKey, distroComponentHolder);
                distroTaskEngineHolder.getExecuteWorkersManager().addTask(distroKey, syncDeleteTask);
                return true;
            case CHANGE:
            case ADD:
                DistroSyncChangeTask syncChangeTask = new DistroSyncChangeTask(distroKey, distroComponentHolder);
                distroTaskEngineHolder.getExecuteWorkersManager().addTask(distroKey, syncChangeTask);
                return true;
            default:
                return false;
        }
    }
}

com.alibaba.nacos.core.distributed.distro.task.execute

AbstractDistroExecuteTask

The abstract execute task; it defines the task processing flow.

public abstract class AbstractDistroExecuteTask extends AbstractExecuteTask {
    
    private final DistroKey distroKey;
    
    private final DistroComponentHolder distroComponentHolder;
    
    protected AbstractDistroExecuteTask(DistroKey distroKey, DistroComponentHolder distroComponentHolder) {
        this.distroKey = distroKey;
        this.distroComponentHolder = distroComponentHolder;
    }
    
    protected DistroKey getDistroKey() {
        return distroKey;
    }
    
    protected DistroComponentHolder getDistroComponentHolder() {
        return distroComponentHolder;
    }
    
    @Override
    public void run() {
        // Resource type of the data being handled
        String type = getDistroKey().getResourceType();
        // Look up the transport agent for that type
        DistroTransportAgent transportAgent = distroComponentHolder.findTransportAgent(type);
        if (null == transportAgent) {
            Loggers.DISTRO.warn("No found transport agent for type [{}]", type);
            return;
        }
        Loggers.DISTRO.info("[DISTRO-START] {}", toString());
        // Check whether the agent supports callbacks
        if (transportAgent.supportCallbackTransport()) {
            doExecuteWithCallback(new DistroExecuteCallback());
        } else {
            executeDistroTask();
        }
    }
    
    // Runs the task synchronously (no callback)
    private void executeDistroTask() {
        try {
            boolean result = doExecute();
            if (!result) {
                // The task reported failure, so run the failure handling
                handleFailedTask();
            }
            Loggers.DISTRO.info("[DISTRO-END] {} result: {}", toString(), result);
        } catch (Exception e) {
            Loggers.DISTRO.warn("[DISTRO] Sync data change failed.", e);
            // The task threw an exception, so run the failure handling
            handleFailedTask();
        }
    }
    
    /**
     * Get {@link DataOperation} for current task.
     *
     * @return data operation
     */
    protected abstract DataOperation getDataOperation();
    
    /**
     * Do execute for different sub class.
     *
     * @return result of execute
     */
    protected abstract boolean doExecute();
    
    /**
     * Do execute with callback for different sub class.
     *
     * @param callback callback
     */
    protected abstract void doExecuteWithCallback(DistroCallback callback);
    
    /**
     * Handle failed task.
     * Looks up the failed-task handler for the resource type and asks it to retry.
     */
    protected void handleFailedTask() {
        String type = getDistroKey().getResourceType();
        // Retry through the failed-task handler
        DistroFailedTaskHandler failedTaskHandler = distroComponentHolder.findFailedTaskHandler(type);
        if (null == failedTaskHandler) {
            Loggers.DISTRO.warn("[DISTRO] Can't find failed task for type {}, so discarded", type);
            return;
        }
        failedTaskHandler.retry(getDistroKey(), getDataOperation());
    }
    
    private class DistroExecuteCallback implements DistroCallback {
        
        @Override
        public void onSuccess() {
            Loggers.DISTRO.info("[DISTRO-END] {} result: true", getDistroKey().toString());
        }
        
        @Override
        public void onFailed(Throwable throwable) {
            if (null == throwable) {
                Loggers.DISTRO.info("[DISTRO-END] {} result: false", getDistroKey().toString());
            } else {
                Loggers.DISTRO.warn("[DISTRO] Sync data change failed.", throwable);
            }
            handleFailedTask();
        }
    }
}
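
The flow in run() can be condensed into a small template-method sketch: pick the callback path or the synchronous path, and send every failure to the retry handler. ExecuteFlowSketch, its nested Callback interface, and the events list are hypothetical. Real tasks call a DistroTransportAgent and a DistroFailedTaskHandler instead of recording strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of the AbstractDistroExecuteTask flow; not Nacos code.
abstract class ExecuteFlowSketch implements Runnable {

    interface Callback {
        void onSuccess();

        void onFailed(Throwable throwable);
    }

    // Records what happened so the flow can be inspected afterwards.
    final List<String> events = new ArrayList<>();

    private final boolean callbackSupported;

    ExecuteFlowSketch(boolean callbackSupported) {
        this.callbackSupported = callbackSupported;
    }

    protected abstract boolean doExecute();

    protected abstract void doExecuteWithCallback(Callback callback);

    @Override
    public void run() {
        if (callbackSupported) {
            // Asynchronous path: the outcome arrives through the callback.
            doExecuteWithCallback(new Callback() {
                @Override
                public void onSuccess() {
                    events.add("end:true");
                }

                @Override
                public void onFailed(Throwable throwable) {
                    events.add("end:false");
                    handleFailedTask();
                }
            });
            return;
        }
        // Synchronous path: a false result or an exception both trigger a retry.
        try {
            boolean result = doExecute();
            if (!result) {
                handleFailedTask();
            }
            events.add("end:" + result);
        } catch (RuntimeException e) {
            handleFailedTask();
        }
    }

    // Stand-in for DistroFailedTaskHandler.retry(...).
    private void handleFailedTask() {
        events.add("retry");
    }

    // Builds a task whose transport always succeeds or always fails.
    static ExecuteFlowSketch fixedResult(boolean callbackSupported, boolean success) {
        return new ExecuteFlowSketch(callbackSupported) {
            @Override
            protected boolean doExecute() {
                return success;
            }

            @Override
            protected void doExecuteWithCallback(Callback callback) {
                if (success) {
                    callback.onSuccess();
                } else {
                    callback.onFailed(null);
                }
            }
        };
    }

    public static void main(String[] args) {
        ExecuteFlowSketch task = fixedResult(false, false);
        task.run();
        System.out.println(task.events); // [retry, end:false]
    }
}
```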

DistroExecuteTaskExecuteEngine

The execution engine that runs Distro execute tasks.

package com.alibaba.nacos.core.distributed.distro.task.execute;


public class DistroExecuteTaskExecuteEngine extends NacosExecuteTaskExecuteEngine {
    
    // Simply creates a new NacosExecuteTaskExecuteEngine
    public DistroExecuteTaskExecuteEngine() {
        super(DistroExecuteTaskExecuteEngine.class.getSimpleName(), Loggers.DISTRO);
    }
}

NacosExecuteTaskExecuteEngine

package com.alibaba.nacos.common.task.engine;

/**
 * Nacos execute task execute engine.
 * The Nacos engine that runs execute (immediate) tasks.
 * @author xiweng.yy
 */
public class NacosExecuteTaskExecuteEngine extends AbstractNacosTaskExecuteEngine<AbstractExecuteTask> {
    
    // Task workers
    private final TaskExecuteWorker[] executeWorkers;
    
    public NacosExecuteTaskExecuteEngine(String name, Logger logger) {
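        // The worker count is derived from the CPU core count. The argument is the thread-to-core multiple (1 here);
        // as far as I can tell, the result is rounded up to a power of two.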
        this(name, logger, ThreadUtils.getSuitableThreadCount(1));
    }
    
    public NacosExecuteTaskExecuteEngine(String name, Logger logger, int dispatchWorkerCount) {
        super(logger);
        // Create the group of task workers
        executeWorkers = new TaskExecuteWorker[dispatchWorkerCount];
        for (int mod = 0; mod < dispatchWorkerCount; ++mod) {
            executeWorkers[mod] = new TaskExecuteWorker(name, mod, dispatchWorkerCount, getEngineLog());
        }
    }
    
    @Override
    public int size() {
        int result = 0;
        for (TaskExecuteWorker each : executeWorkers) {
            result += each.pendingTaskCount();
        }
        return result;
    }
    
    @Override
    public boolean isEmpty() {
        return 0 == size();
    }
    
    @Override
    public void addTask(Object tag, AbstractExecuteTask task) {
        // Look up a task processor in the parent class
        NacosTaskProcessor processor = getProcessor(tag);
        // If a processor exists, let it handle the task
        if (null != processor) {
            processor.process(task);
            return;
        }
        // Otherwise hand the task to a worker
        TaskExecuteWorker worker = getWorker(tag);
        worker.process(task);
    }
    
    private TaskExecuteWorker getWorker(Object tag) {
        // Work out which worker should handle this task
        int idx = (tag.hashCode() & Integer.MAX_VALUE) % workersCount();
        return executeWorkers[idx];
    }
    
    private int workersCount() {
        return executeWorkers.length;
    }
    
    @Override
    public AbstractExecuteTask removeTask(Object key) {
        throw new UnsupportedOperationException("ExecuteTaskEngine do not support remove task");
    }
    
    @Override
    public Collection<Object> getAllTaskKeys() {
        throw new UnsupportedOperationException("ExecuteTaskEngine do not support get all task keys");
    }
    
    @Override
    public void shutdown() throws NacosException {
        for (TaskExecuteWorker each : executeWorkers) {
            each.shutdown();
        }
    }
    
    /**
     * Get workers status.
     *
     * @return workers status string
     */
    public String workersStatus() {
        StringBuilder sb = new StringBuilder();
        for (TaskExecuteWorker worker : executeWorkers) {
            sb.append(worker.status()).append("\n");
        }
        return sb.toString();
    }
}
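
Two small calculations decide how tasks are spread over workers: the number of workers and the index chosen for a tag. The sketch below isolates both. WorkerIndexSketch is a hypothetical helper, and suitableThreadCount is written from my recollection of ThreadUtils.getSuitableThreadCount (smallest power of two that is not below cores times the multiple), so treat it as an assumption to verify. workerIndex copies the expression used in getWorker.

```java
// Hypothetical helper mirroring how the engine sizes and selects workers.
final class WorkerIndexSketch {

    // ASSUMPTION: my recollection of ThreadUtils.getSuitableThreadCount(int):
    // the smallest power of two >= cores * multiple. Check the Nacos source.
    static int suitableThreadCount(int cores, int multiple) {
        int count = 1;
        while (count < cores * multiple) {
            count <<= 1;
        }
        return count;
    }

    // Same expression as getWorker(): clear the sign bit, then take the modulus.
    static int workerIndex(Object tag, int workers) {
        return (tag.hashCode() & Integer.MAX_VALUE) % workers;
    }

    public static void main(String[] args) {
        System.out.println(suitableThreadCount(6, 1));           // 8
        System.out.println(workerIndex("a", 8));                 // 1
        // A negative hash code still maps to a valid index thanks to the mask.
        System.out.println(workerIndex(Integer.valueOf(-1), 8)); // 7
    }
}
```

Masking with Integer.MAX_VALUE matters because a plain modulus of a negative hash code would be negative and could not be used as an array index. Since the index depends only on the tag's hash code, tasks added with equal tags always land on the same worker.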

TaskExecuteWorker

package com.alibaba.nacos.common.task.engine;

/**
 * Nacos execute task execute worker.
 * Nacos task worker. On creation, each worker starts an InnerWorker thread that keeps taking tasks from the internal queue.
 * @author xiweng.yy
 */
public final class TaskExecuteWorker implements NacosTaskProcessor, Closeable {

    /**
     * Max task queue size 32768.
     * The queue holds at most 32768 tasks.
     */
    private static final int QUEUE_CAPACITY = 1 << 15;

    private final Logger log;

    /**
     * Name of this worker
     */
    private final String name;

    /**
     * Queue of tasks waiting to be run
     */
    private final BlockingQueue<Runnable> queue;

    /**
     * Whether this worker has been closed
     */
    private final AtomicBoolean closed;

    public TaskExecuteWorker(final String name, final int mod, final int total) {
        this(name, mod, total, null);
    }

    public TaskExecuteWorker(final String name, final int mod, final int total, final Logger logger) {
        /**
         * Worker names, using DistroExecuteTaskExecuteEngine with 8 workers as an example:
         * DistroExecuteTaskExecuteEngine_0%8
         * DistroExecuteTaskExecuteEngine_1%8
         * DistroExecuteTaskExecuteEngine_2%8
         * DistroExecuteTaskExecuteEngine_3%8
         * DistroExecuteTaskExecuteEngine_4%8
         * DistroExecuteTaskExecuteEngine_5%8
         * DistroExecuteTaskExecuteEngine_6%8
         * DistroExecuteTaskExecuteEngine_7%8
         */
        this.name = name + "_" + mod + "%" + total;
        this.queue = new ArrayBlockingQueue<Runnable>(QUEUE_CAPACITY);
        this.closed = new AtomicBoolean(false);
        this.log = null == logger ? LoggerFactory.getLogger(TaskExecuteWorker.class) : logger;
        // Start a new thread to consume the queue (as written, it is named with the raw name argument, not this.name)
        new InnerWorker(name).start();
    }

    public String getName() {
        return name;
    }

    @Override
    public boolean process(NacosTask task) {
        if (task instanceof AbstractExecuteTask) {
            putTask((Runnable) task);
        }
        return true;
    }

    private void putTask(Runnable task) {
        try {
            queue.put(task);
        } catch (InterruptedException ire) {
            log.error(ire.toString(), ire);
        }
    }

    public int pendingTaskCount() {
        return queue.size();
    }

    /**
     * Worker status.
     */
    public String status() {
        return name + ", pending tasks: " + pendingTaskCount();
    }

    @Override
    public void shutdown() throws NacosException {
        queue.clear();
        closed.compareAndSet(false, true);
    }

    /**
     * Inner execute worker.
     */
    private class InnerWorker extends Thread {

        InnerWorker(String name) {
            setDaemon(false);
            setName(name);
        }

        @Override
        public void run() {
            // Keep running until the worker is closed
            while (!closed.get()) {
                try {
                    // Take a task from the queue
                    Runnable task = queue.take();
                    long begin = System.currentTimeMillis();
                    // Run the task on this InnerWorker thread
                    task.run();
                    long duration = System.currentTimeMillis() - begin;
                    // Warn if the task took longer than 1 second
                    if (duration > 1000L) {
                        log.warn("task {} takes {}ms", task, duration);
                    }
                } catch (Throwable e) {
                    log.error("[TASK-FAILED] " + e.toString(), e);
                }
            }
        }
    }
}
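
The worker is essentially a single thread draining a bounded queue, which means tasks handed to one worker run one at a time in submission order. The sketch below shows that pattern on its own. QueueWorkerSketch is a hypothetical class, the queue capacity of 16 is arbitrary, and it deviates from the listing in one place: shutdown() also interrupts the thread. As far as the listing shows, the original only sets the closed flag, so a thread blocked in take() would notice it only after another task arrives.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical single-threaded queue worker in the spirit of TaskExecuteWorker.
final class QueueWorkerSketch {

    private final BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(16);
    private final AtomicBoolean closed = new AtomicBoolean(false);
    private final Thread thread;

    QueueWorkerSketch(String name) {
        thread = new Thread(this::consume, name);
        thread.setDaemon(true);
        thread.start();
    }

    // Blocks while the queue is full, like queue.put() in the original.
    boolean submit(Runnable task) {
        try {
            queue.put(task);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Unlike the listing, this also interrupts the thread so a blocked take() returns.
    void shutdown() {
        closed.set(true);
        queue.clear();
        thread.interrupt();
    }

    private void consume() {
        while (!closed.get()) {
            try {
                queue.take().run();
            } catch (InterruptedException e) {
                return;
            } catch (RuntimeException e) {
                // A failing task must not kill the worker thread.
            }
        }
    }

    // Submits three tasks and returns the order in which they ran.
    static List<Integer> demo() {
        QueueWorkerSketch worker = new QueueWorkerSketch("sketch-worker");
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch done = new CountDownLatch(3);
        for (int i = 1; i <= 3; i++) {
            final int n = i;
            worker.submit(() -> {
                seen.add(n);
                done.countDown();
            });
        }
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        worker.shutdown();
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [1, 2, 3]
    }
}
```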

DistroSyncChangeTask

The Distro sync-change task; it sends local data to other nodes.

public class DistroSyncChangeTask extends AbstractDistroExecuteTask {
    
    // This task's operation type is CHANGE
    private static final DataOperation OPERATION = DataOperation.CHANGE;
    
    public DistroSyncChangeTask(DistroKey distroKey, DistroComponentHolder distroComponentHolder) {
        super(distroKey, distroComponentHolder);
    }
    
    @Override
    protected DataOperation getDataOperation() {
        return OPERATION;
    }
    
	/**
     * Runs the task without a callback.
     * @return true if the sync succeeded or there was no data to sync, otherwise false
     */
    @Override
    protected boolean doExecute() {
        // Resource type of the data to sync
        String type = getDistroKey().getResourceType();
        // Fetch the data to sync
        DistroData distroData = getDistroData(type);
        if (null == distroData) {
            Loggers.DISTRO.warn("[DISTRO] {} with null data to sync, skip", toString());
            return true;
        }
        // Sync the data through the DistroTransportAgent
        return getDistroComponentHolder().findTransportAgent(type).syncData(distroData, getDistroKey().getTargetServer());
    }
    
	/**
     * Runs the task with a callback.
     * @param callback callback
     */
    @Override
    protected void doExecuteWithCallback(DistroCallback callback) {
        String type = getDistroKey().getResourceType();
        DistroData distroData = getDistroData(type);
        if (null == distroData) {
            Loggers.DISTRO.warn("[DISTRO] {} with null data to sync, skip", toString());
            return;
        }
        getDistroComponentHolder().findTransportAgent(type).syncData(distroData, getDistroKey().getTargetServer(), callback);
    }
    
    @Override
    public String toString() {
        return "DistroSyncChangeTask for " + getDistroKey().toString();
    }
    
    private DistroData getDistroData(String type) {
        DistroData result = getDistroComponentHolder().findDataStorage(type).getDistroData(getDistroKey());
        if (null != result) {
            result.setType(OPERATION);
        }
        return result;
    }
}

DistroSyncDeleteTask

The Distro sync-delete task; it tells other nodes about data that was deleted locally.

public class DistroSyncDeleteTask extends AbstractDistroExecuteTask {
    
    // This task's operation type is DELETE
    private static final DataOperation OPERATION = DataOperation.DELETE;
    
    public DistroSyncDeleteTask(DistroKey distroKey, DistroComponentHolder distroComponentHolder) {
        super(distroKey, distroComponentHolder);
    }
    
    @Override
    protected DataOperation getDataOperation() {
        return OPERATION;
    }
    
	/**
     * Runs the task without a callback.
     * @return the result reported by the transport agent
     */
    @Override
    protected boolean doExecute() {
        // Build the request payload
        String type = getDistroKey().getResourceType();
        DistroData distroData = new DistroData();
        distroData.setDistroKey(getDistroKey());
        distroData.setType(OPERATION);
        // Sync the data through the DistroTransportAgent
        return getDistroComponentHolder().findTransportAgent(type).syncData(distroData, getDistroKey().getTargetServer());
    }
    
	/**
     * Runs the task with a callback.
     * @param callback callback
     */
    @Override
    protected void doExecuteWithCallback(DistroCallback callback) {
        String type = getDistroKey().getResourceType();
        DistroData distroData = new DistroData();
        distroData.setDistroKey(getDistroKey());
        distroData.setType(OPERATION);
        getDistroComponentHolder().findTransportAgent(type).syncData(distroData, getDistroKey().getTargetServer(), callback);
    }
    
    @Override
    public String toString() {
        return "DistroSyncDeleteTask for " + getDistroKey().toString();
    }
}

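Tip:

  • DistroSyncChangeTask sends the local service data identified by its DistroKey to the target node
  • DistroSyncDeleteTask tells the target node that the data identified by its DistroKey was deleted locally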

com.alibaba.nacos.core.distributed.distro.task.load

DistroLoadDataTask

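The Distro full-data load task. After a node starts, it is used to pull service data from the other nodes to the current node for the first time.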

public class DistroLoadDataTask implements Runnable {

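    // Cluster member manager
    private final ServerMemberManager memberManager;
    // Distro component holder
    private final DistroComponentHolder distroComponentHolder;
    // Distro configuration
    private final DistroConfig distroConfig;
    // Callback invoked when loading finishes
    private final DistroCallback loadCallback;
    // Load status per data type (true once that type has been loaded)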
    private final Map<String, Boolean> loadCompletedMap;

    public DistroLoadDataTask(ServerMemberManager memberManager, DistroComponentHolder distroComponentHolder, DistroConfig distroConfig, DistroCallback loadCallback) {
        this.memberManager = memberManager;
        this.distroComponentHolder = distroComponentHolder;
        this.distroConfig = distroConfig;
        this.loadCallback = loadCallback;
        loadCompletedMap = new HashMap<>(1);
    }

    @Override
    public void run() {
        try {
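            // First load attempt
            load();
            // If not every data type has been loaded yet, try again later
            if (!checkCompleted()) {
                // Resubmit this task so it runs again after the configured retry delay
                GlobalExecutor.submitLoadDataTask(this, distroConfig.getLoadDataRetryDelayMillis());
            } else {
                // Fire the callback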
                loadCallback.onSuccess();
                Loggers.DISTRO.info("[DISTRO-INIT] load snapshot data success");
            }
        } catch (Exception e) {
            loadCallback.onFailed(e);
            Loggers.DISTRO.error("[DISTRO-INIT] load snapshot data failed. ", e);
        }
    }

    private void load() throws Exception {
        // If there are no members other than this node, sleep for 1 second; the other nodes may not have started yet
        while (memberManager.allMembersWithoutSelf().isEmpty()) {
            Loggers.DISTRO.info("[DISTRO-INIT] waiting server list init...");
            TimeUnit.SECONDS.sleep(1);
        }
		// 若数据类型为空,说明distroComponentHolder的组件注册器还未初始化完毕(v1版本为DistroHttpRegistry, v2版本为DistroClientComponentRegistry)
        while (distroComponentHolder.getDataStorageTypes().isEmpty()) {
            Loggers.DISTRO.info("[DISTRO-INIT] waiting distro data storage register...");
            TimeUnit.SECONDS.sleep(1);
        }
		// 加载每个类型的数据
        for (String each : distroComponentHolder.getDataStorageTypes()) {
            if (!loadCompletedMap.containsKey(each) || !loadCompletedMap.get(each)) {
				// 调用加载方法,并标记已处理
                loadCompletedMap.put(each, loadAllDataSnapshotFromRemote(each));
            }
        }
    }

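    /**
     * Fetch the snapshot data of one resource type from the other nodes.
     *
     * @param resourceType the type of data to load
     * @return true once a snapshot from any node has been processed successfully
     */
    private boolean loadAllDataSnapshotFromRemote(String resourceType) {
        // Look up the transport agent
        DistroTransportAgent transportAgent = distroComponentHolder.findTransportAgent(resourceType);
        // Look up the data processor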
        DistroDataProcessor dataProcessor = distroComponentHolder.findDataProcessor(resourceType);
        if (null == transportAgent || null == dataProcessor) {
            Loggers.DISTRO.warn("[DISTRO-INIT] Can't find component for type {}, transportAgent: {}, dataProcessor: {}", resourceType, transportAgent, dataProcessor);
            return false;
        }
        // Request the data from each node in turn
        for (Member each : memberManager.allMembersWithoutSelf()) {
            try {
                Loggers.DISTRO.info("[DISTRO-INIT] load snapshot {} from {}", resourceType, each.getAddress());
                // Fetch the snapshot
                DistroData distroData = transportAgent.getDatumSnapshot(each.getAddress());
                // Process the snapshot
                boolean result = dataProcessor.processSnapshot(distroData);
                Loggers.DISTRO.info("[DISTRO-INIT] load snapshot {} from {} result: {}", resourceType, each.getAddress(), result);
                // On success, mark this data type as loaded and stop asking the remaining nodes
                if (result) {
                    distroComponentHolder.findDataStorage(resourceType).finishInitial();
                    return true;
                }
            } catch (Exception e) {
                Loggers.DISTRO.error("[DISTRO-INIT] load snapshot {} from {} failed.", resourceType, each.getAddress(), e);
            }
        }
        return false;
    }

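    // Check whether loading has completed
    private boolean checkCompleted() {
        // If the number of registered data types differs from the number of recorded results, loading cannot be complete
        if (distroComponentHolder.getDataStorageTypes().size() != loadCompletedMap.size()) {
            return false;
        }
        // A false entry means that type failed (for example, the snapshot could not be processed) and must be loaded again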
        for (Boolean each : loadCompletedMap.values()) {
            if (!each) {
                return false;
            }
        }
        return true;
    }
}
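
The completion bookkeeping in DistroLoadDataTask can be tried out in isolation. The following is a minimal sketch, not Nacos code: LoadProgress, its methods, and the type names typeA and typeB are hypothetical, and the remote snapshot call is replaced by a Predicate so that the retry behaviour runs locally.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical, simplified stand-in for the completion bookkeeping in DistroLoadDataTask. */
class LoadProgress {

    private final Set<String> types;

    private final Map<String, Boolean> completed = new HashMap<>();

    LoadProgress(Set<String> types) {
        this.types = new LinkedHashSet<>(types);
    }

    /** One pass: try every type that has not succeeded yet and record the outcome. */
    void loadOnce(Predicate<String> loader) {
        for (String type : types) {
            if (!completed.getOrDefault(type, false)) {
                completed.put(type, loader.test(type));
            }
        }
    }

    /** Complete only when every registered type has an entry and every entry is true. */
    boolean isCompleted() {
        if (types.size() != completed.size()) {
            return false;
        }
        for (Boolean each : completed.values()) {
            if (!each) {
                return false;
            }
        }
        return true;
    }
}

class LoadProgressDemo {

    public static void main(String[] args) {
        Set<String> types = new LinkedHashSet<>();
        types.add("typeA");
        types.add("typeB");
        LoadProgress progress = new LoadProgress(types);

        // First pass: pretend only typeA can be loaded.
        progress.loadOnce("typeA"::equals);
        System.out.println(progress.isCompleted()); // prints false

        // Retry pass: typeA is skipped because it already succeeded, typeB now succeeds.
        progress.loadOnce(type -> true);
        System.out.println(progress.isCompleted()); // prints true
    }
}
```

In the real task the retry pass is not a direct call; the task resubmits itself to the executor with the delay returned by getLoadDataRetryDelayMillis().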

com.alibaba.nacos.core.distributed.distro.task.verify

DistroVerifyExecuteTask

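The executor for Distro data verification tasks. It sends the other nodes a status report for each Client the current node is responsible for, telling them that the Client is still being served normally. It works at the granularity of DistroData.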

/**
 * Execute distro verify task.
 * Runs the Distro data verification, sending one RPC request per DistroData (asynchronous when the transport supports callbacks).
 * @author xiweng.yy
 */
public class DistroVerifyExecuteTask extends AbstractExecuteTask {

    /**
     * Transport agent for the data being verified.
     */
    private final DistroTransportAgent transportAgent;

    /**
     * The data to verify.
     */
    private final List<DistroData> verifyData;

    /**
     * Target node.
     */
    private final String targetServer;

    /**
     * Type of the data being verified.
     */
    private final String resourceType;

    public DistroVerifyExecuteTask(DistroTransportAgent transportAgent, List<DistroData> verifyData,
            String targetServer, String resourceType) {
        this.transportAgent = transportAgent;
        this.verifyData = verifyData;
        this.targetServer = targetServer;
        this.resourceType = resourceType;
    }

    @Override
    public void run() {
        for (DistroData each : verifyData) {
            try {
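                // Check whether the transport agent supports callbacks (the HTTP one does not; in practice it makes little difference, because in 2.0.1 the callback only writes debug logs)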
                if (transportAgent.supportCallbackTransport()) {
                    doSyncVerifyDataWithCallback(each);
                } else {
                    doSyncVerifyData(each);
                }
            } catch (Exception e) {
                Loggers.DISTRO
                        .error("[DISTRO-FAILED] verify data for type {} to {} failed.", resourceType, targetServer, e);
            }
        }
    }

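    /**
     * Verify data through a transport that supports callbacks.
     *
     * @param data the data to verify
     */
    private void doSyncVerifyDataWithCallback(DistroData data) {
        // The callback does very little; it is practically an empty object
        transportAgent.syncVerifyData(data, targetServer, new DistroVerifyCallback());
    }

    /**
     * Verify data through a transport that does not support callbacks.
     *
     * @param data the data to verify
     */
    private void doSyncVerifyData(DistroData data) {
        transportAgent.syncVerifyData(data, targetServer);
    }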

    /**
     * TODO add verify monitor.
     */
    private class DistroVerifyCallback implements DistroCallback {

        @Override
        public void onSuccess() {
            if (Loggers.DISTRO.isDebugEnabled()) {
                Loggers.DISTRO.debug("[DISTRO] verify data for type {} to {} success", resourceType, targetServer);
            }
        }

        @Override
        public void onFailed(Throwable throwable) {
            if (Loggers.DISTRO.isDebugEnabled()) {
                Loggers.DISTRO
                        .debug("[DISTRO-FAILED] verify data for type {} to {} failed.", resourceType, targetServer,
                                throwable);
            }
        }
    }
}

DistroVerifyTimedTask

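The timed verification task. With the default configuration it starts after a 5-second delay and then runs every 5 seconds. Its main job is to create a DistroVerifyExecuteTask for each node (one per data type). It works at the granularity of Member.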

/**
 * Timed to start distro verify task.
 * Kicks off the Distro data verification flow.
 * @author xiweng.yy
 */
public class DistroVerifyTimedTask implements Runnable {

    private final ServerMemberManager serverMemberManager;

    private final DistroComponentHolder distroComponentHolder;

    private final DistroExecuteTaskExecuteEngine executeTaskExecuteEngine;

    public DistroVerifyTimedTask(ServerMemberManager serverMemberManager, DistroComponentHolder distroComponentHolder,
            DistroExecuteTaskExecuteEngine executeTaskExecuteEngine) {
        this.serverMemberManager = serverMemberManager;
        this.distroComponentHolder = distroComponentHolder;
        this.executeTaskExecuteEngine = executeTaskExecuteEngine;
    }

    @Override
    public void run() {
        try {
            List<Member> targetServer = serverMemberManager.allMembersWithoutSelf();
            if (Loggers.DISTRO.isDebugEnabled()) {
                Loggers.DISTRO.debug("server list is: {}", targetServer);
            }
            for (String each : distroComponentHolder.getDataStorageTypes()) {
                verifyForDataStorage(each, targetServer);
            }
        } catch (Exception e) {
            Loggers.DISTRO.error("[DISTRO-FAILED] verify task failed.", e);
        }
    }

    private void verifyForDataStorage(String type, List<Member> targetServer) {
        DistroDataStorage dataStorage = distroComponentHolder.findDataStorage(type);
        if (!dataStorage.isFinishInitial()) {
            Loggers.DISTRO.warn("data storage {} has not finished initial step, do not send verify data",
                    dataStorage.getClass().getSimpleName());
            return;
        }
        List<DistroData> verifyData = dataStorage.getVerifyData();
        if (null == verifyData || verifyData.isEmpty()) {
            return;
        }
        for (Member member : targetServer) {
            DistroTransportAgent agent = distroComponentHolder.findTransportAgent(type);
            if (null == agent) {
                continue;
            }
            executeTaskExecuteEngine.addTask(member.getAddress() + type,
                    new DistroVerifyExecuteTask(agent, verifyData, member.getAddress(), type));
        }
    }
}
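
The fan-out performed by verifyForDataStorage can be sketched without any Nacos classes. This is only an illustration under stated assumptions: VerifyFanOut and planTasks are hypothetical names, members and verify data are reduced to plain strings, and the returned map stands in for the tasks handed to the execute engine. The key format (member address followed by the type) is the one used in the code above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of how the timed task fans out into per-member, per-type execute tasks. */
class VerifyFanOut {

    /**
     * Returns a map from task key (member address + type) to the verify data that task would send.
     * Types without verify data produce no tasks.
     */
    static Map<String, List<String>> planTasks(List<String> memberAddresses,
            Map<String, List<String>> verifyDataByType) {
        Map<String, List<String>> tasks = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : verifyDataByType.entrySet()) {
            List<String> verifyData = entry.getValue();
            if (verifyData == null || verifyData.isEmpty()) {
                continue;
            }
            for (String address : memberAddresses) {
                tasks.put(address + entry.getKey(), new ArrayList<>(verifyData));
            }
        }
        return tasks;
    }

    public static void main(String[] args) {
        List<String> members = Arrays.asList("10.0.0.2:8848", "10.0.0.3:8848");
        Map<String, List<String>> verifyDataByType = new LinkedHashMap<>();
        verifyDataByType.put("typeA", Arrays.asList("client-1", "client-2"));
        verifyDataByType.put("typeB", Collections.<String>emptyList());

        Map<String, List<String>> tasks = planTasks(members, verifyDataByType);
        System.out.println(tasks.keySet()); // prints [10.0.0.2:8848typeA, 10.0.0.3:8848typeA]
    }
}
```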

DistroConfig

Configuration for the Distro protocol.

public class DistroConfig {
    
    private static final DistroConfig INSTANCE = new DistroConfig();
    // Delay before a sync task runs (milliseconds)
    private long syncDelayMillis = DistroConstants.DEFAULT_DATA_SYNC_DELAY_MILLISECONDS;
    // Timeout of a sync task (milliseconds)
    private long syncTimeoutMillis = DistroConstants.DEFAULT_DATA_SYNC_TIMEOUT_MILLISECONDS;
    // Delay before a failed sync task is retried (milliseconds)
    private long syncRetryDelayMillis = DistroConstants.DEFAULT_DATA_SYNC_RETRY_DELAY_MILLISECONDS;
    // Interval between verify task runs (milliseconds)
    private long verifyIntervalMillis = DistroConstants.DEFAULT_DATA_VERIFY_INTERVAL_MILLISECONDS;
    // Timeout of a verify task (milliseconds)
    private long verifyTimeoutMillis = DistroConstants.DEFAULT_DATA_VERIFY_TIMEOUT_MILLISECONDS;
    // Delay before the initial data load is retried (milliseconds)
    private long loadDataRetryDelayMillis = DistroConstants.DEFAULT_DATA_LOAD_RETRY_DELAY_MILLISECONDS;
    
    private DistroConfig() {
        try {
            // Try to read the configuration from the environment
            getDistroConfigFromEnv();
        } catch (Exception e) {
            Loggers.CORE.warn("Get Distro config from env failed, will use default value", e);
        }
    }
    
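    /**
     * Read the configuration from the environment, falling back to the default value when a key is absent.
     */
    private void getDistroConfigFromEnv() {
        // Keys and default values both come from DistroConstants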
        syncDelayMillis = EnvUtil.getProperty(DistroConstants.DATA_SYNC_DELAY_MILLISECONDS, Long.class,
                DistroConstants.DEFAULT_DATA_SYNC_DELAY_MILLISECONDS);
        syncTimeoutMillis = EnvUtil.getProperty(DistroConstants.DATA_SYNC_TIMEOUT_MILLISECONDS, Long.class,
                DistroConstants.DEFAULT_DATA_SYNC_TIMEOUT_MILLISECONDS);
        syncRetryDelayMillis = EnvUtil.getProperty(DistroConstants.DATA_SYNC_RETRY_DELAY_MILLISECONDS, Long.class,
                DistroConstants.DEFAULT_DATA_SYNC_RETRY_DELAY_MILLISECONDS);
        verifyIntervalMillis = EnvUtil.getProperty(DistroConstants.DATA_VERIFY_INTERVAL_MILLISECONDS, Long.class,
                DistroConstants.DEFAULT_DATA_VERIFY_INTERVAL_MILLISECONDS);
        verifyTimeoutMillis = EnvUtil.getProperty(DistroConstants.DATA_VERIFY_TIMEOUT_MILLISECONDS, Long.class,
                DistroConstants.DEFAULT_DATA_VERIFY_TIMEOUT_MILLISECONDS);
        loadDataRetryDelayMillis = EnvUtil.getProperty(DistroConstants.DATA_LOAD_RETRY_DELAY_MILLISECONDS, Long.class,
                DistroConstants.DEFAULT_DATA_LOAD_RETRY_DELAY_MILLISECONDS);
    }
    
    public static DistroConfig getInstance() {
        return INSTANCE;
    }
    
    public long getSyncDelayMillis() {
        return syncDelayMillis;
    }
    
    public void setSyncDelayMillis(long syncDelayMillis) {
        this.syncDelayMillis = syncDelayMillis;
    }
    
    public long getSyncTimeoutMillis() {
        return syncTimeoutMillis;
    }
    
    public void setSyncTimeoutMillis(long syncTimeoutMillis) {
        this.syncTimeoutMillis = syncTimeoutMillis;
    }
    
    public long getSyncRetryDelayMillis() {
        return syncRetryDelayMillis;
    }
    
    public void setSyncRetryDelayMillis(long syncRetryDelayMillis) {
        this.syncRetryDelayMillis = syncRetryDelayMillis;
    }
    
    public long getVerifyIntervalMillis() {
        return verifyIntervalMillis;
    }
    
    public void setVerifyIntervalMillis(long verifyIntervalMillis) {
        this.verifyIntervalMillis = verifyIntervalMillis;
    }
    
    public long getVerifyTimeoutMillis() {
        return verifyTimeoutMillis;
    }
    
    public void setVerifyTimeoutMillis(long verifyTimeoutMillis) {
        this.verifyTimeoutMillis = verifyTimeoutMillis;
    }
    
    public long getLoadDataRetryDelayMillis() {
        return loadDataRetryDelayMillis;
    }
    
    public void setLoadDataRetryDelayMillis(long loadDataRetryDelayMillis) {
        this.loadDataRetryDelayMillis = loadDataRetryDelayMillis;
    }
}

DistroConstants

Distro constants. They mainly define the property names for the configurable task timings and their default values. See DistroConfig for how they are used.

public class DistroConstants {
    
    public static final String DATA_SYNC_DELAY_MILLISECONDS = "nacos.core.protocol.distro.data.sync.delayMs";
    
    public static final long DEFAULT_DATA_SYNC_DELAY_MILLISECONDS = 1000L;
    
    public static final String DATA_SYNC_TIMEOUT_MILLISECONDS = "nacos.core.protocol.distro.data.sync.timeoutMs";
    
    public static final long DEFAULT_DATA_SYNC_TIMEOUT_MILLISECONDS = 3000L;
    
    public static final String DATA_SYNC_RETRY_DELAY_MILLISECONDS = "nacos.core.protocol.distro.data.sync.retryDelayMs";
    
    public static final long DEFAULT_DATA_SYNC_RETRY_DELAY_MILLISECONDS = 3000L;
    
    public static final String DATA_VERIFY_INTERVAL_MILLISECONDS = "nacos.core.protocol.distro.data.verify.intervalMs";
    
    public static final long DEFAULT_DATA_VERIFY_INTERVAL_MILLISECONDS = 5000L;
    
    public static final String DATA_VERIFY_TIMEOUT_MILLISECONDS = "nacos.core.protocol.distro.data.verify.timeoutMs";
    
    public static final long DEFAULT_DATA_VERIFY_TIMEOUT_MILLISECONDS = 3000L;
    
    public static final String DATA_LOAD_RETRY_DELAY_MILLISECONDS = "nacos.core.protocol.distro.data.load.retryDelayMs";
    
    public static final long DEFAULT_DATA_LOAD_RETRY_DELAY_MILLISECONDS = 30000L;
    
}
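
The key-plus-default lookup that DistroConfig performs with these constants can be imitated with the JDK alone. This is a minimal sketch, assuming java.util.Properties as a stand-in for the Nacos environment: DistroSettingsSketch and readLong are hypothetical names, and falling back to the default on an unparsable value is an assumption of this sketch rather than documented EnvUtil behaviour. The key and default are copied from DistroConstants above.

```java
import java.util.Properties;

/** Hypothetical sketch of reading a Distro timing setting with a default value. */
class DistroSettingsSketch {

    // Key and default copied from DistroConstants as quoted above.
    static final String VERIFY_INTERVAL_KEY = "nacos.core.protocol.distro.data.verify.intervalMs";

    static final long DEFAULT_VERIFY_INTERVAL_MS = 5000L;

    /** Returns the configured value, or the default when the key is absent or not a number (assumed fallback). */
    static long readLong(Properties props, String key, long defaultValue) {
        String raw = props.getProperty(key);
        if (raw == null || raw.trim().isEmpty()) {
            return defaultValue;
        }
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Nothing configured: the default applies.
        System.out.println(readLong(props, VERIFY_INTERVAL_KEY, DEFAULT_VERIFY_INTERVAL_MS)); // prints 5000

        // An explicit value overrides the default.
        props.setProperty(VERIFY_INTERVAL_KEY, "10000");
        System.out.println(readLong(props, VERIFY_INTERVAL_KEY, DEFAULT_VERIFY_INTERVAL_MS)); // prints 10000
    }
}
```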

DistroProtocol

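The real entry point of the Distro protocol. It uses all of the components defined above to implement the protocol together. Note that it carries Spring's @Component annotation, so it is managed by the Spring container, and the protocol starts working as soon as its constructor runs.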

@Component
public class DistroProtocol {

    private Logger logger = LoggerFactory.getLogger(DistroProtocol.class);

    /**
     * Node (cluster member) manager.
     */
    private final ServerMemberManager memberManager;

    /**
     * Distro component holder.
     */
    private final DistroComponentHolder distroComponentHolder;

    /**
     * Distro task engine holder.
     */
    private final DistroTaskEngineHolder distroTaskEngineHolder;

    private volatile boolean isInitialized = false;

    public DistroProtocol(ServerMemberManager memberManager, DistroComponentHolder distroComponentHolder,
            DistroTaskEngineHolder distroTaskEngineHolder) {
        this.memberManager = memberManager;
        this.distroComponentHolder = distroComponentHolder;
        this.distroTaskEngineHolder = distroTaskEngineHolder;
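        // Start the Distro protocol
        startDistroTask();
    }

    private void startDistroTask() {
        // Standalone mode performs no data synchronization
        if (EnvUtil.getStandaloneMode()) {
            isInitialized = true;
            return;
        }
        // Start the task that reports Client status to the other nodes
        startVerifyTask();
        // Start the initial data load task
        startLoadTask();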
    }

    /**
     * Load data from the other nodes into the current node.
     */
    private void startLoadTask() {
        DistroCallback loadCallback = new DistroCallback() {
            @Override
            public void onSuccess() {
                isInitialized = true;
            }

            @Override
            public void onFailed(Throwable throwable) {
                isInitialized = false;
            }
        };
        // Submit the data load task
        GlobalExecutor.submitLoadDataTask(new DistroLoadDataTask(memberManager, distroComponentHolder, DistroConfig.getInstance(), loadCallback));
    }

    private void startVerifyTask() {
        // Start the timed task that reports data to the other nodes
        GlobalExecutor.schedulePartitionDataTimedSync(
            new DistroVerifyTimedTask(
                memberManager,
                distroComponentHolder,
                distroTaskEngineHolder.getExecuteWorkersManager()
            ),
            DistroConfig.getInstance().getVerifyIntervalMillis());
    }

    public boolean isInitialized() {
        return isInitialized;
    }

    /**
     * Start to sync by configured delay.
     *
     * @param distroKey distro key of sync data
     * @param action    the action of data operation
     */
    public void sync(DistroKey distroKey, DataOperation action) {
        sync(distroKey, action, DistroConfig.getInstance().getSyncDelayMillis());
    }

    /**
     * Start to sync data to all remote server.
     *
     * @param distroKey distro key of sync data
     * @param action    the action of data operation
     * @param delay     delay time for sync
     */
    public void sync(DistroKey distroKey, DataOperation action, long delay) {
        for (Member each : memberManager.allMembersWithoutSelf()) {
            syncToTarget(distroKey, action, each.getAddress(), delay);
        }
    }

    /**
     * Start to sync to target server.
     *
     * @param distroKey    distro key of sync data
     * @param action       the action of data operation
     * @param targetServer target server
     * @param delay        delay time for sync
     */
    public void syncToTarget(DistroKey distroKey, DataOperation action, String targetServer, long delay) {
        DistroKey distroKeyWithTarget = new DistroKey(distroKey.getResourceKey(), distroKey.getResourceType(), targetServer);
        DistroDelayTask distroDelayTask = new DistroDelayTask(distroKeyWithTarget, action, delay);
        distroTaskEngineHolder.getDelayTaskExecuteEngine().addTask(distroKeyWithTarget, distroDelayTask);
        if (Loggers.DISTRO.isDebugEnabled()) {
            Loggers.DISTRO.debug("[DISTRO-SCHEDULE] {} to {}", distroKey, targetServer);
        }
    }

    /**
     * Query data from specified server.
     *
     * @param distroKey data key
     * @return data
     */
    public DistroData queryFromRemote(DistroKey distroKey) {
        if (null == distroKey.getTargetServer()) {
            Loggers.DISTRO.warn("[DISTRO] Can't query data from empty server");
            return null;
        }
        String resourceType = distroKey.getResourceType();
        DistroTransportAgent transportAgent = distroComponentHolder.findTransportAgent(resourceType);
        if (null == transportAgent) {
            Loggers.DISTRO.warn("[DISTRO] Can't find transport agent for key {}", resourceType);
            return null;
        }
        return transportAgent.getData(distroKey, distroKey.getTargetServer());
    }

    /**
     * Receive synced distro data, find processor to process.
     *
     * @param distroData Received data
     * @return true if handle receive data successfully, otherwise false
     */
    public boolean onReceive(DistroData distroData) {
        Loggers.DISTRO.info("[DISTRO] Receive distro data type: {}, key: {}", distroData.getType(),
                distroData.getDistroKey());
        String resourceType = distroData.getDistroKey().getResourceType();
        DistroDataProcessor dataProcessor = distroComponentHolder.findDataProcessor(resourceType);
        if (null == dataProcessor) {
            Loggers.DISTRO.warn("[DISTRO] Can't find data process for received data {}", resourceType);
            return false;
        }
        return dataProcessor.processData(distroData);
    }

    /**
     * Receive verify data, find processor to process.
     *
     * @param distroData    verify data
     * @param sourceAddress source server address, might be get data from source server
     * @return true if verify data successfully, otherwise false
     */
    public boolean onVerify(DistroData distroData, String sourceAddress) {
        if (Loggers.DISTRO.isDebugEnabled()) {
            Loggers.DISTRO.debug("[DISTRO] Receive verify data type: {}, key: {}", distroData.getType(), distroData.getDistroKey());
        }
        String resourceType = distroData.getDistroKey().getResourceType();
        DistroDataProcessor dataProcessor = distroComponentHolder.findDataProcessor(resourceType);
        if (null == dataProcessor) {
            Loggers.DISTRO.warn("[DISTRO] Can't find verify data process for received data {}", resourceType);
            return false;
        }
        return dataProcessor.processVerifyData(distroData, sourceAddress);
    }

    /**
     * Query data of input distro key.
     *
     * @param distroKey key of data
     * @return data
     */
    public DistroData onQuery(DistroKey distroKey) {
        String resourceType = distroKey.getResourceType();
        DistroDataStorage distroDataStorage = distroComponentHolder.findDataStorage(resourceType);
        if (null == distroDataStorage) {
            Loggers.DISTRO.warn("[DISTRO] Can't find data storage for received key {}", resourceType);
            return new DistroData(distroKey, new byte[0]);
        }
        return distroDataStorage.getDistroData(distroKey);
    }

    /**
     * Query all datum snapshot.
     *
     * @param type datum type
     * @return all datum snapshot
     */
    public DistroData onSnapshot(String type) {
        DistroDataStorage distroDataStorage = distroComponentHolder.findDataStorage(type);
        if (null == distroDataStorage) {
            Loggers.DISTRO.warn("[DISTRO] Can't find data storage for received key {}", type);
            return new DistroData(new DistroKey("snapshot", type), new byte[0]);
        }
        return distroDataStorage.getDatumSnapshot();
    }
}

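If you have read carefully from the top to this point, a few keywords have probably stuck: Task, sync, processor, DistroData. So the Distro protocol is just data synchronization? Exactly: it synchronizes data across multiple nodes. Looking at the DistroProtocol class, it is easy to see that it reports status to the other nodes on a timer, loads data from the other nodes on first startup, syncs data to a specified node, and returns the current node's snapshot data. Because every node performs these operations, combining them is enough to achieve multi-node synchronization.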

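Distro protocol data object

Throughout the interaction, the DistroData object serves as the data carrier; it can hold arbitrary data for several operation types. Its structure is shown in the figure:

A DistroKey contains the resource identifier, the resource type, and the node associated with the resource (targetServer). Any DistroData can therefore be traced to a machine and a data type, while DataOperation defines which operation the data is meant for. As for the actual payload type, carrying it as a byte array keeps it compatible with anything, and in practice the DistroKey and the operation together determine what type it will be.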
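
The shape described above can be written down as a few plain classes. This is a minimal sketch under stated assumptions, not the Nacos definitions: the Sketch-prefixed names are hypothetical, the fields are limited to the ones this article mentions, and the operation list contains only the values that appear in the code quoted here.

```java
import java.nio.charset.StandardCharsets;

/** Operation names that appear in the code quoted in this article. */
enum SketchOperation { ADD, CHANGE, DELETE, VERIFY, SNAPSHOT, QUERY }

/** Hypothetical key: which resource, of which type, associated with which node. */
final class SketchDistroKey {

    final String resourceKey;

    final String resourceType;

    final String targetServer;

    SketchDistroKey(String resourceKey, String resourceType, String targetServer) {
        this.resourceKey = resourceKey;
        this.resourceType = resourceType;
        this.targetServer = targetServer;
    }
}

/** Hypothetical carrier: a key, the intended operation, and an opaque byte payload. */
final class SketchDistroData {

    final SketchDistroKey key;

    final SketchOperation type;

    final byte[] content;

    SketchDistroData(SketchDistroKey key, SketchOperation type, byte[] content) {
        this.key = key;
        this.type = type;
        this.content = content.clone();
    }

    /** Summarizes the data without interpreting the payload bytes. */
    String describe() {
        return type + " " + key.resourceType + "/" + key.resourceKey + " target=" + key.targetServer
                + " (" + content.length + " bytes)";
    }
}

class SketchDistroDataDemo {

    public static void main(String[] args) {
        byte[] payload = "instance list of client-1".getBytes(StandardCharsets.UTF_8);
        SketchDistroKey key = new SketchDistroKey("client-1", "typeA", "10.0.0.2:8848");
        SketchDistroData data = new SketchDistroData(key, SketchOperation.CHANGE, payload);
        System.out.println(data.describe());
    }
}
```

Because the payload is just bytes, the receiver has to pick a deserialization target from the key and the operation, which is what processData does further below.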

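Key roles in the Distro protocol

We know that DistroData is the object exchanged in the Distro protocol. Beyond it there are components that store data, components that process data, and components that send data; they cooperate to complete the whole protocol flow.

Storing DistroData

DistroDataStorage stores DistroData. It has several implementations that handle different types of data, which in practice means data from different versions.

  • v1 implementation: DistroDataStorageImpl
  • v2 implementation: DistroClientDataProcessor

Tip:
From here on, v1 and v2 are no longer called out separately; the analysis assumes the v2 implementation by default.

Data is retrieved through three methods of the DistroDataStorage interface: getDistroData(DistroKey distroKey), getDatumSnapshot(), and getVerifyData(). In v2, DistroClientDataProcessor implements the DistroDataStorage interface and provides retrieval of DistroData.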

// DistroClientDataProcessor.java

@Override
public DistroData getDistroData(DistroKey distroKey) {
	// Get the specified Client from the Client manager
	Client client = clientManager.getClient(distroKey.getResourceKey());
	if (null == client) {
		return null;
	}
	byte[] data = ApplicationUtils.getBean(Serializer.class).serialize(client.generateSyncData());
	return new DistroData(distroKey, data);
}

@Override
public DistroData getDatumSnapshot() {
	List<ClientSyncData> datum = new LinkedList<>();
	// Iterate over all Clients in the Client manager
	for (String each : clientManager.allClientId()) {
		Client client = clientManager.getClient(each);
		if (null == client || !client.isEphemeral()) {
			continue;
		}
		datum.add(client.generateSyncData());
	}
	ClientSyncDatumSnapshot snapshot = new ClientSyncDatumSnapshot();
	snapshot.setClientSyncDataList(datum);
	byte[] data = ApplicationUtils.getBean(Serializer.class).serialize(snapshot);
	return new DistroData(new DistroKey(DataOperation.SNAPSHOT.name(), TYPE), data);
}

@Override
public List<DistroData> getVerifyData() {
	List<DistroData> result = new LinkedList<>();
	// Iterate over all Clients in the Client manager
	for (String each : clientManager.allClientId()) {
		Client client = clientManager.getClient(each);
		if (null == client || !client.isEphemeral()) {
			continue;
		}
		if (clientManager.isResponsibleClient(client)) {
			// TODO add revision for client.
			DistroClientVerifyInfo verifyData = new DistroClientVerifyInfo(client.getClientId(), 0);
			DistroKey distroKey = new DistroKey(client.getClientId(), TYPE);
			DistroData data = new DistroData(distroKey,
					ApplicationUtils.getBean(Serializer.class).serialize(verifyData));
			data.setType(DataOperation.VERIFY);
			result.add(data);
		}
	}
	return result;
}

The v2 storage implementation shows that it does not store the data itself; it obtains it from ClientManager instead.
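
A storage that holds no copy of its own behaves like a view over the registry it reads. The following is a minimal sketch of that idea, assuming a plain map in place of ClientManager: SketchClient and StorageViewSketch are hypothetical names, and a client is reduced to an id and an ephemeral flag. The filter mirrors the null and isEphemeral checks in getDatumSnapshot above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, much-reduced client: just an id and an ephemeral flag. */
final class SketchClient {

    final String id;

    final boolean ephemeral;

    SketchClient(String id, boolean ephemeral) {
        this.id = id;
        this.ephemeral = ephemeral;
    }
}

/** Storage that keeps no copy of the data; every call reads the live registry. */
class StorageViewSketch {

    private final Map<String, SketchClient> registry;

    StorageViewSketch(Map<String, SketchClient> registry) {
        this.registry = registry;
    }

    /** Collects the ids of all ephemeral clients currently in the registry. */
    List<String> snapshotIds() {
        List<String> ids = new ArrayList<>();
        for (SketchClient client : registry.values()) {
            if (client == null || !client.ephemeral) {
                continue;
            }
            ids.add(client.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<String, SketchClient> registry = new LinkedHashMap<>();
        StorageViewSketch storage = new StorageViewSketch(registry);

        registry.put("client-1", new SketchClient("client-1", true));
        registry.put("client-2", new SketchClient("client-2", false));
        System.out.println(storage.snapshotIds()); // prints [client-1]

        // A later registration shows up without touching the storage object.
        registry.put("client-3", new SketchClient("client-3", true));
        System.out.println(storage.snapshotIds()); // prints [client-1, client-3]
    }
}
```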

Processing DistroData

DistroDataProcessor processes DistroData. Processing happens in three methods: processData(DistroData distroData), processVerifyData(DistroData distroData, String sourceAddress), and processSnapshot(DistroData distroData). In v2, DistroClientDataProcessor implements the DistroDataProcessor interface and provides the ability to process DistroData.

// DistroClientDataProcessor.java

@Override
public boolean processData(DistroData distroData) {
	switch (distroData.getType()) {
		case ADD:
		case CHANGE:
			ClientSyncData clientSyncData = ApplicationUtils.getBean(Serializer.class).deserialize(distroData.getContent(), ClientSyncData.class);
			handlerClientSyncData(clientSyncData);
			return true;
		case DELETE:
			String deleteClientId = distroData.getDistroKey().getResourceKey();
			Loggers.DISTRO.info("[Client-Delete] Received distro client sync data {}", deleteClientId);
			clientManager.clientDisconnected(deleteClientId);
			return true;
		default:
			return false;
	}
}

@Override
public boolean processVerifyData(DistroData distroData, String sourceAddress) {
	DistroClientVerifyInfo verifyData = ApplicationUtils.getBean(Serializer.class).deserialize(distroData.getContent(), DistroClientVerifyInfo.class);
	if (clientManager.verifyClient(verifyData.getClientId())) {
		return true;
	}
	Loggers.DISTRO.info("client {} is invalid, get new client from {}", verifyData.getClientId(), sourceAddress);
	return false;
}

@Override
public boolean processSnapshot(DistroData distroData) {
	ClientSyncDatumSnapshot snapshot = ApplicationUtils.getBean(Serializer.class).deserialize(distroData.getContent(), ClientSyncDatumSnapshot.class);
	for (ClientSyncData each : snapshot.getClientSyncDataList()) {
		handlerClientSyncData(each);
	}
	return true;
}
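
The dispatch in processData follows a simple pattern: additions and changes share one branch, deletions get their own, and anything else is reported as unhandled. Here is a minimal sketch of that pattern, assuming an in-memory map in place of ClientManager; Op, ProcessDispatchSketch, and the string payload are hypothetical simplifications.

```java
import java.util.HashMap;
import java.util.Map;

/** Operation names that appear in the code quoted in this article. */
enum Op { ADD, CHANGE, DELETE, VERIFY, SNAPSHOT, QUERY }

/** Hypothetical sketch of dispatching received data by operation type. */
class ProcessDispatchSketch {

    private final Map<String, String> clients = new HashMap<>();

    /** Returns true when the operation was handled, false for operations this method does not cover. */
    boolean process(Op op, String clientId, String payload) {
        switch (op) {
            case ADD:
            case CHANGE:
                // Both operations upsert the client's data.
                clients.put(clientId, payload);
                return true;
            case DELETE:
                clients.remove(clientId);
                return true;
            default:
                return false;
        }
    }

    boolean contains(String clientId) {
        return clients.containsKey(clientId);
    }

    public static void main(String[] args) {
        ProcessDispatchSketch processor = new ProcessDispatchSketch();
        System.out.println(processor.process(Op.ADD, "client-1", "v1"));    // prints true
        System.out.println(processor.process(Op.VERIFY, "client-1", ""));   // prints false
        System.out.println(processor.process(Op.DELETE, "client-1", null)); // prints true
        System.out.println(processor.contains("client-1"));                 // prints false
    }
}
```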

Sending DistroData

DistroTransportAgent transports DistroData. In v2, DistroClientTransportAgent implements the DistroTransportAgent interface and provides the ability to send DistroData.

/**
 * DistroData transport agent for v2.
 * @author xiweng.yy
 */
public class DistroClientTransportAgent implements DistroTransportAgent {

    private final ClusterRpcClientProxy clusterRpcClientProxy;

    private final ServerMemberManager memberManager;

    public DistroClientTransportAgent(ClusterRpcClientProxy clusterRpcClientProxy,
            ServerMemberManager serverMemberManager) {
        this.clusterRpcClientProxy = clusterRpcClientProxy;
        this.memberManager = serverMemberManager;
    }

    /**
     * This implementation supports callbacks.
     * @return
     */
    @Override
    public boolean supportCallbackTransport() {
        return true;
    }

    /**
     * Send sync data to the specified node.
     * @param data         data
     * @param targetServer target server
     * @return
     */
    @Override
    public boolean syncData(DistroData data, String targetServer) {
        if (isNoExistTarget(targetServer)) {
            return true;
        }
        DistroDataRequest request = new DistroDataRequest(data, data.getType());
        Member member = memberManager.find(targetServer);
        if (checkTargetServerStatusUnhealthy(member)) {
            Loggers.DISTRO.warn("[DISTRO] Cancel distro sync caused by target server {} unhealthy", targetServer);
            return false;
        }
        try {
            Response response = clusterRpcClientProxy.sendRequest(member, request);
            return checkResponse(response);
        } catch (NacosException e) {
            Loggers.DISTRO.error("[DISTRO-FAILED] Sync distro data failed! ", e);
        }
        return false;
    }

    /**
     * Send sync data to the specified node (with callback).
     * @param data         data
     * @param targetServer target server
     * @param callback     callback
     */
    @Override
    public void syncData(DistroData data, String targetServer, DistroCallback callback) {
        if (isNoExistTarget(targetServer)) {
            callback.onSuccess();
        }
        DistroDataRequest request = new DistroDataRequest(data, data.getType());
        Member member = memberManager.find(targetServer);
        try {
            clusterRpcClientProxy.asyncRequest(member, request, new DistroRpcCallbackWrapper(callback, member));
        } catch (NacosException nacosException) {
            callback.onFailed(nacosException);
        }
    }

    /**
     * Send verify data to the specified node.
     * @param verifyData   verify data
     * @param targetServer target server
     * @return
     */
    @Override
    public boolean syncVerifyData(DistroData verifyData, String targetServer) {
        if (isNoExistTarget(targetServer)) {
            return true;
        }
        // replace target server as self server so that can callback.
        verifyData.getDistroKey().setTargetServer(memberManager.getSelf().getAddress());
        DistroDataRequest request = new DistroDataRequest(verifyData, DataOperation.VERIFY);
        Member member = memberManager.find(targetServer);
        if (checkTargetServerStatusUnhealthy(member)) {
            Loggers.DISTRO.warn("[DISTRO] Cancel distro verify caused by target server {} unhealthy", targetServer);
            return false;
        }
        try {
            Response response = clusterRpcClientProxy.sendRequest(member, request);
            return checkResponse(response);
        } catch (NacosException e) {
            Loggers.DISTRO.error("[DISTRO-FAILED] Verify distro data failed! ", e);
        }
        return false;
    }

    /**
     * Send verify data to the specified node (with callback).
     * @param verifyData   verify data
     * @param targetServer target server
     * @param callback     callback
     */
    @Override
    public void syncVerifyData(DistroData verifyData, String targetServer, DistroCallback callback) {
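        // If the target is not in this node's member cache it may be offline or expired, so it needs no verification
        // and the callback is completed; note that the code as quoted has no return here, so execution continues below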
        if (isNoExistTarget(targetServer)) {
            callback.onSuccess();
        }
        // Build the request object
        DistroDataRequest request = new DistroDataRequest(verifyData, DataOperation.VERIFY);
        Member member = memberManager.find(targetServer);
        try {
            // Create a callback object (the wrapper implements the RequestCallBack interface)
            DistroVerifyCallbackWrapper wrapper = new DistroVerifyCallbackWrapper(targetServer,
                    verifyData.getDistroKey().getResourceKey(), callback, member);
            // Send the request asynchronously through the cluster RPC client proxy
            clusterRpcClientProxy.asyncRequest(member, request, wrapper);
        } catch (NacosException nacosException) {
            callback.onFailed(nacosException);
        }
    }

    /**
     * Get data from the specified node.
     * @param key          key of data
     * @param targetServer target server
     * @return
     */
    @Override
    public DistroData getData(DistroKey key, String targetServer) {
        Member member = memberManager.find(targetServer);
        if (checkTargetServerStatusUnhealthy(member)) {
            throw new DistroException(
                    String.format("[DISTRO] Cancel get snapshot caused by target server %s unhealthy", targetServer));
        }
        DistroDataRequest request = new DistroDataRequest();
        DistroData distroData = new DistroData();
        distroData.setDistroKey(key);
        distroData.setType(DataOperation.QUERY);
        request.setDistroData(distroData);
        request.setDataOperation(DataOperation.QUERY);
        try {
            Response response = clusterRpcClientProxy.sendRequest(member, request);
            if (checkResponse(response)) {
                return ((DistroDataResponse) response).getDistroData();
            } else {
                throw new DistroException(
                        String.format("[DISTRO-FAILED] Get data request to %s failed, code: %d, message: %s",
                                targetServer, response.getErrorCode(), response.getMessage()));
            }
        } catch (NacosException e) {
            throw new DistroException("[DISTRO-FAILED] Get distro data failed! ", e);
        }
    }

    /**
     * Get snapshot data from the specified node.
     * @param targetServer target server.
     * @return
     */
    @Override
    public DistroData getDatumSnapshot(String targetServer) {
        Member member = memberManager.find(targetServer);
        if (checkTargetServerStatusUnhealthy(member)) {
            throw new DistroException(
                    String.format("[DISTRO] Cancel get snapshot caused by target server %s unhealthy", targetServer));
        }
        DistroDataRequest request = new DistroDataRequest();
        request.setDataOperation(DataOperation.SNAPSHOT);
        try {
            Response response = clusterRpcClientProxy.sendRequest(member, request);
            if (checkResponse(response)) {
                return ((DistroDataResponse) response).getDistroData();
            } else {
                throw new DistroException(
                        String.format("[DISTRO-FAILED] Get snapshot request to %s failed, code: %d, message: %s",
                                targetServer, response.getErrorCode(), response.getMessage()));
            }
        } catch (NacosException e) {
            throw new DistroException("[DISTRO-FAILED] Get distro snapshot failed! ", e);
        }
    }

    private boolean isNoExistTarget(String target) {
        return !memberManager.hasMember(target);
    }

    private boolean checkTargetServerStatusUnhealthy(Member member) {
        return null == member || !NodeState.UP.equals(member.getState());
    }

    private boolean checkResponse(Response response) {
        return ResponseCode.SUCCESS.getCode() == response.getResultCode();
    }

    /**
     * Callback wrapper for RPC sync requests.
     */
    private class DistroRpcCallbackWrapper implements RequestCallBack<Response> {

        private final DistroCallback distroCallback;

        private final Member member;

        public DistroRpcCallbackWrapper(DistroCallback distroCallback, Member member) {
            this.distroCallback = distroCallback;
            this.member = member;
        }

        @Override
        public Executor getExecutor() {
            return GlobalExecutor.getCallbackExecutor();
        }

        @Override
        public long getTimeout() {
            return DistroConfig.getInstance().getSyncTimeoutMillis();
        }

        @Override
        public void onResponse(Response response) {
            if (checkResponse(response)) {
                NamingTpsMonitor.distroSyncSuccess(member.getAddress(), member.getIp());
                distroCallback.onSuccess();
            } else {
                NamingTpsMonitor.distroSyncFail(member.getAddress(), member.getIp());
                distroCallback.onFailed(null);
            }
        }

        @Override
        public void onException(Throwable e) {
            distroCallback.onFailed(e);
        }
    }

    /**
     * Callback wrapper for data verification requests.
     */
    private class DistroVerifyCallbackWrapper implements RequestCallBack<Response> {

        private final String targetServer;

        private final String clientId;

        private final DistroCallback distroCallback;

        private final Member member;

        private DistroVerifyCallbackWrapper(String targetServer, String clientId, DistroCallback distroCallback,
                Member member) {
            this.targetServer = targetServer;
            this.clientId = clientId;
            this.distroCallback = distroCallback;
            this.member = member;
        }

        @Override
        public Executor getExecutor() {
            return GlobalExecutor.getCallbackExecutor();
        }

        @Override
        public long getTimeout() {
            return DistroConfig.getInstance().getVerifyTimeoutMillis();
        }

        @Override
        public void onResponse(Response response) {
            if (checkResponse(response)) {
                NamingTpsMonitor.distroVerifySuccess(member.getAddress(), member.getIp());
                distroCallback.onSuccess();
            } else {
                Loggers.DISTRO.info("Target {} verify client {} failed, sync new client", targetServer, clientId);
                // Publish an event after verification fails.
                NotifyCenter.publishEvent(new ClientEvent.ClientVerifyFailedEvent(clientId, targetServer));
                NamingTpsMonitor.distroVerifyFail(member.getAddress(), member.getIp());
                distroCallback.onFailed(null);
            }
        }

        @Override
        public void onException(Throwable e) {
            distroCallback.onFailed(e);
        }
    }
}


Whatever the type of the data-sync operation, the data is ultimately sent through the ClusterRpcClientProxy object.
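The two callback wrappers in the class above share one idea: they translate an RPC-level result into the two-method DistroCallback contract. The following is only a minimal sketch of that adapter pattern, not Nacos code. The names `SimpleDistroCallback`, `SimpleResponse`, `RpcCallbackAdapter` and `RecordingCallback` are hypothetical stand-ins, and treating 200 as the success code is an assumption made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for the Nacos types; this is not the real API.
class CallbackWrapperSketch {

    /** Mirrors the shape of the DistroCallback interface shown earlier. */
    interface SimpleDistroCallback {
        void onSuccess();

        void onFailed(Throwable throwable);
    }

    /** Hypothetical response that carries only a result code (200 = success is an assumption). */
    static final class SimpleResponse {
        static final int SUCCESS = 200;

        final int resultCode;

        SimpleResponse(int resultCode) {
            this.resultCode = resultCode;
        }
    }

    /** Adapts RPC-level events to the protocol-level callback. */
    static final class RpcCallbackAdapter {
        private final SimpleDistroCallback delegate;

        RpcCallbackAdapter(SimpleDistroCallback delegate) {
            this.delegate = delegate;
        }

        void onResponse(SimpleResponse response) {
            if (response.resultCode == SimpleResponse.SUCCESS) {
                delegate.onSuccess();
            } else {
                // A rejected response has no exception attached, so null is passed on.
                delegate.onFailed(null);
            }
        }

        void onException(Throwable e) {
            delegate.onFailed(e);
        }
    }

    /** Records every callback invocation so the behaviour can be inspected. */
    static final class RecordingCallback implements SimpleDistroCallback {
        final List<String> events = new ArrayList<>();

        @Override
        public void onSuccess() {
            events.add("success");
        }

        @Override
        public void onFailed(Throwable throwable) {
            events.add(throwable == null ? "failed:null" : "failed:" + throwable.getMessage());
        }
    }

    /** Drives the adapter through a success, a rejected response and an exception. */
    static List<String> demo() {
        RecordingCallback recorder = new RecordingCallback();
        RpcCallbackAdapter adapter = new RpcCallbackAdapter(recorder);
        adapter.onResponse(new SimpleResponse(SimpleResponse.SUCCESS));
        adapter.onResponse(new SimpleResponse(500));
        adapter.onException(new IllegalStateException("timeout"));
        return recorder.events;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [success, failed:null, failed:timeout]
    }
}
```

Note that a rejected response reaches `onFailed` with a null throwable, just as in the wrappers above, so any callback implementation has to tolerate null there.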

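The three components above are a key part of the Distro protocol implementation, and all the Distro protocol logic discussed later revolves around them.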

posted @ 2021-07-22 00:44  不会发芽的种子