dremio 的 telemetry 处理简单说明

dremio 实际上添加了支持基于opentelemetry 的监控处理,代码处理上官方是已经支持了metrics ,当然也是支持trace的

参考处理机制

处理上基于了配置,同时利用了jackson 多太类型处理,以及基于spi模块加载机制

  • 配置加载

包装了TelemetryConfigurator 类,同时还支持reload

public class TelemetryConfigurator implements AutoCloseable {
 
  private final List<MetricsConfigurator> metricsConfigs;
  private final RefreshConfiguration refreshConfig;
  private final TracerConfigurator traceConfig;
 // 可以看出同时支持metrics 以及trace 
  @JsonCreator
  public TelemetryConfigurator(
    @JsonProperty("auto-reload") RefreshConfiguration refreshConfig,
    @JsonProperty("metrics") List<MetricsConfigurator> metricsConfigs,
    @JsonProperty("tracing") TracerConfigurator traceConfig
  ) {
    super();
    this.refreshConfig = refreshConfig;
    this.metricsConfigs = metricsConfigs;
    this.traceConfig = traceConfig;
  }

我们一般metrics 的配置如下

metrics:
  - name: prometheus_reporter
    comment: >
      Publish metrics on prometheus
    reporter:
    type: prometheus  // 多太类型处理基于type 区分
    port: 12543
  • 提供的核心能力

services/telemetry-api 此模块包装了标准的接口定义,包含了配置,metrics,trace,对于metrics 的开发官方开发了一个ReporterConfigurator 抽象定义
方便扩展,默认实现如下(prometheus 的也是扩展了此抽象类,可以参考github 源码),核心是需要实现configureAndStart

同时dremio 为了方便使用自己包装了不少metrics 定义,如下

 

 


trace 处理上主要还是对于http 的处理,支持了opentelemetry 以及jaeger
比如prometheus 的处理

 
 // prometheus 就是上边配置type 的名称
@JsonTypeName("prometheus")
public class PrometheusConfigurator extends ReporterConfigurator {
 
    private final int port;
    private volatile HTTPServer server;
 
    @JsonCreator
    public PrometheusConfigurator(@JsonProperty("port") int port) {
        super();
        this.port = (port > 0) ? port : Integer.parseInt(System.getProperty("dremio.prometheus.port", "12543"));
 
    }
   // 此处进行prometheus 的服务启动
    @Override
    public void configureAndStart(String name, MetricRegistry registry, MetricFilter filter) {
        CollectorRegistry.defaultRegistry.register(new DropwizardExports(registry));//todo filter and rates maybe switch to pushgateway
        try {
            server = new HTTPServer(port);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
 
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        PrometheusConfigurator that = (PrometheusConfigurator) o;
        return port == that.port;
    }
 
    @Override
    public int hashCode() {
        return Objects.hashCode(port);
    }
 
    @Override
    public void close() {
        if (server != null) {
            server.stop();
        }
    }
 
    /**
     * Module that may be added to a jackson object mapper
     * so it can parse jmx config.  jackson 模块,通过spi 加载
     */
    public static class Module extends ConfigModule {
        @Override
        public void setupModule(com.fasterxml.jackson.databind.Module.SetupContext context) {
            context.registerSubtypes(PrometheusConfigurator.class);
        }
    }
}

spi 配置定义(方便jackson 的模块注册处理)

 

 


spi 加载模块
Telemetry 类中,核心是ObjectMapper 的构建在startTelemetry 方法中

 
 final ObjectMapper mapper = new ObjectMapper(new YAMLFactory());
    ImmutableList.Builder<ConfigModule> modulesBuilder = ImmutableList.builder();
    // spi 加载多太类型配置
    ServiceLoader.load(ConfigModule.class).forEach(modulesBuilder::add);
    ImmutableList<ConfigModule> modules = modulesBuilder.build();
 
    if (modules.isEmpty()) {
      logger.warn("Unable to discover any modules for telemetry config. Will not refresh config.");
      return;
    }
    mapper.registerModules(modules);

metrics 任务的启动

dremio 在处理这部分的时候核心还是放在了Metrics 类中,使用了ReporterManager 通过onChange 方法对于metrics 的运行

public synchronized void onChange(Collection<MetricsConfigurator> newConfs) {
    try {
      final Set<MetricsConfigurator> current = new HashSet<>(configurators);
      final Set<MetricsConfigurator> toStart = new HashSet<>();
      final Set<MetricsConfigurator> totalSet = new HashSet<>();
      for(MetricsConfigurator c : newConfs) {
        if (current.remove(c)) {
          totalSet.add(c);
        } else {
          toStart.add(c);
          totalSet.add(c);
        }
      }
 
      for(MetricsConfigurator c : toStart) {
        try {
         // 此处启动metrics 服务模块
          c.start(registry);
        } catch (Exception ex) {
          logger.error("Failure while starting configuration {} metric reporter.", c.getName(), ex);
 
          // don't save this as something running. instead, try again on next refresh.
          totalSet.remove(c);
        }
      }

Telemetry 服务的启动

在BootstrapContext 中处理的
如下

 
  // 构造函数中就进行Telemetry服务的启动
  public BootStrapContext(DremioConfig config, ScanResult classpathScan, SingletonRegistry registry) {
    registry.bind(Tracer.class, TracerFacade.INSTANCE);
    registry.bind(GrpcTracerFacade.class, new GrpcTracerFacade(TracerFacade.INSTANCE));
    Telemetry.startTelemetry();
 
    this.config = config.getSabotConfig();
    this.classpathScan = classpathScan;
    this.allocator = RootAllocatorFactory.newRoot(config);
    this.executor = new ContextMigratingCloseableExecutorService<>(new CloseableThreadPool("dremio-general-"), registry.lookup(Tracer.class));
    this.lpPersistance = new LogicalPlanPersistence(config.getSabotConfig(), classpathScan);
    this.dremioConfig = config;
 
    registerMetrics();
 
    this.nodeDebugContextProvider = (allocator instanceof DremioRootAllocator)
      ? new NodeDebugContextProviderImpl((DremioRootAllocator) allocator)
      : NodeDebugContextProvider.NOOP;
  }

实际BootstrapContext 的处理
DACDaemonModule 类的bootstrap 方法中

 
final BootStrapContext bootStrapContext = new BootStrapContext(config, scanResult, bootstrapRegistry);

说明

dremio metrics 部分还是比较完整的,而且官方提供了文档说明,但是关于trace 就比较少了,但是通过源码阅读我们还是可以了解集成的模式以及如果进行配置
实际上与metrics 是类似的,具体可以参考实现的配置类,dremio 基于jackson 多太处理模式使用是比较多的,是一个很不错的实践,很值得学习参考

参考资料

services/telemetry-impl
services/telemetry-api/src/main/java/com/dremio/telemetry/api/Telemetry.java
services/telemetry-api/src/main/java/com/dremio/telemetry/api/Telemetry.java
sabot/kernel/src/main/java/com/dremio/exec/server/BootStrapContext.java
dac/backend/src/main/java/com/dremio/dac/daemon/DACDaemonModule.java
dac/backend/src/main/java/com/dremio/dac/util/JSONUtil.java
services/telemetry-api/src/main/java/com/dremio/telemetry/api/metrics/Metrics.java
services/telemetry-impl/src/main/java/com/dremio/telemetry/impl/config/tracing/OpenTelemetryConfigurator.java
services/telemetry-impl/src/main/java/com/dremio/telemetry/impl/config/tracing/JaegerConfigurator.java
https://docs.dremio.com/software/advanced-administration/telemetry/
https://github.com/dremio-hub/dremio-prometheus-exporter

posted on 2023-01-21 22:57  荣锋亮  阅读(59)  评论(0编辑  收藏  举报

导航