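Sleuth
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>
Once Sleuth is on the classpath it works without further configuration: the TraceId and SpanId are injected into the headers of requests between services.
Common next steps are:
- Integrate with ELK: add a dependency so that logs are written as JSON for Logstash, store them in ElasticSearch, and display them in Kibana
- Integrate with Zipkin and report the call-chain data to it
A few points deserve attention: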
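Sampling
Whether call-chain data is reported to Zipkin can be controlled by a ratio, the sampling probability. Under high concurrency, collecting every request produces too much data. Sampling cuts the volume, which matters most when spans are sent over HTTP, because reporting then has a noticeable performance cost.
# Zipkin sampling probability (1.0 reports every request)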
spring.sleuth.sampler.probability=1.0
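To make the ratio concrete, here is a minimal sketch of probability-based sampling in plain Java. It is not Sleuth's or Brave's sampler: the class name ProbabilitySamplerSketch and the 10,000-bucket scheme are assumptions chosen for illustration. The decision is derived from the trace ID, so the same trace always gets the same answer.

```java
// Illustrative only: a hypothetical stand-in, not Sleuth's or Brave's sampler.
class ProbabilitySamplerSketch {
    private static final long BUCKETS = 10_000L;
    private final long threshold;

    ProbabilitySamplerSketch(double probability) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be in [0, 1]");
        }
        // 0.1 -> 1000 of 10000 buckets are sampled
        this.threshold = Math.round(probability * BUCKETS);
    }

    /** Maps the trace ID to a bucket in [0, 9999] and samples the lowest buckets. */
    boolean isSampled(long traceId) {
        return Math.floorMod(traceId, BUCKETS) < threshold;
    }
}
```

With a probability of 1.0 every bucket is below the threshold, which corresponds to the setting shown above; a production system would typically use a much smaller value.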
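Defining the thread pool for async tasks
Sleuth also supports asynchronous tasks: when a task is started with @Async, Sleuth creates a new Span for the call.
If you define your own thread pool for async tasks, that new Span is no longer created. The pool then has to be wrapped in the LazyTraceExecutor that Sleuth provides.
// Async thread pool configuration for Sleuth
import java.util.concurrent.Executor;
import org.springframework.beans.factory.BeanFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.autoconfigure.EnableAutoConfiguration;
import org.springframework.cloud.sleuth.instrument.async.LazyTraceExecutor;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.AsyncConfigurerSupport;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
@EnableAutoConfiguration
public class CustomExecutorConfig extends AsyncConfigurerSupport {
    @Autowired BeanFactory beanFactory;

    @Override
    public Executor getAsyncExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(7);
        executor.setMaxPoolSize(42);
        executor.setQueueCapacity(11);
        executor.setThreadNamePrefix("traceable-thread-");
        executor.initialize();
        // Wrap the custom pool so that tasks run inside a trace context
        return new LazyTraceExecutor(this.beanFactory, executor);
    }
}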
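Why the wrapper is needed can be shown with a plain-Java sketch. Trace state usually lives in a thread-local, so a task handed to a pool thread does not see the caller's trace unless something copies it over. The classes below (TraceContextSketch, ContextPropagatingExecutor, ContextPropagationDemo) are hypothetical and only illustrate the general idea; they are not how LazyTraceExecutor is implemented.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical per-thread holder for the current trace ID.
class TraceContextSketch {
    static final ThreadLocal<String> CURRENT_TRACE_ID = new ThreadLocal<>();
}

// Hypothetical wrapper: captures the caller's trace ID and restores it on the worker thread.
class ContextPropagatingExecutor implements Executor {
    private final Executor delegate;

    ContextPropagatingExecutor(Executor delegate) {
        this.delegate = delegate;
    }

    @Override
    public void execute(Runnable task) {
        final String callerTraceId = TraceContextSketch.CURRENT_TRACE_ID.get(); // read on the submitting thread
        delegate.execute(() -> {
            String previous = TraceContextSketch.CURRENT_TRACE_ID.get();
            TraceContextSketch.CURRENT_TRACE_ID.set(callerTraceId);
            try {
                task.run();
            } finally {
                // Put the worker thread back the way it was before the task ran
                if (previous == null) {
                    TraceContextSketch.CURRENT_TRACE_ID.remove();
                } else {
                    TraceContextSketch.CURRENT_TRACE_ID.set(previous);
                }
            }
        });
    }
}

class ContextPropagationDemo {
    /** Returns the trace ID that a task running on a pool thread observes. */
    static String traceIdSeenByWorker(String traceId, boolean wrap) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Executor executor = wrap ? new ContextPropagatingExecutor(pool) : pool;
        TraceContextSketch.CURRENT_TRACE_ID.set(traceId);
        try {
            CompletableFuture<String> seen = new CompletableFuture<>();
            executor.execute(() -> seen.complete(TraceContextSketch.CURRENT_TRACE_ID.get()));
            return seen.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            TraceContextSketch.CURRENT_TRACE_ID.remove();
            pool.shutdown();
        }
    }
}
```

Calling ContextPropagationDemo.traceIdSeenByWorker("abc123", true) hands the trace ID to the worker, while passing false leaves the worker without one, which mirrors what happens with an unwrapped custom pool.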
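TracingFilter
TracingFilter is the component that handles requests and responses. By registering our own filter that runs right after it, we can cover additional requirements.
The example below adds a custom tag to the request and returns the trace ID to the client in a response header.
import java.io.IOException;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;
import brave.Span;
import brave.Tracer;
import org.springframework.cloud.sleuth.instrument.web.TraceWebServletAutoConfiguration;
import org.springframework.core.annotation.Order;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.GenericFilterBean;

@Component
@Order(TraceWebServletAutoConfiguration.TRACING_FILTER_ORDER + 1)
class MyFilter extends GenericFilterBean {
    private final Tracer tracer;

    MyFilter(Tracer tracer) {
        this.tracer = tracer;
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response,
            FilterChain chain) throws IOException, ServletException {
        Span currentSpan = this.tracer.currentSpan();
        // No Span for this request (for example a skipped URL): pass it through untouched
        if (currentSpan == null) {
            chain.doFilter(request, response);
            return;
        }
        // Return the trace ID to the client
        ((HttpServletResponse) response).addHeader("ZIPKIN-TRACE-ID",
                currentSpan.context().traceIdString());
        // Attach a custom tag to the Span
        currentSpan.tag("custom", "tag");
        chain.doFilter(request, response);
    }
}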
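The trace ID is now set in the response header; inspect the response of any request to verify it.
Tags created manually can be viewed in Zipkin.
Custom tags are very practical: tagging a request with the corresponding user information, for example, helps a great deal when troubleshooting.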
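Instrumenting local methods
Asynchronous execution and remote calls each start a new Span. To monitor how long a local method takes, instrument it manually, that is, start a new Span around it.
@Autowired
Tracer tracer;

@Override
public void saveLog2(String log) {
    // Start a Span that stays current on this thread until finish() is called
    ScopedSpan span = tracer.startScopedSpan("saveLog2");
    try {
        Thread.sleep(2000); // simulates two seconds of work
    } catch (Exception | Error e) {
        span.error(e); // record the failure on the Span
    } finally {
        span.finish(); // always finish the Span so that it is reported
    }
}
A Span created this way shows up in the Zipkin UI together with the time the local method took; here saveLog2 takes 2 seconds.
Besides creating the Span in code, there is a simpler way: put the following annotation on the method: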
@NewSpan(name = "saveLog2")
Filtering out requests that should not be traced
Tracing can be switched off for certain requests, such as the Swagger endpoints, by configuring an HttpSampler.
@Bean(name = ServerSampler.NAME)
HttpSampler myHttpSampler(SkipPatternProvider provider) {
    Pattern pattern = provider.skipPattern();
    return new HttpSampler() {
        @Override
        public <Req> Boolean trySample(HttpAdapter<Req, ?> adapter, Req request) {
            String url = adapter.path(request);
            boolean shouldSkip = pattern.matcher(url).matches();
            if (shouldSkip) {
                return false; // never trace this request
            }
            return null; // no opinion: defer to the default sampler
        }
    };
}
The core is the trySample method: return false for any URL that should not be traced. The rule itself is up to you. I used SkipPatternProvider, whose skipPattern already contains many exclusion rules:
/api-docs.*|/autoconfig|/configprops|/dump|/health|/info|/metrics.*|/mappings|/trace|/swagger.*|.*\.png|.*\.css|.*\.js|.*\.html|/favicon.ico|/hystrix.stream|/application/.*|/actuator.*|/cloudfoundryapplication
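The pattern is applied with a full match (matches()), which is easy to misread. The sketch below copies the pattern quoted above into a plain Java class so that its behaviour can be checked in isolation; the class name SkipPatternSketch is hypothetical.

```java
import java.util.regex.Pattern;

// Illustrative only: the pattern is copied from the text above,
// with backslashes doubled for a Java string literal.
class SkipPatternSketch {
    static final Pattern SKIP = Pattern.compile(
            "/api-docs.*|/autoconfig|/configprops|/dump|/health|/info|/metrics.*|/mappings|/trace"
            + "|/swagger.*|.*\\.png|.*\\.css|.*\\.js|.*\\.html|/favicon.ico|/hystrix.stream"
            + "|/application/.*|/actuator.*|/cloudfoundryapplication");

    /** True when the whole path matches one of the alternatives. */
    static boolean shouldSkip(String path) {
        return SKIP.matcher(path).matches();
    }
}
```

Because the whole path must match, /health is skipped but /healthcheck is not, while prefix alternatives such as /swagger.* cover everything below them.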
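Sending call-chain data through RabbitMQ instead of HTTP
Even with sampling, sending spans over HTTP still costs performance, and if the Zipkin server restarts or goes down, the spans collected in the meantime are lost. To address both problems we send the data through RabbitMQ instead: the message queue improves sending performance and buffers spans while Zipkin is unavailable, so they are not lost.
The integration is fairly simple.
Add the RabbitMQ dependency:
<dependency>
    <groupId>org.springframework.amqp</groupId>
    <artifactId>spring-rabbit</artifactId>
</dependency>
Then configure the properties:
# Send Zipkin data through RabbitMQ
spring.zipkin.sender.type=RABBIT
# RabbitMQ settings
spring.rabbitmq.addresses=amqp://127.0.0.1:5672
spring.rabbitmq.username=admin
spring.rabbitmq.password=123456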
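How Sleuth works
Since Sleuth is built on Spring Boot, the integration naturally relies on Spring Boot auto-configuration. Note that the source code quoted below comes from Sleuth 1.x (DefaultTracer, TraceFilter, spring.sleuth.sampler.percentage), while the usage examples above are written against the Brave-based 2.x API; the overall flow is what matters here.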
TraceAutoConfiguration
@Configuration
// Reads the configuration; the default is true, so this class is loaded at startup
@ConditionalOnProperty(value="spring.sleuth.enabled", matchIfMissing=true)
// Binds the configuration classes and registers them as Spring beans
@EnableConfigurationProperties({TraceKeys.class, SleuthProperties.class})
public class TraceAutoConfiguration {
    @Autowired
    SleuthProperties properties;

    @Bean
    @ConditionalOnMissingBean
    // Span ID generation strategy: simply random
    public Random randomForSpanIds() {
        return new Random();
    }

    // Decides whether spans are sent to the Zipkin server
    @Bean
    @ConditionalOnMissingBean
    public Sampler defaultTraceSampler() {
        return NeverSampler.INSTANCE;
    }

    @Bean
    @ConditionalOnMissingBean(Tracer.class)
    /**
     * sampler: NeverSampler unless spring-cloud-sleuth-zipkin is on the classpath;
     * otherwise a percentage-based sampler built from spring.sleuth.sampler.percentage.
     * spanReporter: resolved the same way.
     * Tracer: a key class, covered separately below. Spans are created, stored,
     * and updated through it.
     */
    public DefaultTracer sleuthTracer(Sampler sampler, Random random,
            SpanNamer spanNamer, SpanLogger spanLogger,
            SpanReporter spanReporter, TraceKeys traceKeys) {
        return new DefaultTracer(sampler, random, spanNamer, spanLogger,
                spanReporter, this.properties.isTraceId128(), traceKeys);
    }

    @Bean
    @ConditionalOnMissingBean
    public SpanNamer spanNamer() {
        return new DefaultSpanNamer();
    }

    @Bean
    @ConditionalOnMissingBean
    public SpanReporter defaultSpanReporter() {
        return new NoOpSpanReporter();
    }
}
TraceWebAutoConfiguration
This is another key class. Calls usually go over the web, so only the web integration is covered here; the other extensions follow much the same idea and are left to interested readers.
@Configuration
// Loaded automatically by default.
// The important part is that it registers the Filter and the skip-pattern classes.
@ConditionalOnProperty(value = "spring.sleuth.web.enabled", matchIfMissing = true)
@ConditionalOnWebApplication
@ConditionalOnBean(Tracer.class)
@AutoConfigureAfter(TraceHttpAutoConfiguration.class)
public class TraceWebAutoConfiguration {

    /**
     * Nested config that configures Web MVC if it's present (without adding a runtime
     * dependency to it)
     */
    @Configuration
    @ConditionalOnClass(WebMvcConfigurerAdapter.class)
    @Import(TraceWebMvcConfigurer.class)
    protected static class TraceWebMvcAutoConfiguration {
    }

    @Bean
    public TraceWebAspect traceWebAspect(Tracer tracer, TraceKeys traceKeys,
            SpanNamer spanNamer) {
        return new TraceWebAspect(tracer, spanNamer, traceKeys);
    }

    @Bean
    @ConditionalOnClass(name = "org.springframework.data.rest.webmvc.support.DelegatingHandlerMapping")
    public TraceSpringDataBeanPostProcessor traceSpringDataBeanPostProcessor(
            BeanFactory beanFactory) {
        return new TraceSpringDataBeanPostProcessor(beanFactory);
    }

    /**
     * Creates and registers a Filter; in a web environment every request passes through it
     */
    @Bean
    public FilterRegistrationBean traceWebFilter(TraceFilter traceFilter) {
        FilterRegistrationBean filterRegistrationBean = new FilterRegistrationBean(
                traceFilter);
        filterRegistrationBean.setDispatcherTypes(ASYNC, ERROR, FORWARD, INCLUDE,
                REQUEST);
        filterRegistrationBean.setOrder(TraceFilter.ORDER);
        return filterRegistrationBean;
    }

    @Bean
    public TraceFilter traceFilter(Tracer tracer, TraceKeys traceKeys,
            SkipPatternProvider skipPatternProvider, SpanReporter spanReporter,
            HttpSpanExtractor spanExtractor,
            HttpTraceKeysInjector httpTraceKeysInjector) {
        return new TraceFilter(tracer, traceKeys, skipPatternProvider.skipPattern(),
                spanReporter, spanExtractor, httpTraceKeysInjector);
    }

    @Configuration
    @ConditionalOnClass(ManagementServerProperties.class)
    @ConditionalOnMissingBean(SkipPatternProvider.class)
    @EnableConfigurationProperties(SleuthWebProperties.class)
    // skip: a regular expression that decides whether the span of an incoming request is sent to the server.
    // (The implementation below can be skipped on a first read.)
    protected static class SkipPatternProviderConfig {

        @Bean
        @ConditionalOnBean(ManagementServerProperties.class)
        public SkipPatternProvider skipPatternForManagementServerProperties(
                final ManagementServerProperties managementServerProperties,
                final SleuthWebProperties sleuthWebProperties) {
            return new SkipPatternProvider() {
                @Override
                public Pattern skipPattern() {
                    return getPatternForManagementServerProperties(
                            managementServerProperties,
                            sleuthWebProperties);
                }
            };
        }

        /**
         * Sets or appends {@link ManagementServerProperties#getContextPath()} to the skip
         * pattern. If neither is available then sets the default one
         */
        static Pattern getPatternForManagementServerProperties(
                ManagementServerProperties managementServerProperties,
                SleuthWebProperties sleuthWebProperties) {
            String skipPattern = sleuthWebProperties.getSkipPattern();
            if (StringUtils.hasText(skipPattern)
                    && StringUtils.hasText(managementServerProperties.getContextPath())) {
                return Pattern.compile(skipPattern + "|"
                        + managementServerProperties.getContextPath() + ".*");
            }
            else if (StringUtils.hasText(managementServerProperties.getContextPath())) {
                return Pattern
                        .compile(managementServerProperties.getContextPath() + ".*");
            }
            return defaultSkipPattern(skipPattern);
        }

        @Bean
        @ConditionalOnMissingBean(ManagementServerProperties.class)
        public SkipPatternProvider defaultSkipPatternBeanIfManagementServerPropsArePresent(SleuthWebProperties sleuthWebProperties) {
            return defaultSkipPatternProvider(sleuthWebProperties.getSkipPattern());
        }
    }

    @Bean
    @ConditionalOnMissingClass("org.springframework.boot.actuate.autoconfigure.ManagementServerProperties")
    @ConditionalOnMissingBean(SkipPatternProvider.class)
    public SkipPatternProvider defaultSkipPatternBean(SleuthWebProperties sleuthWebProperties) {
        return defaultSkipPatternProvider(sleuthWebProperties.getSkipPattern());
    }

    private static SkipPatternProvider defaultSkipPatternProvider(
            final String skipPattern) {
        return new SkipPatternProvider() {
            @Override
            public Pattern skipPattern() {
                return defaultSkipPattern(skipPattern);
            }
        };
    }

    private static Pattern defaultSkipPattern(String skipPattern) {
        return StringUtils.hasText(skipPattern) ? Pattern.compile(skipPattern)
                : Pattern.compile(SleuthWebProperties.DEFAULT_SKIP_PATTERN);
    }

    interface SkipPatternProvider {
        Pattern skipPattern();
    }
}
TraceHttpAutoConfiguration
This class provides the beans that inject span information into the carrier (here, the HTTP request) so that it is propagated to the next service.
@Configuration
@ConditionalOnBean(Tracer.class)
@AutoConfigureAfter(TraceAutoConfiguration.class)
@EnableConfigurationProperties({ TraceKeys.class, SleuthWebProperties.class })
public class TraceHttpAutoConfiguration {

    @Bean
    @ConditionalOnMissingBean
    public HttpTraceKeysInjector httpTraceKeysInjector(Tracer tracer, TraceKeys traceKeys) {
        return new HttpTraceKeysInjector(tracer, traceKeys);
    }

    @Bean
    @ConditionalOnMissingBean
    public HttpSpanExtractor httpSpanExtractor(SleuthWebProperties sleuthWebProperties) {
        return new ZipkinHttpSpanExtractor(Pattern.compile(sleuthWebProperties.getSkipPattern()));
    }

    @Bean
    @ConditionalOnMissingBean
    public HttpSpanInjector httpSpanInjector() {
        return new ZipkinHttpSpanInjector();
    }
}
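What "injecting into the carrier" amounts to can be sketched with a plain map standing in for the HTTP headers. The header names are the standard B3 ones; the class B3HeaderSketch and its method signatures are hypothetical and much simpler than Sleuth's injector and extractor (for instance, real HTTP header lookup is case-insensitive, which a plain map is not).

```java
import java.util.Map;

// Illustrative only: a hypothetical helper, not Sleuth's HttpSpanInjector or HttpSpanExtractor.
class B3HeaderSketch {
    static final String TRACE_ID = "X-B3-TraceId";
    static final String SPAN_ID = "X-B3-SpanId";
    static final String PARENT_SPAN_ID = "X-B3-ParentSpanId";
    static final String SAMPLED = "X-B3-Sampled";

    /** Client side: copy the span identity into the outgoing headers. */
    static void inject(String traceId, String spanId, String parentId, boolean sampled,
            Map<String, String> headers) {
        headers.put(TRACE_ID, traceId);
        headers.put(SPAN_ID, spanId);
        if (parentId != null) {
            headers.put(PARENT_SPAN_ID, parentId);
        }
        headers.put(SAMPLED, sampled ? "1" : "0");
    }

    /** Server side: the trace ID to join, or null when the caller sent none. */
    static String extractTraceId(Map<String, String> headers) {
        return headers.get(TRACE_ID);
    }

    /** Server side: whether the caller asked for this trace to be sampled. */
    static boolean extractSampled(Map<String, String> headers) {
        return "1".equals(headers.get(SAMPLED));
    }
}
```

The client calls inject before sending the request and the server calls the extract methods on arrival, which is the division of labour between the injector and extractor beans declared above.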
How tracing is implemented
Key methods of TraceFilter
- createSpan()
/**
 * Creates a span and appends it as the current request's attribute
 */
private Span createSpan(HttpServletRequest request,
        boolean skip, Span spanFromRequest, String name) {
    if (spanFromRequest != null) {
        if (log.isDebugEnabled()) {
            log.debug("Span has already been created - continuing with the previous one");
        }
        return spanFromRequest;
    }
    // Check whether the upstream caller passed trace information in the request; if so, parse it
    Span parent = this.spanExtractor.joinTrace(new HttpServletRequestTextMap(request));
    if (parent != null) {
        if (log.isDebugEnabled()) {
            log.debug("Found a parent span " + parent + " in the request");
        }
        addRequestTagsForParentSpan(request, parent);
        spanFromRequest = parent;
        // Make it the current thread's Span
        this.tracer.continueSpan(spanFromRequest);
        if (parent.isRemote()) {
            // Record the SR (Server Received) event
            parent.logEvent(Span.SERVER_RECV);
        }
        request.setAttribute(TRACE_REQUEST_ATTR, spanFromRequest);
        if (log.isDebugEnabled()) {
            log.debug("Parent span is " + parent + "");
        }
    } else {
        // No span information in the carrier
        if (skip) {
            // This span is not to be reported
            spanFromRequest = this.tracer.createSpan(name, NeverSampler.INSTANCE);
        }
        else {
            String header = request.getHeader(Span.SPAN_FLAGS);
            if (Span.SPAN_SAMPLED.equals(header)) {
                spanFromRequest = this.tracer.createSpan(name, new AlwaysSampler());
            } else {
                // Create a new Span (code omitted): generate a random ID and bind it to the current thread
                spanFromRequest = this.tracer.createSpan(name);
            }
        }
        spanFromRequest.logEvent(Span.SERVER_RECV);
        request.setAttribute(TRACE_REQUEST_ATTR, spanFromRequest);
        if (log.isDebugEnabled()) {
            log.debug("No parent span present - creating a new span");
        }
    }
    return spanFromRequest;
}
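The branching in createSpan() is easier to see once it is reduced to a pure decision function. The sketch below is a hypothetical condensation, not Sleuth code: the enum and method names are invented, and it assumes the debug flag arrives as the value "1" in the X-B3-Flags header, as in the B3 convention.

```java
// Illustrative only: a hypothetical condensation of the decision made in createSpan().
class SpanDecisionSketch {
    enum Decision {
        CONTINUE_PARENT,          // trace information arrived with the request
        NEW_SPAN_NEVER_EXPORTED,  // URL matches the skip pattern
        NEW_SPAN_ALWAYS_EXPORTED, // caller set the debug flag
        NEW_SPAN_DEFAULT_SAMPLER  // let the configured sampler decide
    }

    static Decision decide(boolean parentInHeaders, boolean skip, String flagsHeader) {
        if (parentInHeaders) {
            return Decision.CONTINUE_PARENT;
        }
        if (skip) {
            return Decision.NEW_SPAN_NEVER_EXPORTED;
        }
        if ("1".equals(flagsHeader)) {
            return Decision.NEW_SPAN_ALWAYS_EXPORTED;
        }
        return Decision.NEW_SPAN_DEFAULT_SAMPLER;
    }
}
```

The order of the checks matters: an incoming parent always wins, and the skip pattern takes precedence over the debug flag when no parent is present.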
/**
 * Builds the parent Span from the carrier (ZipkinHttpSpanExtractor)
 */
@Override
public Span joinTrace(SpanTextMap textMap) {
    Map<String, String> carrier = TextMapUtil.asMap(textMap);
    boolean debug = Span.SPAN_SAMPLED.equals(carrier.get(Span.SPAN_FLAGS));
    if (debug) {
        // we're only generating Trace ID since if there's no Span ID will assume
        // that it's equal to Trace ID
        generateIdIfMissing(carrier, Span.TRACE_ID_NAME);
    } else if (carrier.get(Span.TRACE_ID_NAME) == null) {
        // can't build a Span without trace id
        return null;
    }
    try {
        String uri = carrier.get(URI_HEADER);
        boolean skip = this.skipPattern.matcher(uri).matches()
                || Span.SPAN_NOT_SAMPLED.equals(carrier.get(Span.SAMPLED_NAME));
        long spanId = spanId(carrier);
        return buildParentSpan(carrier, uri, skip, spanId);
    } catch (Exception e) {
        log.error("Exception occurred while trying to extract span from carrier", e);
        return null;
    }
}

private Span buildParentSpan(Map<String, String> carrier, String uri, boolean skip, long spanId) {
    String traceId = carrier.get(Span.TRACE_ID_NAME);
    Span.SpanBuilder span = Span.builder()
            .traceIdHigh(traceId.length() == 32 ? Span.hexToId(traceId, 0) : 0)
            .traceId(Span.hexToId(traceId))
            .spanId(spanId);
    String processId = carrier.get(Span.PROCESS_ID_NAME);
    String parentName = carrier.get(Span.SPAN_NAME_NAME);
    if (StringUtils.hasText(parentName)) {
        span.name(parentName);
    } else {
        span.name(HTTP_COMPONENT + ":/parent" + uri);
    }
    if (StringUtils.hasText(processId)) {
        span.processId(processId);
    }
    if (carrier.containsKey(Span.PARENT_ID_NAME)) {
        span.parent(Span.hexToId(carrier.get(Span.PARENT_ID_NAME)));
    }
    span.remote(true);
    boolean debug = Span.SPAN_SAMPLED.equals(carrier.get(Span.SPAN_FLAGS));
    // Whether the span should be exported
    if (debug) {
        span.exportable(true);
    } else if (skip) {
        span.exportable(false);
    }
    for (Map.Entry<String, String> entry : carrier.entrySet()) {
        if (entry.getKey().startsWith(Span.SPAN_BAGGAGE_HEADER_PREFIX + HEADER_DELIMITER)) {
            span.baggage(unprefixedKey(entry.getKey()), entry.getValue());
        }
    }
    return span.build();
}
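The traceIdHigh / traceId pair in buildParentSpan handles 128-bit trace IDs, which arrive as 32 hex characters and are split into two 64-bit halves. The following sketch shows that split with the JDK alone; TraceIdHexSketch is a hypothetical helper and does not reproduce Span.hexToId.

```java
// Illustrative only: a hypothetical helper, not Sleuth's Span.hexToId.
class TraceIdHexSketch {
    /** Upper 64 bits of a 32-character trace ID; 0 for a 64-bit (up to 16-character) ID. */
    static long high(String hex) {
        checkLength(hex);
        return hex.length() == 32 ? Long.parseUnsignedLong(hex.substring(0, 16), 16) : 0L;
    }

    /** Lower 64 bits: the last 16 hex characters, or the whole string when it is shorter. */
    static long low(String hex) {
        checkLength(hex);
        String lower = hex.length() == 32 ? hex.substring(16) : hex;
        return Long.parseUnsignedLong(lower, 16);
    }

    private static void checkLength(String hex) {
        int n = hex.length();
        if (n == 0 || (n > 16 && n != 32)) {
            throw new IllegalArgumentException("expected 1-16 or exactly 32 hex characters");
        }
    }
}
```

Values with the top bit set, such as ffffffffffffffff, come back as negative longs because Java has no unsigned 64-bit type; the bit pattern is still the one that was sent.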
- addErrorTag()
// If an exception is thrown while the request is handled, record its message on the Span
catch (Throwable e) {
    exception = e;
    this.tracer.addTag(Span.SPAN_ERROR_TAG_NAME, ExceptionUtils.getExceptionMessage(e));
    throw e;
}
- closeSpan(): close the Span and report it
finally {
    if (isAsyncStarted(request) || request.isAsyncStarted()) {
        if (log.isDebugEnabled()) {
            log.debug("The span " + spanFromRequest + " will get detached by a HandleInterceptor");
        }
        // TODO: how to deal with response annotations and async?
        return;
    }
    spanFromRequest = createSpanIfRequestNotHandled(request, spanFromRequest, name, skip);
    detachOrCloseSpans(request, response, spanFromRequest, exception);
}

private void recordParentSpan(Span parent) {
    if (parent == null) {
        return;
    }
    if (parent.isRemote()) {
        if (log.isDebugEnabled()) {
            log.debug("Trying to send the parent span " + parent + " to Zipkin");
        }
        parent.stop();
        // should be already done by HttpServletResponse wrappers
        SsLogSetter.annotateWithServerSendIfLogIsNotAlreadyPresent(parent);
        this.spanReporter.report(parent);
    } else {
        // should be already done by HttpServletResponse wrappers
        SsLogSetter.annotateWithServerSendIfLogIsNotAlreadyPresent(parent);
    }
}
// DefaultTracer.close()
@Override
public Span close(Span span) {
    if (span == null) {
        return null;
    }
    Span cur = SpanContextHolder.getCurrentSpan();
    final Span savedSpan = span.getSavedSpan();
    if (!span.equals(cur)) {
        ExceptionUtils.warn(
                "Tried to close span but it is not the current span: " + span
                + ". You may have forgotten to close or detach " + cur);
    }
    else {
        // Stop the Span to record how long it lived, that is, the duration of the call
        span.stop();
        if (savedSpan != null && span.getParents().contains(savedSpan.getSpanId())) {
            this.spanReporter.report(span);
            this.spanLogger.logStoppedSpan(savedSpan, span);
        }
        else {
            if (!span.isRemote()) {
                // Report the span; this is the job of spring-cloud-sleuth-zipkin
                this.spanReporter.report(span);
                this.spanLogger.logStoppedSpan(null, span);
            }
        }
        // Remove the Span from the current thread
        SpanContextHolder.close(new SpanContextHolder.SpanFunction() {
            @Override public void apply(Span span) {
                DefaultTracer.this.spanLogger.logStoppedSpan(savedSpan, span);
            }
        });
    }
    return savedSpan;
}
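The essential bookkeeping in close() is: stop the span, report it, and make the previously saved span current again. The sketch below shows only that bookkeeping. It is a single-threaded simplification with hypothetical names (SpanStackSketch, SimpleSpan); the real tracer keeps the current span per thread and applies extra rules for remote spans.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a single-threaded simplification, not Sleuth's DefaultTracer.
class SpanStackSketch {
    static final class SimpleSpan {
        final String name;
        final SimpleSpan saved;     // the span that was current when this one was created
        final long startNanos = System.nanoTime();
        long durationNanos = -1L;   // set when the span is closed

        SimpleSpan(String name, SimpleSpan saved) {
            this.name = name;
            this.saved = saved;
        }
    }

    private SimpleSpan current;
    private final List<String> reported = new ArrayList<>();

    /** Creates a span whose saved span is the one that is current right now. */
    SimpleSpan createSpan(String name) {
        current = new SimpleSpan(name, current);
        return current;
    }

    /** Stops and reports the span, then restores its saved span as the current one. */
    SimpleSpan close(SimpleSpan span) {
        if (span == null) {
            return null;
        }
        if (span != current) {
            return span.saved; // out-of-order close: leave the state untouched
        }
        span.durationNanos = System.nanoTime() - span.startNanos;
        reported.add(span.name);
        current = span.saved;
        return span.saved;
    }

    SimpleSpan currentSpan() {
        return current;
    }

    List<String> reportedNames() {
        return reported;
    }
}
```

Closing a child therefore hands control back to its parent, and closing the root leaves no current span, which is why spans must be closed in the reverse order of their creation.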
TraceFeignClient: the code here is easy to follow. It creates a Span and writes its information into the carrier.
The other clients are implemented in much the same way, so only Feign is shown.
@Override
public Response execute(Request request, Request.Options options) throws IOException {
    String spanName = getSpanName(request);
    Span span = getTracer().createSpan(spanName);
    if (log.isDebugEnabled()) {
        log.debug("Created new Feign span " + span);
    }
    try {
        AtomicReference<Request> feignRequest = new AtomicReference<>(request);
        spanInjector().inject(span, new FeignRequestTextMap(feignRequest));
        span.logEvent(Span.CLIENT_SEND);
        addRequestTags(request);
        Request modifiedRequest = feignRequest.get();
        if (log.isDebugEnabled()) {
            log.debug("The modified request equals " + modifiedRequest);
        }
        Response response = this.delegate.execute(modifiedRequest, options);
        logCr();
        return response;
    } catch (RuntimeException | IOException e) {
        logCr();
        logError(e);
        throw e;
    } finally {
        // Close the Span and report it
        closeSpan(span);
    }
}