The Dubbo exception stack trace problem

1. Symptom:

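    When a Dubbo provider throws an exception, the stack trace that the Dubbo consumer prints shows only the provider-side thread stack. This is because an exception's stackTrace is generated when new Throwable() is called: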

    /**
     * Constructs a new throwable with {@code null} as its detail message.
     * The cause is not initialized, and may subsequently be initialized by a
     * call to {@link #initCause}.
     *
     * <p>The {@link #fillInStackTrace()} method is called to initialize
     * the stack trace data in the newly created throwable.
     */
    public Throwable() {
        fillInStackTrace();
    }

    /**
     * Fills in the execution stack trace. This method records within this
     * {@code Throwable} object information about the current state of
     * the stack frames for the current thread.
     *
     * <p>If the stack trace of this {@code Throwable} {@linkplain
     * Throwable#Throwable(String, Throwable, boolean, boolean) is not
     * writable}, calling this method has no effect.
     *
     * @return  a reference to this {@code Throwable} instance.
     * @see     java.lang.Throwable#printStackTrace()
     */
    public synchronized Throwable fillInStackTrace() {
        if (stackTrace != null ||
            backtrace != null /* Out of protocol state */ ) {
            fillInStackTrace(0);
            stackTrace = UNASSIGNED_STACK;
        }
        return this;
    }
    // Delegates to the native method
    private native Throwable fillInStackTrace(int dummy);



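    The key is the native fillInStackTrace method, which records the current thread's stack in the throwable. In a distributed call chain, however, each service runs in a different process, let alone a different thread, so the stackTrace contains only the thread stack of the provider where the exception occurred. When the consumer receives the exception and logs it, only provider-side information appears and the consumer-side thread stack is lost. In a complex Dubbo call chain, this leaves developers without enough information to analyze the exception.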
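    The effect can be reproduced inside a single JVM. The following is a minimal sketch, assuming two threads stand in for provider and consumer; the class and method names (StackCaptureDemo, createInProvider, fetchFromOtherThread) are hypothetical and not part of Dubbo. The exception is constructed on a worker thread and handed back to the calling thread, yet its stack trace still describes only the thread that constructed it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class StackCaptureDemo {

    // Hypothetical stand-in for provider-side code: the exception is constructed here,
    // so fillInStackTrace() records the stack of the thread running this method.
    static RuntimeException createInProvider() {
        return new RuntimeException("provider failed");
    }

    // Hypothetical stand-in for the consumer: runs createInProvider() on another
    // thread and receives the already constructed exception as a plain value.
    static RuntimeException fetchFromOtherThread() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<RuntimeException> task = StackCaptureDemo::createInProvider;
            return pool.submit(task).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    // Returns true if any frame in the throwable's stack trace has the given method name.
    static boolean traceContains(Throwable t, String methodName) {
        for (StackTraceElement element : t.getStackTrace()) {
            if (element.getMethodName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        RuntimeException ex = fetchFromOtherThread();
        System.out.println("creator frame present: " + traceContains(ex, "createInProvider"));
        System.out.println("receiver frame present: " + traceContains(ex, "fetchFromOtherThread"));
    }
}
```

    The receiving method never appears in the trace, which is the same gap a consumer sees after an exception has been deserialized from a provider.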


2. Solution:

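    The first thing that comes to mind is ExceptionFilter.class. Dubbo's built-in ExceptionFilter takes effect only on the provider side, where it decides whether an exception needs to be wrapped in a RuntimeException. We can implement a corresponding consumer-side filter, ConsumerExceptionFilter: when the provider returns an exception, the filter appends the current consumer-side thread stack to the exception's stack trace. This gives us, in a roundabout way, a stack trace that spans both sides of the call.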
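    The core of the idea does not depend on Dubbo. Here is a minimal sketch, assuming hand-built frames in place of a real consumer stack; StackTraceAppender, findCallerOfProxy, appendFrame and the frame contents are hypothetical names used for illustration only. It locates the frame directly below a proxy frame and appends it to an existing exception through Throwable.setStackTrace.

```java
import java.util.Arrays;

class StackTraceAppender {

    // Returns the frame directly below the first frame whose class name starts with
    // proxyPrefix and whose method name equals methodName, or null if none matches.
    static StackTraceElement findCallerOfProxy(StackTraceElement[] local,
                                               String proxyPrefix, String methodName) {
        for (int i = 0; i < local.length - 1; i++) {
            StackTraceElement element = local[i];
            if (element.getClassName().startsWith(proxyPrefix)
                    && element.getMethodName().equals(methodName)) {
                return local[i + 1];
            }
        }
        return null;
    }

    // Appends one frame to the end of the throwable's existing stack trace.
    static void appendFrame(Throwable t, StackTraceElement frame) {
        StackTraceElement[] old = t.getStackTrace();
        StackTraceElement[] merged = Arrays.copyOf(old, old.length + 1);
        merged[old.length] = frame;
        t.setStackTrace(merged);
    }

    public static void main(String[] args) {
        // Hand-built frames standing in for a consumer thread stack (illustrative values).
        StackTraceElement[] local = {
            new StackTraceElement("com.sun.proxy.$Proxy12", "sayHello", null, -1),
            new StackTraceElement("com.example.OrderController", "create", "OrderController.java", 42)
        };
        RuntimeException remote = new RuntimeException("provider failed");
        StackTraceElement caller = findCallerOfProxy(local, "com.sun.proxy", "sayHello");
        if (caller != null) { // setStackTrace throws NullPointerException on null elements
            appendFrame(remote, caller);
        }
        remote.printStackTrace();
    }
}
```

    The null check matters: when no proxy frame matches, the trace should be left unchanged rather than extended with an empty slot.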




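    The implementation follows. To keep the stack trace from growing too large, it appends only the frame where the current service invokes the API. (For the filter to take effect, the custom filter must be registered in the SPI configuration file named org.apache.dubbo.rpc.Filter.)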

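// Imports assume Dubbo 2.7.x and Dubbo's own logger; adjust them if the project logs through SLF4J.
import java.util.Arrays;
import java.util.Objects;

import org.apache.dubbo.common.extension.Activate;
import org.apache.dubbo.common.logger.Logger;
import org.apache.dubbo.common.logger.LoggerFactory;
import org.apache.dubbo.rpc.Invocation;
import org.apache.dubbo.rpc.Invoker;
import org.apache.dubbo.rpc.ListenableFilter;
import org.apache.dubbo.rpc.Result;
import org.apache.dubbo.rpc.RpcContext;
import org.apache.dubbo.rpc.RpcException;
import org.apache.dubbo.rpc.service.GenericService;

@Activate(group = "consumer")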
public class ConsumerExceptionFilter extends ListenableFilter {

    public ConsumerExceptionFilter() {
        super.listener = new ConsumerExceptionFilter.ExceptionListener();
    }


    @Override
    public Result invoke(Invoker<?> invoker, Invocation invocation) throws RpcException {
        return invoker.invoke(invocation);
    }


    static class ExceptionListener implements Listener {

        private Logger logger = LoggerFactory.getLogger(ConsumerExceptionFilter.ExceptionListener.class);

        @Override
        public void onResponse(Result appResponse, Invoker<?> invoker, Invocation invocation) {
            if (appResponse.hasException() && GenericService.class != invoker.getInterface()) {
                try {
                    Throwable exception = appResponse.getException();

                    // checked exceptions are left untouched
                    if (!(exception instanceof RuntimeException) && (exception instanceof Exception)) {
                        return;
                    }

                    // Append part of the consumer-side stackTrace to the stackTrace of the exception
                    // thrown by the provider, so the failing call site is easier to trace in complex call chains.
                    StackTraceElement[] stackTrace = exception.getStackTrace();
                    StackTraceElement[] consumerStackTrace = new RuntimeException().getStackTrace();
                    boolean meetProxyElement = false;
                    for (StackTraceElement consumerStackTraceElement : consumerStackTrace) {

                        if (meetProxyElement) {
                            // To save resources, append only one frame: the line that invoked the API.
                            StackTraceElement[] newStackTrace = Arrays.copyOf(stackTrace, stackTrace.length + 1);
                            newStackTrace[newStackTrace.length - 1] = consumerStackTraceElement;
                            // Set the trace only once a frame is found; setStackTrace rejects null elements.
                            exception.setStackTrace(newStackTrace);
                            break;
                        }
                        // The Dubbo call goes through a dynamic proxy, so the frame's className is com.sun.proxy.$Proxy...
                        // (the prefix may differ with other proxy implementations or newer JDKs);
                        // the next frame is the real call site.
                        if (
                            Objects.equals(consumerStackTraceElement.getMethodName(), invocation.getMethodName())
                                && consumerStackTraceElement.getClassName().startsWith("com.sun.proxy")) {
                            meetProxyElement = true;
                        }

                    }

                    return;
                } catch (Throwable e) {
                    logger.warn("Fail to ConsumerExceptionFilter when execute " + RpcContext.getContext().getRemoteHost() + ". service: " + invoker.getInterface().getName() + ", method: " + invocation.getMethodName() + ", exception: " + e.getClass().getName() + ": " + e.getMessage(), e);
                    return;
                }
            }
        }

        @Override
        public void onError(Throwable e, Invoker<?> invoker, Invocation invocation) {
            logger.error("Got unchecked and undeclared exception which from " + RpcContext.getContext().getRemoteHost() + ". service: " + invoker.getInterface().getName() + ", method: " + invocation.getMethodName() + ", exception: " + e.getClass().getName() + ": " + e.getMessage(), e);

        }
    }
}
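
The registration mentioned above could look roughly like the fragment below. This is a sketch under assumptions: the package com.example.filter and the key consumerException are hypothetical, and the file is assumed to sit on the consumer's classpath at META-INF/dubbo/org.apache.dubbo.rpc.Filter. Because the class carries @Activate(group = "consumer"), registering it this way should be enough for it to be activated on consumer-side calls.

```properties
# src/main/resources/META-INF/dubbo/org.apache.dubbo.rpc.Filter
# key (arbitrary name) = fully qualified class name of the filter (package is hypothetical)
consumerException=com.example.filter.ConsumerExceptionFilter
```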

posted on 2020-08-13 18:20  mindSucker