[Flink Series 19] Resolving a Hadoop Dependency Conflict in Flink Jobs: NoSuchMethodError

Problem

Submitting a Flink job fails immediately with:

java.lang.NoSuchMethodError: org.apache.hadoop.tracing.TraceUtils.wrapHadoopConf(Ljava/lang/String;Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/htrace/core/HTraceConfiguration;
	at org.apache.hadoop.fs.FsTracer.get(FsTracer.java:42)
	at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:689)
	at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:673)
	at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:155)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
	at org.apache.hadoop.fs.viewfs.ChRootedFileSystem.<init>(ChRootedFileSystem.java:103)
	at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.getTargetFileSystem(ViewFileSystem.java:173)
	at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.getTargetFileSystem(ViewFileSystem.java:167)
	at org.apache.hadoop.fs.viewfs.InodeTree.createLink(InodeTree.java:261)
	at org.apache.hadoop.fs.viewfs.InodeTree.<init>(InodeTree.java:333)
	at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.<init>(ViewFileSystem.java:167)
	at org.apache.hadoop.fs.viewfs.ViewFileSystem.initialize(ViewFileSystem.java:167)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:172)
	at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:770)
	at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:593)
	at org.apache.flink.yarn.YarnClusterDescriptor.deployApplicationCluster(YarnClusterDescriptor.java:458)
	at org.apache.flink.client.deployment.application.cli.ApplicationClusterDeployer.run(ApplicationClusterDeployer.java:67)
	at org.apache.flink.client.cli.CliFrontend.runApplication(CliFrontend.java:213)
	at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1057)
	at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
	at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
	at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)

Suspect library

  • hbase-shaded-client-1.4.3.jar
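
To confirm which jar actually supplies the conflicting class, you can ask the JVM where it loaded the class from. A minimal diagnostic sketch (the class name WhichJar is ours, not part of Flink or Hadoop); run it with the same classpath the Flink client uses:

// Prints the jar that the running JVM resolves
// org.apache.hadoop.tracing.TraceUtils from.
public class WhichJar {
    public static void main(String[] args) throws Exception {
        Class<?> clazz = Class.forName("org.apache.hadoop.tracing.TraceUtils");
        // getCodeSource() can be null for bootstrap classes; for a class
        // loaded from a jar it points at that jar on disk.
        System.out.println(clazz.getProtectionDomain().getCodeSource().getLocation());
    }
}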

Analysis

Looking at the hadoop-common 2.7.x source:

import org.apache.htrace.HTraceConfiguration;

/**
 * This class provides utility functions for tracing.
 */
@InterfaceAudience.Private
public class TraceUtils {
  private static List<ConfigurationPair> EMPTY = Collections.emptyList();

  public static HTraceConfiguration wrapHadoopConf(final String prefix,
        final Configuration conf) {
    return wrapHadoopConf(prefix, conf, EMPTY);
  }

Compare what the error is looking for with what the loaded class actually provides:

The NoSuchMethodError descriptor asks for:

    wrapHadoopConf(Ljava/lang/String;Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/htrace/core/HTraceConfiguration;
    return type: org.apache.htrace.core.HTraceConfiguration

TraceUtils in 2.7.x actually has:

    public static HTraceConfiguration wrapHadoopConf(final String prefix, final Configuration conf)
    return type: org.apache.htrace.HTraceConfiguration

Unless you look closely you cannot tell what is wrong: the method is not missing at all; the name and parameters match exactly, and only the package of the return type differs.
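
A reflective check makes the mismatch visible without reading bytecode. This is a sketch (the class name CheckTraceUtils is ours); reflection matches methods by name and parameter types only, so it finds the method that the linked call site could not, and prints the return type of whichever variant the classloader resolved:

import java.lang.reflect.Method;

import org.apache.hadoop.conf.Configuration;

public class CheckTraceUtils {
    public static void main(String[] args) throws Exception {
        Class<?> clazz = Class.forName("org.apache.hadoop.tracing.TraceUtils");
        // Look the method up by name + parameter types only.
        Method m = clazz.getMethod("wrapHadoopConf", String.class, Configuration.class);
        // On 2.7.x this prints org.apache.htrace.HTraceConfiguration;
        // on 2.6.0-cdh5.12.1 it prints org.apache.htrace.core.HTraceConfiguration.
        System.out.println(m.getReturnType().getName());
    }
}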

Differences between Hadoop distributions:

  • Apache Hadoop Common: 2.6.x does not have the TraceUtils class; it only appears in 2.7.x.
  • Apache Hadoop Common: 2.6.0-CDH5.12.1 does have the class.

Looking at TraceUtils in 2.6.0-CDH5.12.1:

import org.apache.htrace.core.HTraceConfiguration;

Looking at TraceUtils in 2.7.x:

import org.apache.htrace.HTraceConfiguration;

So the failure: FsTracer from 2.6.0-cdh5.12.1 was compiled against its own hadoop-common, whose TraceUtils.wrapHadoopConf returns org.apache.htrace.core.HTraceConfiguration, but at runtime the classloader resolved the 2.7.x TraceUtils, whose method returns org.apache.htrace.HTraceConfiguration. Because the JVM links methods by their full descriptor, return type included, the call fails with NoSuchMethodError even though a method with the same Java signature exists.

Solutions

Approach 1: start from the jars and exclude the dependencies by hand

  • Option 1: remove the 2.7.x hadoop-common, or whichever library shades it in.
  • Option 2: add the 2.7.x hadoop-hdfs (or a library that shades it), which brings in the matching 2.7.x hadoop-common.
  • Option 3: stop using hbase-shaded-client; it bundles Hadoop dependencies internally and easily conflicts with the cluster. Switch to hbase-client instead, as the sketch below shows.
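
For option 3, a minimal Maven sketch (the version number is a placeholder; wildcard exclusions require Maven 3.2.1+). The point is to stop shipping a second copy of Hadoop and let the classes already on the cluster classpath win:

    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-client</artifactId>
        <version>1.4.3</version>
        <exclusions>
            <!-- Use the cluster-provided Hadoop instead of a bundled copy. -->
            <exclusion>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>*</artifactId>
            </exclusion>
        </exclusions>
    </dependency>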

Approach 2: use the packaging tool to exclude the bytecode precisely

  • Use the maven-shade-plugin's filters to exclude all of the bundled hadoop and htrace classes, keeping only HBase's own:
<filters>
    <filter>
        <artifact>org.apache.hbase:hbase-shaded-client</artifact>
        <includes>
            <include>META-INF/**</include>
            <include>org/apache/hadoop/hbase/**</include>
            <include>hbase-default.xml</include>
        </includes>
    </filter>
    <filter>
        <!-- Do not copy the signatures in the META-INF folder.
        Otherwise, this might cause SecurityExceptions when using the JAR. -->
        <artifact>*:*</artifact>
        <excludes>
            <exclude>META-INF/*.SF</exclude>
            <exclude>META-INF/*.DSA</exclude>
            <exclude>META-INF/*.RSA</exclude>
        </excludes>
    </filter>
</filters>
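
With this filter, only META-INF, hbase-default.xml, and the classes under org/apache/hadoop/hbase/** are copied out of hbase-shaded-client into the fat jar. The hadoop-common and htrace classes bundled inside it, including the 2.7.x TraceUtils, never reach the jar, so the job only sees the cluster's own Hadoop classes.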

Self-Q&A

Why go to so much trouble to fix this?

  • Because users run with jars like flink-shaded-hadoop-2-uber-xxxx-xxx, and if such a jar bundles a hadoop-hdfs older than 2.8.2, it hits the HDFS-9276 bug.
  • Also, Flink stopped updating the flink-shaded-hadoop-2-uber jar as of 1.11+.

See the previous article:
[Flink Series 18] Roundup of fixes for the HDFS_DELEGATION_TOKEN expiration problem

posted @ 2023-09-18 17:56 一杯半盏