Setting Up an Environment for Modifying Spark Source Code

Problems encountered during the process:

  • Compiling the Scala project with Maven fails with java.lang.NoClassDefFoundError: scala/reflect/internal/Trees; see the Maven compile step (step 3) below for the fix.

1. Create a Maven project sparkdemo-2.3.3

The project creation steps themselves are omitted here.
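
As a sketch (not part of the original post), the skeleton can also be generated from the command line with Maven's quickstart archetype, using the coordinates from the pom.xml below; afterwards add a src/main/scala directory for the Scala sources:

mvn archetype:generate -DgroupId=org.example -DartifactId=sparkdemo-2.3.3 -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false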

2. Configure pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.example</groupId>
    <artifactId>sparkdemo-2.3.3</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <maven.compiler.source>8</maven.compiler.source>
        <maven.compiler.target>8</maven.compiler.target>
        <scala.binary.version>2.11</scala.binary.version>
        <spark.version>2.3.3</spark.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-launcher_${scala.binary.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-yarn_${scala.binary.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-network-common_${scala.binary.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-catalyst_${scala.binary.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_${scala.binary.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-tags_${scala.binary.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>2.11.8</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-compiler</artifactId>
            <version>2.11.8</version>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-reflect</artifactId>
            <version>2.11.8</version>
        </dependency>
        <dependency>
            <groupId>org.json4s</groupId>
            <artifactId>json4s-ast_2.11</artifactId>
            <version>3.5.3</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-yarn-server-nodemanager</artifactId>
            <version>2.6.5</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.4.6</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <scalaVersion>2.11.8</scalaVersion>
                </configuration>
            </plugin>
        </plugins>
    </build>

</project>

Note: the scala-maven-plugin must be included, otherwise Maven cannot compile the Scala code.

3. Add Scala code and compile
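
Any Scala source placed under src/main/scala will do for this step; as an illustration (not from the original post), a minimal file such as the following can be used to verify the toolchain:

package org.example

// Minimal Scala object used only to check that the scala-maven-plugin compiles Scala sources.
object HelloScala {
  def main(args: Array[String]): Unit = {
    println("Scala toolchain OK, compiler " + scala.util.Properties.versionString)
  }
}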

If the build fails with an error like the following:

[INFO] 
[INFO] --- scala-maven-plugin:3.4.6:compile (default) @ sparkdemo-2.3.3 ---
[INFO] D:\07_code\java\workspace\sparkdemo-2.3.3\src\main\java:-1: info: compiling
[INFO] D:\07_code\java\workspace\sparkdemo-2.3.3\src\main\scala:-1: info: compiling
[INFO] Compiling 2 source files to D:\07_code\java\workspace\sparkdemo-2.3.3\target\classes at 1655537041073
[ERROR] java.lang.NoClassDefFoundError: scala/reflect/internal/Trees
[INFO] 	at java.lang.Class.getDeclaredMethods0(Native Method)
[INFO] 	at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
[INFO] 	at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
[INFO] 	at java.lang.Class.getMethod0(Class.java:3018)
[INFO] 	at java.lang.Class.getMethod(Class.java:1784)
[INFO] 	at scala_maven_executions.MainHelper.runMain(MainHelper.java:155)
[INFO] 	at scala_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  3.687 s
[INFO] Finished at: 2022-06-18T15:24:01+08:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.4.6:compile (default) on project sparkdemo-2.3.3: wrap: org.apache.commons.exec.ExecuteException: Process exited with an error: -10000 (Exit value: -10000) -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

The compiler process started by scala-maven-plugin apparently cannot find scala-reflect on its classpath; pointing it at a local Scala installation works around the problem. Modify the compile command and add the scala.home parameter:

scala:compile compile -Dscala.home=D:\06_devptools\scala-2.11.8 -f pom.xml

The build then succeeds.

4. Find the class to modify and copy its source

For example, to add logging to ExecutorRunnable in order to troubleshoot container startup: use Ctrl+N (IntelliJ IDEA's class search) to locate the class and download its sources, create the package org.apache.spark.deploy.yarn under the scala source directory, and copy the source file into that package. After modifying it, run the compile command; the corresponding .class files appear under target and can be used to replace the original classes in the Spark jar.
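
A small sketch (not part of the original post) that prints which jar a class was actually loaded from, which helps confirm that the class replaced in the jar is the one being picked up; ExecutorRunnable is used here only as an example:

package org.example

// Prints the location (jar or classes directory) that a class was loaded from.
object WhereIsClass {
  def main(args: Array[String]): Unit = {
    val cls = Class.forName("org.apache.spark.deploy.yarn.ExecutorRunnable")
    val location = Option(cls.getProtectionDomain.getCodeSource).map(_.getLocation)
    println(s"${cls.getName} loaded from: ${location.getOrElse("unknown (no code source)")}")
  }
}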
