Configuring a Hadoop Environment on Windows 10

A detailed guide to installing Hadoop and MapReduce, with pitfalls to avoid


Windows 10 development environment setup (my install paths are in parentheses; adjust as needed)
  1. Download Hadoop (D:\安装包\Download\hadoop)

    https://mirror.bit.edu.cn/apache/hadoop/common/hadoop-3.2.1/

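    Hadoop 3.2.1 has known pitfalls and I do not recommend installing it; the pitfalls are listed at the end of this post.

  2. Download the Windows binaries and winutils for Hadoop 3.2.1 (the version may differ from the one above; D:\安装包\Download\hadoop)

    https://github.com/selfgrowth/apache-hadoop-3.1.1-winutils (extract it and overwrite the bin directory inside hadoop).

  3. Copy hadoop.dll from bin to C:\Windows\System32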

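  4. Add environment variables

    HADOOP_HOME=D:\hadoop\hadoop-3.2.1

    Add %HADOOP_HOME%\bin to Path

  5. Error: Hadoop Error: JAVA_HOME is incorrectly set.

    Check whether the JAVA_HOME path contains a space, as in Program Files. If it does, wrap the part that contains the space in ASCII double quotes.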
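The setup steps above are easy to get slightly wrong, so a quick sanity check can help before starting Hadoop. The following is a minimal Python sketch, not part of Hadoop: `check_hadoop_env` is a hypothetical helper name, and the checks simply mirror steps 2 to 5 (winutils.exe and hadoop.dll in bin, no unquoted space in JAVA_HOME).

```python
import os


def check_hadoop_env(env, exists=os.path.exists):
    """Return a list of human-readable problems found in the environment.

    env    -- a mapping such as os.environ
    exists -- predicate used to test for files (injectable for testing)
    """
    problems = []

    hadoop_home = env.get("HADOOP_HOME")
    if not hadoop_home:
        problems.append("HADOOP_HOME is not set")
    else:
        # Steps 2 and 3 rely on these two files being in %HADOOP_HOME%\bin.
        for name in ("winutils.exe", "hadoop.dll"):
            candidate = hadoop_home.rstrip("\\/") + "\\bin\\" + name
            if not exists(candidate):
                problems.append("missing " + candidate)

    java_home = env.get("JAVA_HOME", "")
    if not java_home:
        problems.append("JAVA_HOME is not set")
    elif " " in java_home and '"' not in java_home:
        # Step 5: an unquoted space can trigger "JAVA_HOME is incorrectly set".
        problems.append("JAVA_HOME contains an unquoted space: " + java_home)

    return problems


if __name__ == "__main__":
    for problem in check_hadoop_env(os.environ) or ["no problems found"]:
        print(problem)
```

Passing the environment and the file-existence test in as arguments keeps the function easy to try out with made-up values before pointing it at the real machine.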


Configuring the dependencies in Maven's pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.bjfu.jichuang</groupId>
    <artifactId>my-wordcount</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>
    <description></description>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <java.version>1.8</java.version>
        <hadoop.version>3.2.1</hadoop.version>
        <log4j.version>1.2.17</log4j.version>
        <mockito.version>1.8.5</mockito.version>
        <junit.version>4.10</junit.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
        <dependency>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
            <version>${log4j.version}</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
            <version>1.7.5</version>
        </dependency>
        <dependency>
            <groupId>org.mockito</groupId>
            <artifactId>mockito-all</artifactId>
            <version>${mockito.version}</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>${junit.version}</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>2.3.2</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>
                            jar-with-dependencies
                        </descriptorRef>
                    </descriptorRefs>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>
                                single
                            </goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

Starting Hadoop

  1. Edit core-site.xml (D:\hadoop\hadoop-3.2.1\etc\hadoop)

    Create the tmp and name folders

    <configuration>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/D:/hadoop/hadoop-3.2.1/workplace/tmp</value>
        </property>
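        <!-- dfs.name.dir is the legacy key; Hadoop 3.x calls it dfs.namenode.name.dir -->
        <property>
            <name>dfs.name.dir</name>
            <value>/D:/hadoop/hadoop-3.2.1/workplace/name</value>
        </property>
        <!-- fs.default.name is the legacy key; Hadoop 3.x calls it fs.defaultFS -->
        <property>
            <name>fs.default.name</name>
            <value>hdfs://localhost:9000</value>
        </property>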
    </configuration>
    
  2. Edit hdfs-site.xml

    Create the datanode and namenode folders, then edit the corresponding entries

    <configuration>
        <!-- Set this to 1 because this is a single-node Hadoop -->
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
        <!-- dfs.data.dir is the legacy key; Hadoop 3.x calls it dfs.datanode.data.dir -->
        <property>
            <name>dfs.data.dir</name>
            <value>/D:/hadoop/hadoop-3.2.1/workplace/data</value>
        </property>
    </configuration>
    
  3. Edit mapred-site.xml

    <configuration>
        <property>
           <name>mapreduce.framework.name</name>
           <value>yarn</value>
        </property>
        <!-- mapred.job.tracker is a legacy Hadoop 1 JobTracker key; YARN does not use it -->
        <property>
           <name>mapred.job.tracker</name>
           <value>hdfs://localhost:9001</value>
        </property>
    </configuration>
    
  4. Edit yarn-site.xml

    <configuration>
        <property>
           <name>yarn.nodemanager.aux-services</name>
           <value>mapreduce_shuffle</value>
        </property>
        <property>
           <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
           <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
    </configuration>
    
  5. Edit the hadoop-env.cmd file in the "hadoop" directory (etc\hadoop)

    @rem set JAVA_HOME=%JAVA_HOME%
    
    @rem JDK installation path
    set JAVA_HOME=D:\java\jdk
    
  6. Format the namenode

    D:\安装包\Download\hadoop\hadoop-3.2.1\hadoop-3.2.1\bin>hadoop namenode -format
    
  7. Start Hadoop

    D:\安装包\Download\hadoop\hadoop-3.2.1\hadoop-3.2.1\sbin>start-all.cmd
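
The *-site.xml files edited in steps 1 to 4 all share the same configuration/property/name/value layout, so they can also be generated from a script instead of being typed by hand. A minimal Python sketch under that assumption; `build_site_xml` is a hypothetical helper name, not part of Hadoop, and the sample values are copied from the hdfs-site.xml step.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def build_site_xml(properties):
    """Render a dict of Hadoop properties as a *-site.xml document string."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    # Round-trip through minidom only to get indented, readable output.
    raw = ET.tostring(root, encoding="unicode")
    return minidom.parseString(raw).toprettyxml(indent="    ")


# Values copied from the hdfs-site.xml step.
hdfs_site = build_site_xml({
    "dfs.replication": 1,
    "dfs.data.dir": "/D:/hadoop/hadoop-3.2.1/workplace/data",
})
print(hdfs_site)
```

The returned string can be written to the matching file under etc\hadoop; the same function covers core-site.xml, mapred-site.xml and yarn-site.xml by passing a different dict.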
    

Once YARN is running, open http://localhost:8088/cluster/apps
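
If the page does not load, it helps to know whether anything is listening on the port at all. A minimal Python sketch for that check; `port_open` is a hypothetical helper name, 8088 comes from the URL above and 9000 from the hdfs://localhost:9000 value in core-site.xml.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports taken from this post's own configuration.
    for label, port in (("YARN ResourceManager UI", 8088), ("HDFS NameNode RPC", 9000)):
        state = "open" if port_open("localhost", port) else "closed"
        print(f"{label} (port {port}): {state}")
```

A closed port usually means the corresponding daemon never started, so the console window opened by start-all.cmd for that daemon is the next place to look.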



Pitfall 1: formatting the namenode fails with an error (a common problem in 3.2.1)

https://kontext.tech/column/hadoop/377/latest-hadoop-321-installation-on-windows-10-step-by-step-guide

https://www.cnblogs.com/yifengjianbai/p/8258898.html

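Pitfall 2: http://localhost:50070/ cannot be reached

Hadoop 3.x moved the default NameNode web UI port from 50070 to 9870, so try http://localhost:9870/ instead.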

Pitfall 3: the nodemanager fails to start when starting YARN

Failed to setup local dir D:/hadoop/tmp/nm-local-dir, which was marked as good.

This is an administrator-privilege problem; running start-yarn.cmd as administrator fixes it.
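
To confirm whether the current console really is elevated before launching the script, a small check can be used. This is a minimal Python sketch, not something Hadoop provides: `is_elevated` is a hypothetical helper name, and on Windows it relies on the shell32 `IsUserAnAdmin` call through ctypes.

```python
import ctypes
import os


def is_elevated():
    """Best-effort check for administrator (Windows) or root (elsewhere)."""
    if os.name == "nt":
        try:
            return bool(ctypes.windll.shell32.IsUserAnAdmin())
        except (AttributeError, OSError):
            # If the call is unavailable, assume the process is not elevated.
            return False
    return os.geteuid() == 0


if __name__ == "__main__":
    print("elevated" if is_elevated() else "not elevated: reopen the console as administrator")
```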

Pitfall 4: the web UI on port 8088 does not show the jobs run on YARN

Add the following to $HADOOP_HOME/etc/hadoop/mapred-site.xml:

<property>
     <name>mapreduce.framework.name</name>
     <value>yarn</value>
</property>
<property>
     <name>mapreduce.jobhistory.address</name>
     <value>master:10020</value>
</property>
<property>
     <name>mapreduce.jobhistory.webapp.address</name>
     <value>master:19888</value>
</property>

Pitfall 5: a Hadoop project fails with a No such file or directory error

Running the IDE as administrator fixes it.

posted @ 2020-05-09 20:40  火车不是推的