Hadoop 2.7.x: wordcount hangs at INFO mapreduce.Job: Running job: job_1469603958907_0002

I. The Problem

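After setting up a fully distributed Hadoop cluster, I ran the wordcount example as a test. Every run got stuck at the "Running job" line, and the program then appeared to hang indefinitely.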

The command used to run wordcount: [hadoop@master hadoop-2.7.2]$ /opt/module/hadoop-2.7.2/bin/hadoop jar /opt/module/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /wc/mytemp/123 /wc/mytemp/output

Screenshot of the symptom (the job hangs at the part underlined in red):

II. The Solution

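1. Being a beginner, I searched through many tutorials online. The suggestions mostly came down to the following:

(1) Some said the firewall or SELinux had not been turned off. I checked each node one by one and found that both were already disabled.

(2) Some said that redundant entries such as 127.0.0.1 in /etc/hosts had not been deleted or commented out.

(3) Some said to check the logs (what? how would a beginner know which log?), and the advice ended there.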
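Suggestion (2) can be checked mechanically. The following is a minimal sketch, not something from the tutorials mentioned above: the function name loopback_entries_for is hypothetical, and it assumes the usual hosts-file layout of an address followed by one or more names. It prints every active line that maps the given hostname to a loopback address.

```shell
# Print active /etc/hosts lines that bind a hostname to a loopback address.
# $1 = hostname to look for, $2 = hosts file (defaults to /etc/hosts)
loopback_entries_for() {
  awk -v host="$1" '
    /^[ \t]*#/ { next }                  # skip commented-out lines
    $1 ~ /^127\./ || $1 == "::1" {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break             # stop at an inline comment
        if ($i == host) { print; break }
      }
    }
  ' "${2:-/etc/hosts}"
}
```

Running loopback_entries_for master on each node should print nothing if the hostname resolves only to the node's real address.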

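2. The fix:

Beginners always burn a lot of time on problems like this, so half a day disappeared just like that, which hardly feels fair to the company paying my salary. Here is the fix, step by step.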

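(1) Step one: the job hangs at Running job, so the problem is on the MapReduce side. In Hadoop 2.x, MapReduce jobs are managed by YARN, so the place to look is the NodeManager log yarn-hadoop-nodemanager-slave01.log, located under ${HADOOP_HOME}/logs on the slave node.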

Open the log in a terminal on the slave node (for example with tail or less): yarn-hadoop-nodemanager-slave01.log

The log contains the following:

2016-07-27 03:30:51,041 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-07-27 03:30:52,043 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-07-27 03:30:53,046 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-07-27 03:30:54,047 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-07-27 03:30:55,048 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-07-27 03:30:56,050 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-07-27 03:31:27,053 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
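In a long log these retry lines are easy to miss. As a rough sketch (the function name retry_targets is hypothetical, and it assumes the exact message format shown in the excerpt above), the lines can be reduced to a list of the addresses being retried and how often each appears:

```shell
# Read a Hadoop log on stdin and print "address count" for every server
# the IPC client reported retrying.
retry_targets() {
  grep 'Retrying connect to server:' |
    sed 's/.*Retrying connect to server: \([^ ]*\)\. Already tried.*/\1/' |
    sort | uniq -c |
    awk '{ print $2, $1 }'    # put the address first, the count second
}
```

For example, retry_targets < ${HADOOP_HOME}/logs/yarn-hadoop-nodemanager-slave01.log. An address of 0.0.0.0 in the result means the node is falling back to a default instead of a real ResourceManager host.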

(2) A rough explanation of what this means

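The Client keeps trying to connect to 0.0.0.0/0.0.0.0:8031 and fails every time. Port 8031 is the ResourceManager's resource-tracker port, and 0.0.0.0 is the default value of yarn.resourcemanager.hostname, so the NodeManager has never been told where the ResourceManager lives. That points to a configuration problem. I searched Baidu for this message, tried several suggestions, and finally solved it by following a blog post titled "Fixing Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is...". Add the following properties to yarn-site.xml, inside the <configuration> element. (Note: I changed this file on master, slave01 and slave02. I am not sure whether changing only master would be enough, but since the failing connection is made by the NodeManager on the slaves, my initial judgment is that every node should be changed.)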

<property>
  <name>yarn.resourcemanager.address</name>
  <value>master:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>master:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>master:8031</value>
</property>
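Because the same file has to be edited on several nodes, it helps to confirm that each copy really contains the new values. The helper below is only a sketch: yarn_prop is a hypothetical name, and it assumes each <name> and <value> element sits on a single line, as in the snippet above. It is a line-based lookup, not a real XML parser.

```shell
# Print the <value> of a named property from a Hadoop *-site.xml file.
# $1 = property name, $2 = path to the XML file
yarn_prop() {
  awk -v want="$1" '
    /<name>/ {
      name = $0
      sub(/.*<name>[ \t]*/, "", name)       # strip everything up to the name
      sub(/[ \t]*<\/name>.*/, "", name)     # strip the closing tag and beyond
    }
    /<value>/ && name == want {
      val = $0
      sub(/.*<value>[ \t]*/, "", val)
      sub(/[ \t]*<\/value>.*/, "", val)
      print val
      exit
    }
  ' "$2"
}
```

On this installation the file would presumably be /opt/module/hadoop-2.7.2/etc/hadoop/yarn-site.xml, so yarn_prop yarn.resourcemanager.resource-tracker.address /opt/module/hadoop-2.7.2/etc/hadoop/yarn-site.xml should print master:8031 on every node once the change is in place.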

The file after the insertion is shown in the screenshot:

(3) Problem solved

Running the wordcount program again succeeds:

[hadoop@master hadoop-2.7.2]$ /opt/module/hadoop-2.7.2/bin/hadoop jar /opt/module/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /wc/mytemp/123 /wc/mytemp/output
16/07/27 03:33:29 INFO client.RMProxy: Connecting to ResourceManager at master/172.16.95.100:8032
16/07/27 03:33:31 INFO input.FileInputFormat: Total input paths to process : 1
16/07/27 03:33:31 INFO mapreduce.JobSubmitter: number of splits:1
16/07/27 03:33:32 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1469604761767_0001
16/07/27 03:33:32 INFO impl.YarnClientImpl: Submitted application application_1469604761767_0001
16/07/27 03:33:32 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1469604761767_0001/
16/07/27 03:33:32 INFO mapreduce.Job: Running job: job_1469604761767_0001
16/07/27 03:33:47 INFO mapreduce.Job: Job job_1469604761767_0001 running in uber mode : false
16/07/27 03:33:47 INFO mapreduce.Job: map 0% reduce 0%
16/07/27 03:33:55 INFO mapreduce.Job: map 100% reduce 0%
16/07/27 03:34:08 INFO mapreduce.Job: map 100% reduce 100%
16/07/27 03:34:08 INFO mapreduce.Job: Job job_1469604761767_0001 completed successfully
16/07/27 03:34:08 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=1291
                FILE: Number of bytes written=237185
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=1498
                HDFS: Number of bytes written=1035
                HDFS: Number of read operations=6
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Launched map tasks=1
                Launched reduce tasks=1
                Data-local map tasks=1
                Total time spent by all maps in occupied slots (ms)=6738
                Total time spent by all reduces in occupied slots (ms)=9139
                Total time spent by all map tasks (ms)=6738
                Total time spent by all reduce tasks (ms)=9139
                Total vcore-milliseconds taken by all map tasks=6738

The word-count results can be viewed with the following command:
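A minimal sketch of such a command, assuming the output directory /wc/mytemp/output from the run above and Hadoop's usual part-r-* naming for reducer output files. The helper name show_wordcount_output is hypothetical, and this is one common way to read the results, not necessarily the command used originally.

```shell
# Print the reducer output files of a finished MapReduce job.
# $1 = HDFS output directory of the job
show_wordcount_output() {
  # The glob is quoted so that HDFS, not the local shell, expands it.
  hdfs dfs -cat "$1/part-r-*"
}
```

For the job above this would be called as show_wordcount_output /wc/mytemp/output.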

Posted 2016-07-27 16:09 by YouxiBug
