Deploying a Hadoop 2.7.3 Cluster on CentOS 6.5/7.2

Operating system: CentOS 6.5 / 7.2
Hadoop version:   hadoop-2.7.3
JDK version:      jdk-7u79-linux-x64.gz

Host list:
master  ip: 192.168.0.251
slave   ip: 192.168.0.253

Set up the hosts file; it must be identical on both hosts:

[root@master ~]# cat /etc/hosts
192.168.0.251   master
192.168.0.253   slave
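Before going further, it is worth confirming that each hostname actually resolves (a quick check, not in the original steps; run it from both hosts):

[root@master ~]# ping -c 1 slave
[root@slave ~]# ping -c 1 master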

Installing and Configuring Passwordless SSH Login

Run the following on master and on slave respectively:

[root@master ~]# ssh-keygen -t rsa
(press Enter at every prompt until it finishes)
[root@master ~]# ssh-copy-id -i slave
[root@master ~]# ssh-copy-id -i master
[root@slave ~]# ssh-copy-id -i slave
[root@slave ~]# ssh-copy-id -i master
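As a quick sanity check (using the hostnames from /etc/hosts above), each host should now reach the other without a password prompt; the commands below should print the remote hostname immediately:

[root@master ~]# ssh slave hostname
[root@slave ~]# ssh master hostname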

Installing and Uninstalling the JDK

Perform the same steps on both hosts.

Uninstall the existing JDK:

# Check the JDK packages installed by the OS by default
rpm -qa | grep jdk

# If any are installed, remove them
rpm -e --nodeps java-1.7.0-openjdk-1.7.0.45-2.4.3.3.el6.x86_64
rpm -e --nodeps java-1.6.0-openjdk-1.6.0.0-1.66.1.13.0.el6.x86_64

Install the JDK. Create a java directory under /home, download the JDK there, and extract it into /home/java:

cd /home
mkdir java
cd java
tar -zxvf jdk-7u79-linux-x64.gz

Edit /etc/profile (vim /etc/profile) and append the following at the end:

export JAVA_HOME=/home/java/jdk1.7.0_79
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin

Apply /etc/profile without rebooting the operating system:

source /etc/profile

Check the Java installation:

[root@master java]# java -version
java version "1.7.0_79"
Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
[root@master java]#
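Beyond java -version, it can help to confirm that the JAVA_HOME variable itself took effect in the current shell, since Hadoop's env scripts will rely on it later (a small extra check, not in the original post):

[root@master java]# echo $JAVA_HOME
/home/java/jdk1.7.0_79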

Installing Hadoop 2.7.3 on the Master Host

Download Hadoop 2.7.3:

mkdir /usr/hadoop
cd /usr/hadoop
wget http://apache.fayea.com/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz

Extract hadoop-2.7.3.tar.gz in /usr/hadoop:

tar -zxvf hadoop-2.7.3.tar.gz

Create the name, data, and tmp directories:

[root@master hadoop]# pwd
/usr/hadoop
[root@master hadoop]# mkdir -p dfs/name
[root@master hadoop]# mkdir -p dfs/data
[root@master hadoop]# mkdir tmp

Configure the Hadoop environment variables by appending to /etc/profile on both hosts (vim /etc/profile), adding the Hadoop sbin and bin directories to PATH:

export PATH=$PATH:/usr/hadoop/hadoop-2.7.3/sbin:/usr/hadoop/hadoop-2.7.3/bin

Then run source /etc/profile to make the change take effect.
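With PATH updated, the Hadoop binaries should resolve from any directory; hadoop version is a quick smoke test (it also confirms the JDK from the previous step is usable):

[root@master ~]# hadoop version
Hadoop 2.7.3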

Editing the Environment Configuration Files

All of the following operations take place in /usr/hadoop/hadoop-2.7.3 (cd /usr/hadoop/hadoop-2.7.3).

Edit etc/hadoop/hadoop-env.sh, yarn-env.sh, and mapred-env.sh, adding a JAVA_HOME entry to each:

# The java implementation to use.
# export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/home/java/jdk1.7.0_79
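If you prefer to script the edit instead of opening each file by hand, a minimal sketch like the following appends the same line to all three files (it adds a new line rather than replacing the commented-out default; verify the result before moving on):

cd /usr/hadoop/hadoop-2.7.3/etc/hadoop
for f in hadoop-env.sh yarn-env.sh mapred-env.sh; do
    echo 'export JAVA_HOME=/home/java/jdk1.7.0_79' >> "$f"
done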

Editing the Hadoop Configuration Files

Edit etc/hadoop/core-site.xml:

<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/usr/hadoop/tmp</value>
    <description>Abase for other temporary directories.</description>
  </property>
</configuration>

Edit etc/hadoop/hdfs-site.xml:

<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>

Edit etc/hadoop/yarn-site.xml:

<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>master:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:8088</value>
  </property>
</configuration>

Edit etc/hadoop/mapred-site.xml. This file does not exist by default; copy it from the template that ships with Hadoop 2.7.3:

cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml

<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
  </property>
</configuration>

Edit etc/hadoop/slaves and add the slave host:

[root@master hadoop]# cat slaves
slave
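Before copying anything to the slave, you can spot-check that Hadoop actually picks up a value from these files; hdfs getconf is part of the stock Hadoop 2.x CLI:

[root@master hadoop-2.7.3]# hdfs getconf -confKey fs.defaultFS
hdfs://master:9000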

Copying Hadoop to the Slave Host

Copy the /usr/hadoop/hadoop-2.7.3 tree to the same path on the slave node, so that the slave's environment and configuration match the master's:

scp -r /usr/hadoop root@slave:/usr/
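Since /etc/profile and /etc/hosts were set up separately on each host, the scp above only needs to carry the Hadoop tree itself; a quick check that it landed intact (using the passwordless SSH configured earlier):

[root@master ~]# ssh slave ls /usr/hadoop/hadoop-2.7.3/etc/hadoop/core-site.xml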

Formatting the Cluster

Run on the master:

[root@master hadoop]# hdfs namenode -format

If the output contains the word "successfully", formatting succeeded. The message is easy to overlook, so read the output carefully.
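Because the success line (in Hadoop 2.x it reads "... has been successfully formatted.") is buried in the log output, one way to make it stand out is to filter for it (a convenience sketch, not in the original post):

[root@master hadoop]# hdfs namenode -format 2>&1 | grep -i 'successfully formatted'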

Starting the Cluster

Start the HDFS and YARN services:

[root@master hadoop-2.7.3]# pwd
/usr/hadoop/hadoop-2.7.3
[root@master hadoop-2.7.3]# sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
master: starting namenode, logging to /usr/hadoop/hadoop-2.7.3/logs/hadoop-root-namenode-master.out
slave: starting datanode, logging to /usr/hadoop/hadoop-2.7.3/logs/hadoop-root-datanode-slave.out
Starting secondary namenodes [master]
master: starting secondarynamenode, logging to /usr/hadoop/hadoop-2.7.3/logs/hadoop-root-secondarynamenode-master.out
starting yarn daemons
starting resourcemanager, logging to /usr/hadoop/hadoop-2.7.3/logs/yarn-root-resourcemanager-master.out
slave: starting nodemanager, logging to /usr/hadoop/hadoop-2.7.3/logs/yarn-root-nodemanager-slave.out

This command starts the NameNode-related processes on master and the DataNode-related processes on slave.
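As the first line of the output notes, start-all.sh is deprecated; the equivalent two-step start it recommends is:

[root@master hadoop-2.7.3]# sbin/start-dfs.sh
[root@master hadoop-2.7.3]# sbin/start-yarn.sh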

Verifying the Cluster

Run jps on each host to check the running services:

[root@master hadoop-2.7.3]# jps
24830 SecondaryNameNode
25252 Jps
24635 NameNode
24993 ResourceManager
[root@master hadoop-2.7.3]#

[root@slave hadoop-2.7.3]# jps
15844 Jps
15596 DataNode
15713 NodeManager
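Beyond jps, the standard HDFS admin report confirms that the DataNode on slave has actually registered with the NameNode; with this two-node layout the report should show one live datanode:

[root@master hadoop-2.7.3]# hdfs dfsadmin -report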

Testing Hadoop

Create a folder in the user file system:

[root@master hadoop-2.7.3]# hdfs dfs -mkdir /input
[root@master hadoop-2.7.3]# hdfs dfs -ls /
Found 2 items
drwxr-xr-x  - root supergroup   0 2016-12-08 16:32 /input
drwx------  - root supergroup   0 2016-12-06 14:06 /tmp

Copy local files into /input on the distributed file system:

[root@master hadoop-2.7.3]# hdfs dfs -put etc/hadoop/* /input

Run wordcount:

[root@master hadoop-2.7.3]# hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /input /output
16/12/08 16:35:06 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.0.251:8032
16/12/08 16:35:06 INFO input.FileInputFormat: Total input paths to process : 31
16/12/08 16:35:06 INFO mapreduce.JobSubmitter: number of splits:31
16/12/08 16:35:06 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1481185761339_0001
16/12/08 16:35:07 INFO impl.YarnClientImpl: Submitted application application_1481185761339_0001
16/12/08 16:35:07 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1481185761339_0001/
16/12/08 16:35:07 INFO mapreduce.Job: Running job: job_1481185761339_0001
16/12/08 16:35:13 INFO mapreduce.Job: Job job_1481185761339_0001 running in uber mode : false
16/12/08 16:35:13 INFO mapreduce.Job:  map 0% reduce 0%
16/12/08 16:35:20 INFO mapreduce.Job:  map 19% reduce 0%
16/12/08 16:35:24 INFO mapreduce.Job:  map 23% reduce 0%
16/12/08 16:35:25 INFO mapreduce.Job:  map 26% reduce 0%
(output omitted)
16/12/08 16:35:44 INFO mapreduce.Job:  map 94% reduce 30%
16/12/08 16:35:45 INFO mapreduce.Job:  map 97% reduce 30%
16/12/08 16:35:46 INFO mapreduce.Job:  map 100% reduce 30%
16/12/08 16:35:47 INFO mapreduce.Job:  map 100% reduce 100%
(output omitted)
    CPU time spent (ms)=11580
    Physical memory (bytes) snapshot=8510996480
    Virtual memory (bytes) snapshot=28232007680
    Total committed heap usage (bytes)=6442450944
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=81346
    File Output Format Counters
        Bytes Written=37663
[root@master hadoop-2.7.3]#

List the job output on the distributed file system:

[root@master hadoop-2.7.3]# hdfs dfs -ls /output
Found 2 items
-rw-r--r-- 1 root supergroup     0 2016-12-08 16:35 /output/_SUCCESS
-rw-r--r-- 1 root supergroup 37663 2016-12-08 16:35 /output/part-r-00000

View the result:

[root@master hadoop-2.7.3]# hdfs dfs -cat /output/part-r-00000
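One caveat if you want to re-run the test: MapReduce refuses to start a job whose output directory already exists, so remove /output before submitting wordcount again:

[root@master hadoop-2.7.3]# hdfs dfs -rm -r /output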

Web UI Access

HDFS NameNode web UI:        http://192.168.0.251:50070/
YARN ResourceManager web UI: http://192.168.0.251:8088
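If these pages are unreachable from another machine, the CentOS firewall (iptables on 6.5, firewalld on 7.2) may be blocking ports 50070 and 8088. A local reachability check from the master itself (assuming curl is installed; 200 means the UI is up):

[root@master ~]# curl -s -o /dev/null -w '%{http_code}\n' http://192.168.0.251:50070/
200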
