Deploying a Hadoop 1.1.2 Cluster on Ubuntu 14.04

Environment:

OS: Ubuntu 14.04, Hadoop 1.1.2
Master: 192.168.0.221
Slave: 192.168.0.222
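The steps below address the two machines by the hostnames master and slave (for example in ssh-copy-id and scp), so both hosts presumably map those names to the IPs above. A typical /etc/hosts setup would look like this (my assumption; the original does not show this step):

192.168.0.221   master
192.168.0.222   slave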

Creating the hadoop Group and User on Ubuntu

Perform these steps on both hosts.

1. Add the hadoop user to the system:

zhang@master:~$ sudo addgroup hadoop
zhang@master:~$ sudo adduser --ingroup hadoop hadoop

2. So far this only creates a user named hadoop with no administrator privileges. To grant the hadoop user sudo rights, open the /etc/sudoers file:

zhang@master:~$ sudo gedit /etc/sudoers

Below the line

root    ALL=(ALL:ALL)   ALL

add:

hadoop  ALL=(ALL:ALL)   ALL
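A safer way to make this change (a general suggestion, not a step from the original walkthrough) is visudo, which validates the sudoers syntax before saving and so cannot leave a broken file behind:

zhang@master:~$ sudo visudo    # opens /etc/sudoers in the default editor and checks it on save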

Configuring SSH, Installing Java and Hadoop

1. Install SSH on both hosts.

1) Hadoop communicates over SSH, so install it first. Note that I switched from the zhang user to hadoop beforehand:

zhang@master:~$ su - hadoop
Password:
hadoop@master:~$ sudo apt-get install openssh-server

Since my machine already had the latest version of SSH installed, this step actually did nothing.

2) With SSH installed, start the service, then confirm it is running:

hadoop@master:~$ sudo /etc/init.d/ssh start
hadoop@master:~$ ps -e | grep ssh
  759 ?        00:00:00 sshd
 1691 ?        00:00:00 ssh-agent
12447 ?        00:00:00 ssh
12448 ?        00:00:00 sshd
12587 ?        00:00:00 sshd
hadoop@master:~$

3) As a secure communication protocol, SSH normally requires a password (keys can be generated with either RSA or DSA; RSA is the default). We want passwordless login, so generate a key pair on each host and copy the public key to both hosts:

hadoop@master:~$ ssh-keygen -t rsa -P ""
hadoop@master:~$ ssh-copy-id -i slave
hadoop@master:~$ ssh-copy-id -i master
hadoop@slave:~$ ssh-keygen -t rsa -P ''
hadoop@slave:~$ ssh-copy-id -i slave
hadoop@slave:~$ ssh-copy-id -i master

Install Java

Do this on both hosts. The procedure is covered in my other article on deploying a Hadoop 2.7.3 cluster on CentOS 6.5. Java is installed under the hadoop user's home directory.

hadoop@master:~$ java -version
java version "1.7.0_79"
Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
hadoop@master:~$

Install hadoop-1.1.2 on the master

Extract Hadoop into /usr/local/hadoop:

hadoop@master:/usr/local$ sudo tar xzf hadoop-1.1.2.tar.gz
hadoop@master:/usr/local$ sudo mv hadoop-1.1.2 /usr/local/hadoop

To make sure every operation is performed as the hadoop user, set that user as the owner of the hadoop directory:

hadoop@master:/usr/local$ sudo chown -R hadoop:hadoop hadoop

Configure hadoop-env.sh

Log in as the hadoop user, go to /usr/local/hadoop, and open hadoop-env.sh in the conf directory. Find the line #export JAVA_HOME=..., remove the #, set it to the local JDK path, and add the following:

export JAVA_HOME=/home/hadoop/jdk1.7.0_79/
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:/usr/local/hadoop/bin

Then apply the environment variables with source:

hadoop@master:/usr/local/hadoop$ source /usr/local/hadoop/conf/hadoop-env.sh

The Hadoop version can now be displayed:

hadoop@master:/usr/local/hadoop$ hadoop version
Hadoop 1.1.2
hadoop@master:/usr/local/hadoop$
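Before moving on, it is worth confirming that the passwordless login actually works in both directions. A quick check (my own suggested verification, not part of the original text):

hadoop@master:~$ ssh slave hostname     # should print "slave" without prompting for a password
hadoop@slave:~$ ssh master hostname     # should print "master" without prompting for a password

If either command still asks for a password, re-run ssh-copy-id for that direction.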

Modifying the Hadoop Configuration

Three files need to be configured here: core-site.xml, hdfs-site.xml, and mapred-site.xml, all under /usr/local/hadoop/conf.

First create the working directories, then check the masters and slaves files:

hadoop@master:/usr/local/hadoop$ mkdir tmp data name
hadoop@master:/usr/local/hadoop/conf$ cat masters
192.168.0.221

This file holds the master node's IP, i.e. the namenode host. (Strictly speaking, in Hadoop 1.x conf/masters tells the start scripts where to launch the SecondaryNameNode; the NameNode itself runs on the host where start-all.sh is invoked.)

hadoop@master:/usr/local/hadoop/conf$ cat slaves
192.168.0.222

This file holds the slave node's IP, i.e. the datanode.

1. Edit the three files:

1) core-site.xml:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.0.221:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
</configuration>

2) hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/hadoop/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/local/hadoop/data</value>
  </property>
</configuration>

3) mapred-site.xml:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.0.221:9001</value>
  </property>
</configuration>
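A typo in any of these XML files will make the daemons fail at startup, so it can help to sanity-check them before continuing (an optional step I am adding here; it assumes xmllint from the libxml2-utils package is installed):

hadoop@master:/usr/local/hadoop/conf$ xmllint --noout core-site.xml hdfs-site.xml mapred-site.xml    # prints nothing when all three files are well-formed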

Copying Hadoop to the Slave Host

Copy the hadoop directory to the corresponding path on the slave host, and fix the directory ownership once the copy completes (see the sketch after this block):

hadoop@slave:/usr/local$ sudo mkdir hadoop
hadoop@master:/usr/local$ scp -r hadoop/* hadoop@slave:/usr/local/hadoop/
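The original text mentions adjusting the directory permissions after the copy but does not show the command; presumably it mirrors the chown performed on the master, since the directory on the slave was created by root via sudo:

hadoop@slave:/usr/local$ sudo chown -R hadoop:hadoop hadoop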

Starting the Hadoop Services

hadoop@master:/usr/local$ source hadoop/conf/hadoop-env.sh
hadoop@master:/usr/local$ cd hadoop/
hadoop@master:/usr/local/hadoop$ hadoop namenode -format

Seeing the messages below means the HDFS filesystem was formatted successfully:

16/12/09 10:54:40 INFO common.Storage: Storage directory /usr/local/hadoop/name has been successfully formatted.
16/12/09 10:54:40 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.0.221
************************************************************/
hadoop@master:/usr/local/hadoop$

Start Hadoop:

hadoop@master:/usr/local/hadoop$ bin/start-all.sh
starting namenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hadoop-namenode-master.out
192.168.0.222: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hadoop-datanode-slave.out
192.168.0.221: starting secondarynamenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hadoop-secondarynamenode-master.out
starting jobtracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-hadoop-jobtracker-master.out
192.168.0.222: starting tasktracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-hadoop-tasktracker-slave.out

If jps shows the processes below, the startup succeeded:

hadoop@master:/usr/local/hadoop$ jps
21199 NameNode
21514 Jps
18565 JobTracker
18490 SecondaryNameNode
hadoop@master:/usr/local/hadoop$

hadoop@slave:/usr/local$ jps
9743 TaskTracker
9618 DataNode
12956 Jps
hadoop@slave:/usr/local$

Checking the running status

Everything is configured and Hadoop is running, so the services can now be checked through the web interfaces Hadoop provides for monitoring cluster health:

http://192.168.0.221:50030/ - Hadoop administration interface (JobTracker)
http://192.168.0.221:50060/ - Hadoop TaskTracker status
http://192.168.0.221:50070/ - Hadoop DFS status

Testing WordCount

The distributed installation is now complete, so run the WordCount example that ships with Hadoop to get a feel for the MapReduce workflow. Note that the program runs against the distributed filesystem (DFS), and the files it creates also live there:

hadoop@master:/usr/local/hadoop$ hadoop dfs -mkdir input
hadoop@master:/usr/local/hadoop$

Copy the files under conf into input on DFS:

hadoop@master:/usr/local/hadoop$ hadoop dfs -copyFromLocal conf/* input

List the files:

hadoop@master:/usr/local/hadoop$ hadoop dfs -ls /user/hadoop/input
Found 17 items
-rw-r--r--   1 hadoop supergroup          0 2016-12-09 11:02 /user/hadoop/input/capacity-scheduler.xml
-rw-r--r--   1 hadoop supergroup          0 2016-12-09 11:02 /user/hadoop/input/configuration.xsl
-rw-r--r--   1 hadoop supergroup          0 2016-12-09 11:02 /user/hadoop/input/core-site.xml
-rw-r--r--   1 hadoop supergroup          0 2016-12-09 11:02 /user/hadoop/input/core-site.xml_bak
-rw-r--r--   1 hadoop supergroup          0 2016-12-09 11:02 /user/hadoop/input/fair-scheduler.xml
-rw-r--r--   1 hadoop supergroup          0 2016-12-09 11:02 /user/hadoop/input/hadoop-env.sh
-rw-r--r--   1 hadoop supergroup          0 2016-12-09 11:02 /user/hadoop/input/hadoop-metrics2.properties
-rw-r--r--   1 hadoop supergroup          0 2016-12-09 11:02 /user/hadoop/input/hadoop-policy.xml
-rw-r--r--   1 hadoop supergroup          0 2016-12-09 11:02 /user/hadoop/input/hdfs-site.xml
-rw-r--r--   1 hadoop supergroup          0 2016-12-09 11:02 /user/hadoop/input/log4j.properties
-rw-r--r--   1 hadoop supergroup          0 2016-12-09 11:02 /user/hadoop/input/mapred-queue-acls.xml
-rw-r--r--   1 hadoop supergroup          0 2016-12-09 11:02 /user/hadoop/input/mapred-site.xml
-rw-r--r--   1 hadoop supergroup          0 2016-12-09 11:02 /user/hadoop/input/masters
-rw-r--r--   1 hadoop supergroup          0 2016-12-09 11:02 /user/hadoop/input/slaves
-rw-r--r--   1 hadoop supergroup          0 2016-12-09 11:02 /user/hadoop/input/ssl-client.xml.example
-rw-r--r--   1 hadoop supergroup          0 2016-12-09 11:02 /user/hadoop/input/ssl-server.xml.example
-rw-r--r--   1 hadoop supergroup          0 2016-12-09 11:02 /user/hadoop/input/taskcontroller.cfg

Run WordCount:

hadoop@master:/usr/local/hadoop$ hadoop jar hadoop-examples-1.1.2.jar wordcount input output
16/12/09 11:25:12 INFO input.FileInputFormat: Total input paths to process : 16
16/12/09 11:25:12 INFO util.NativeCodeLoader: Loaded the native-hadoop library
16/12/09 11:25:12 WARN snappy.LoadSnappy: Snappy native library not loaded
16/12/09 11:25:12 INFO mapred.JobClient: Running job: job_201612091123_0001
16/12/09 11:25:13 INFO mapred.JobClient:  map 0% reduce 0%
16/12/09 11:25:34 INFO mapred.JobClient:  map 25% reduce 0%
16/12/09 11:25:39 INFO mapred.JobClient:  map 31% reduce 0%
16/12/09 11:25:40 INFO mapred.JobClient:  map 37% reduce 0%
16/12/09 11:25:44 INFO mapred.JobClient:  map 43% reduce 12%
16/12/09 11:25:45 INFO mapred.JobClient:  map 50% reduce 12%
16/12/09 11:25:50 INFO mapred.JobClient:  map 62% reduce 12%
16/12/09 11:25:53 INFO mapred.JobClient:  map 62% reduce 16%
16/12/09 11:26:11 INFO mapred.JobClient:  map 100% reduce 100%
16/12/09 11:26:13 INFO mapred.JobClient: Job complete: job_201612091123_0001
16/12/09 11:26:13 INFO mapred.JobClient: Counters: 29
16/12/09 11:26:13 INFO mapred.JobClient:   Job Counters
16/12/09 11:26:13 INFO mapred.JobClient:     Launched reduce tasks=1
16/12/09 11:26:13 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=95372
(remaining counters omitted)

Display the output:

hadoop@master:/usr/local/hadoop$ hadoop dfs -ls output/*
-rw-r--r--   1 hadoop supergroup          0 2016-12-09 11:26 /user/hadoop/output/_SUCCESS
drwxr-xr-x   - hadoop supergroup          0 2016-12-09 11:25 /user/hadoop/output/_logs/history
-rw-r--r--   1 hadoop supergroup      15826 2016-12-09 11:26 /user/hadoop/output/part-r-00000

When you are done with Hadoop, the daemons can be shut down with the stop-all.sh script.
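As a final check (my own suggested follow-up, using standard Hadoop 1.x commands rather than anything from the original run), the word counts can be read straight out of DFS, and the cluster state can be summarized before shutting down:

hadoop@master:/usr/local/hadoop$ hadoop dfs -cat output/part-r-00000 | head -n 20    # first 20 word/count pairs
hadoop@master:/usr/local/hadoop$ hadoop dfsadmin -report                             # capacity and live datanode summary
hadoop@master:/usr/local/hadoop$ bin/stop-all.sh                                     # stop all daemons on master and slave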
