Preparation
1. Change the hostname and IP address
vi /etc/sysconfig/network                        // change the hostname
or: vi /etc/hostname
vi /etc/sysconfig/network-scripts/ifcfg-ensXX    // change the IP address
2. Create a hadoop user and grant it root privileges
useradd hadoop     // create the hadoop user
vi /etc/sudoers    // grant root privileges
----------------------------------------------------
## Allow root to run any commands anywhere
root    ALL=(ALL)    ALL
hadoop  ALL=(ALL)    ALL
----------------------------------------------------
passwd hadoop    // set a password for the user
su - hadoop      // switch to the hadoop user
3. Create directories under /opt for the installation packages
sudo mkdir -p /opt/software /opt/module
sudo chown hadoop:hadoop /opt/software /opt/module    // change the owner and group of the directories
-------
Upload the packages to the software directory:
hadoop-2.7.2.tar.gz
jdk-8u131-linux-x64.tar.gz
4. Unpack the JDK and configure environment variables
tar -zxvf jdk-8u131-linux-x64.tar.gz -C /opt/module    // unpack the JDK into the module directory
sudo vi /etc/profile                                   // set JAVA_HOME and PATH
--------------------------------------------------------------------
##JAVA_HOME##
export JAVA_HOME=/opt/module/jdk1.8.0_131
export PATH=$PATH:$JAVA_HOME/bin
--------------------------------------------------------------------
source /etc/profile
java -version    // check the JDK version
echo $PATH       // check that PATH was updated
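The effect of the two export lines above can be sketched in a throwaway shell session. The JDK path is the one used throughout this guide; the self-check works whether or not that directory actually exists on your machine:

```shell
# Re-create the profile edits in the current shell, then self-check PATH.
# /opt/module/jdk1.8.0_131 is this guide's install path (an assumption here).
export JAVA_HOME=/opt/module/jdk1.8.0_131
export PATH=$PATH:$JAVA_HOME/bin

# Confirm PATH now contains the JDK bin directory:
case ":$PATH:" in
  *":$JAVA_HOME/bin:"*) echo "PATH ok" ;;
  *)                    echo "PATH missing $JAVA_HOME/bin" ;;
esac
```

The same check is useful after editing /etc/profile for real and running `source /etc/profile`.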
Install Hadoop
cd /opt/software                             // go to the directory holding the tarball
tar -zxvf hadoop-2.7.2.tar.gz -C ../module/  // unpack
sudo vi /etc/profile                         // set the global variables
------------------------------------------------------------
##HADOOP_HOME##
export HADOOP_HOME=/opt/module/hadoop-2.7.2
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
------------------------------------------------------------
source /etc/profile    // reload the profile
hadoop                 // test
vi /opt/module/hadoop-2.7.2/etc/hadoop/hadoop-env.sh    // set JAVA_HOME
---------------------------------------------------------
# The java implementation to use.
export JAVA_HOME=/opt/module/jdk1.8.0_131
---------------------------------------------------------
Run Hadoop in pseudo-distributed mode
Configure pseudo-distributed mode:
vi hadoop-2.7.2/etc/hadoop/core-site.xml
-----------------------------------------------------
<!-- NameNode address: hostname and port -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop1:8020</value>
</property>
-----------------------------------------------------
vi hadoop-2.7.2/etc/hadoop/hdfs-site.xml
-----------------------------------------------------
<!-- HDFS replication factor; the default is 3 -->
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
-----------------------------------------------------
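As a quick sanity check, the NameNode URI can be pulled back out of core-site.xml with `sed`. This sketch writes the fragment above to a throwaway file (/tmp/core-site-sample.xml is a made-up path) rather than touching the real config under hadoop-2.7.2/etc/hadoop/:

```shell
# Stand-in copy of the core-site.xml fragment configured above.
cat > /tmp/core-site-sample.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop1:8020</value>
  </property>
</configuration>
EOF

# Print the NameNode URI the cluster would use:
sed -n 's|.*<value>\(hdfs://[^<]*\)</value>.*|\1|p' /tmp/core-site-sample.xml
# prints: hdfs://hadoop1:8020
```

Pointing the same `sed` at the real file is a cheap way to confirm the hostname and port before formatting HDFS.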
hdfs namenode -format    // format HDFS; required before the first start
hadoop-2.7.2/sbin/hadoop-daemon.sh start namenode    // start the NameNode
hadoop-2.7.2/sbin/hadoop-daemon.sh start datanode    // start the DataNode
After startup, run jps to check that the daemons are up, or open the web UI at 192.168.1.100:50070.
View the logs: cat hadoop-2.7.2/logs/*.log
Run a Hadoop example in pseudo-distributed mode
Run a MapReduce program on HDFS
hadoop fs -<shell command> <HDFS path>    // general format of HDFS commands
hadoop fs -ls -R /                        // recursively list the root of the file system
hadoop fs -mkdir -p /user/zz/input
hadoop fs -put input/ /user/zz/input
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /user/zz/input/ /user/zz/output
hadoop fs -cat /user/zz/output/*
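What the wordcount job computes can be reproduced locally with standard Unix tools, which is handy for checking expected output before submitting to the cluster. The input file and its contents below are made up for the sketch:

```shell
# Local sketch of wordcount: split on whitespace, count each word.
printf 'hello world\nhello hadoop\n' > /tmp/wc_demo.txt

tr -s ' ' '\n' < /tmp/wc_demo.txt | sort | uniq -c | awk '{print $2 "\t" $1}'
# prints:
# hadoop  1
# hello   2
# world   1
```

The MapReduce job does the same thing at scale: the map phase emits (word, 1) pairs, the shuffle groups them by word (like `sort` here), and the reduce phase sums the counts (like `uniq -c`).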
Run a MapReduce program on YARN
vi /opt/module/hadoop-2.7.2/etc/hadoop/yarn-env.sh
-----------------------------------------------------------------
# some Java parameters
export JAVA_HOME=/opt/module/jdk1.8.0_131
-----------------------------------------------------------------
vi /opt/module/hadoop-2.7.2/etc/hadoop/mapred-env.sh    // set JAVA_HOME here the same way
vi /opt/module/hadoop-2.7.2/etc/hadoop/yarn-site.xml
-----------------------------------------------------------------
<!-- how the NodeManager obtains data -->
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<!-- hostname of the ResourceManager (the machine it is installed on) -->
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop1</value>
</property>
-----------------------------------------------------------------
cd /opt/module/hadoop-2.7.2/etc/hadoop/
mv mapred-site.xml.template mapred-site.xml    // rename the template file
vi mapred-site.xml
-----------------------------------------------------------------
<!-- run MapReduce on YARN -->
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
-----------------------------------------------------------------
hadoop-daemon.sh start namenode
hadoop-daemon.sh start datanode
yarn-daemon.sh start resourcemanager
yarn-daemon.sh start nodemanager
jps    // verify all four daemons are running
hadoop fs -rm -r /user/zz/output    // remove the old output directory first
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /user/zz/input /user/zz/output
Open hostname-or-IP:8088 for the YARN web UI.
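Before starting the YARN daemons it is worth confirming that both properties this section sets are actually present in yarn-site.xml. This sketch checks a throwaway stand-in file (/tmp/yarn-site-sample.xml is a made-up path) rather than the real one:

```shell
# Stand-in for the yarn-site.xml edited above.
cat > /tmp/yarn-site-sample.xml <<'EOF'
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop1</value>
  </property>
</configuration>
EOF

# Report whether each required property name appears in the file:
for prop in yarn.nodemanager.aux-services yarn.resourcemanager.hostname; do
  if grep -q "<name>$prop</name>" /tmp/yarn-site-sample.xml; then
    echo "$prop: present"
  else
    echo "$prop: MISSING"
  fi
done
```

A missing `yarn.nodemanager.aux-services` is a common reason YARN jobs hang at the shuffle stage, so this check pays for itself.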
Change the data storage directory
(by default the data is stored under /tmp)
cd /opt/module/hadoop-2.7.2/
mkdir -p data/tmp                 // create the new storage directory
vi etc/hadoop/core-site.xml       // change the storage directory
-----------------------------------------------------
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/module/hadoop-2.7.2/data/tmp</value>
</property>
-----------------------------------------------------
Restart the daemons so the new configuration is loaded:
hadoop-daemon.sh stop namenode
yarn-daemon.sh stop resourcemanager
hadoop-daemon.sh stop datanode
yarn-daemon.sh stop nodemanager
cd /tmp                           // remove the old data
rm -rf hadoop-hadoop1
rm -rf hadoop-hadoop1-datanode.pid
cd /opt/module/hadoop-2.7.2/logs/
rm -rf *                          // clear the old logs
hdfs namenode -format             // reformat HDFS
hadoop-daemon.sh start namenode
yarn-daemon.sh start resourcemanager
hadoop-daemon.sh start datanode
yarn-daemon.sh start nodemanager
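The order of operations above (stop the daemons, point hadoop.tmp.dir at a new directory, delete the old data, reformat, restart) can be sketched with throwaway directories. Every /tmp/demo-* path below is made up; nothing here touches a real cluster:

```shell
# Stand-ins: OLD mimics the default data location under /tmp,
# NEW mimics the freshly configured hadoop.tmp.dir.
OLD=/tmp/demo-default-tmp
NEW=/tmp/demo-hadoop/data/tmp

mkdir -p "$OLD/hadoop-hadoop1" "$NEW"   # pretend old data exists, create new dir

rm -rf "$OLD/hadoop-hadoop1"            # old data must go before reformatting,
                                        # or the NameNode and DataNode IDs conflict

[ ! -e "$OLD/hadoop-hadoop1" ] && [ -d "$NEW" ] && echo "ready to format"
```

Skipping the cleanup step is the classic cause of a DataNode that refuses to start after a reformat (mismatched cluster IDs), so it is worth doing in exactly this order.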