Hadoop Ecosystem - 002 Pseudo-Distributed Setup
1. Configure core-site.xml
vi /opt/module/hadoop-3.1.2/etc/hadoop/core-site.xml
Edit it to the following:
<configuration>
    <!-- Default filesystem URI used by HDFS clients -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop001:9000</value>
    </property>
    <!-- Base directory for Hadoop's temporary and metadata files -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/module/hadoop-3.1.2/data/tmp</value>
    </property>
</configuration>
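Once core-site.xml is saved, you can sanity-check that Hadoop resolves the new values with hdfs getconf; a minimal verification sketch, assuming /opt/module/hadoop-3.1.2/bin is on your PATH:
hdfs getconf -confKey fs.defaultFS      # should print hdfs://hadoop001:9000
hdfs getconf -confKey hadoop.tmp.dir    # should print /opt/module/hadoop-3.1.2/data/tmp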
2. Configure hdfs-site.xml
vi /opt/module/hadoop-3.1.2/etc/hadoop/hdfs-site.xml
Edit it to the following:
<configuration>
    <!-- Number of block replicas; 1 is enough for a single-node setup -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <!-- Versions before 3.0 used port 50070 -->
    <!-- HTTP address of the NameNode web UI; defaults to port 9870 if not set -->
    <property>
        <name>dfs.http.address</name>
        <value>0.0.0.0:9870</value>
    </property>
</configuration>
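Once the cluster is running (step 7), a quick way to confirm the NameNode web UI answers on the configured port is an HTTP probe; a minimal sketch, assuming curl is installed and hadoop001 resolves to this machine:
curl -s -o /dev/null -w "%{http_code}\n" http://hadoop001:9870/dfshealth.html   # 200 means the UI is up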
3. Configure yarn-site.xml
vi /opt/module/hadoop-3.1.2/etc/hadoop/yarn-site.xml
Edit it to the following:
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
        <description>How reducers fetch map output</description>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop001</value>
        <description>Address of the YARN ResourceManager</description>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
        <description>Enable application log aggregation</description>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
        <description>Keep aggregated logs for 7 days</description>
    </property>
</configuration>
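Because yarn.log-aggregation-enable is true, the logs of a finished application are collected into HDFS and can be read back with yarn logs once YARN is running; a usage sketch in which application_1234567890123_0001 is a hypothetical application ID (use the real ID printed when you submit a job):
yarn logs -applicationId application_1234567890123_0001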
4. Configure mapred-site.xml
vim /opt/module/hadoop-3.1.2/etc/hadoop/mapred-site.xml
Edit it to the following:
<configuration>
    <!-- Run MapReduce jobs on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <!-- JobHistory Server RPC address -->
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop001:10020</value>
    </property>
    <!-- JobHistory Server web UI address -->
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop001:19888</value>
    </property>
</configuration>
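Note that start-all.sh does not start the JobHistory Server these two addresses point to; after step 7 it can be started and stopped by hand with the Hadoop 3.x daemon command, after which the history web UI should answer at http://hadoop001:19888:
mapred --daemon start historyserver
mapred --daemon stop historyserver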
5. Configure hadoop-env.sh and yarn-env.sh
vim /opt/module/hadoop-3.1.2/etc/hadoop/hadoop-env.sh
vim /opt/module/hadoop-3.1.2/etc/hadoop/yarn-env.sh
In both files, change
# export JAVA_HOME=
to
export JAVA_HOME=/opt/module/jdk1.8.0_251
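Before starting anything, it is worth confirming that the JAVA_HOME you just set points at a real JDK; a minimal check, assuming the JDK really is unpacked at that path:
/opt/module/jdk1.8.0_251/bin/java -version   # should print the 1.8.0_251 version banner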
6. Format the NameNode
hdfs namenode -format

6-1: Why format?
The NameNode manages the metadata of the distributed file system's namespace (essentially its directories and files), and to keep that data reliable it also maintains an edit log, so the NameNode persists this metadata to the local file system. When using HDFS for the first time, you must run the -format command before the NameNode service can start normally.
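If formatting succeeds, the output ends with a "has been successfully formatted" message and a fresh fsimage is written under the NameNode's name directory, which by default lives inside hadoop.tmp.dir; a quick check, assuming the default layout under the directory configured in step 1:
ls /opt/module/hadoop-3.1.2/data/tmp/dfs/name/current/
# expect VERSION, fsimage_*, seen_txid, ...
Avoid re-running -format on a cluster that already holds data: it generates a new clusterID, and the existing DataNode storage will then refuse to register with the NameNode.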
7. Start the cluster
cd /opt/module/hadoop-3.1.2/sbin
./start-all.sh    # start all daemons
./stop-all.sh     # stop all daemons

HDFS web UI:
http://<host machine IP>:9870/dfshealth.html#tab-overview
(versions before 3.0 use port 50070: http://<host machine IP>:50070/dfshealth.html#tab-overview)
YARN web UI:
http://<host machine IP>:8088/cluster
Use jps to check that the daemons started correctly.
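On a healthy pseudo-distributed node, jps should list roughly the daemons below (the PIDs are examples; JobHistoryServer only appears if you started it as in step 4). Running the bundled pi example is a simple end-to-end smoke test of HDFS, YARN and MapReduce together:
jps
# 12001 NameNode
# 12002 DataNode
# 12003 SecondaryNameNode
# 12004 ResourceManager
# 12005 NodeManager

hadoop jar /opt/module/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.2.jar pi 2 10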
8. Troubleshooting
Error 1:
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
Starting datanodes
ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
Starting secondary namenodes [hadoop_001]
ERROR: Attempting to operate on hdfs secondarynamenode as root
ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.
Starting resourcemanager
ERROR: Attempting to operate on yarn resourcemanager as root
ERROR: but there is no YARN_RESOURCEMANAGER_USER defined. Aborting operation.
Starting nodemanagers
ERROR: Attempting to operate on yarn nodemanager as root
ERROR: but there is no YARN_NODEMANAGER_USER defined. Aborting operation.

Solution:
vi /opt/module/hadoop-3.1.2/sbin/start-dfs.sh
vi /opt/module/hadoop-3.1.2/sbin/stop-dfs.sh
Add at the top of both scripts:
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
YARN_RESOURCEMANAGER_USER=root
YARN_NODEMANAGER_USER=root

vi /opt/module/hadoop-3.1.2/sbin/start-yarn.sh
vi /opt/module/hadoop-3.1.2/sbin/stop-yarn.sh
Add at the top of both scripts:
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root

Error 2 (reported in the log files under /opt/module/hadoop-3.1.2/logs/):
java.lang.IllegalArgumentException: Does not contain a valid host:port authority: hadoop_001:9000
The hostname must not contain the characters . _ /
Change it with:
hostnamectl set-hostname <hostname>
Update the hostname in core-site.xml, yarn-site.xml and mapred-site.xml accordingly,
then re-run the format (hdfs namenode -format) and repeat step 7.
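Whatever hostname you pick, it must also resolve to the machine's IP, otherwise the same host:port error can reappear at startup; a minimal sketch, in which 192.168.1.100 is a placeholder for your real IP:
hostnamectl set-hostname hadoop001
echo "192.168.1.100 hadoop001" >> /etc/hosts
ping -c 1 hadoop001   # should answer from 192.168.1.100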