Hadoop HA Cluster Configuration


In an HA cluster the Secondary NameNode is not used; it is replaced by the HA mechanism.

一、Virtual Machine Preparation

Prepare five virtual machines with VMware. Each machine must have a Java environment, and it must be JDK 8 (other versions are incompatible with this Hadoop release). The nodes can simply be created by cloning one virtual machine.

Assume the five machines are named: fx-Master, fx-Primary, fx-Secondary, fx-Slave-01, fx-Slave-02.

Cluster layout (roles as configured in the sections below):

| VM name | NameNode | ResourceManager | DataNode | NodeManager |
| --- | --- | --- | --- | --- |
| fx-Master | ✓ | ✓ | | |
| fx-Primary | ✓ | ✓ | ✓ | ✓ |
| fx-Secondary | | | ✓ | ✓ |
| fx-Slave-01 | | ✓ | ✓ | ✓ |
| fx-Slave-02 | | | ✓ | ✓ |

二、Prerequisites

1. Install net-tools on all machines to view each machine's IP:

sudo apt install net-tools

2. On every machine, change the hostname to the node name assigned above:

sudo vim /etc/hostname

3. On every machine, edit the hostname-to-IP mappings:

sudo vim /etc/hosts

Change it to the following content, substituting your own IPs:

127.0.0.1 localhost

192.168.17.129 fx-Master

192.168.17.138 fx-Primary

192.168.17.130 fx-Secondary

192.168.17.132 fx-Slave-01

192.168.17.133 fx-Slave-02

# The following lines are desirable for IPv6 capable hosts

::1 ip6-localhost ip6-loopback

fe00::0 ip6-localnet

ff00::0 ip6-mcastprefix

ff02::1 ip6-allnodes

ff02::2 ip6-allrouters

Verify the changes with the following commands:

ping fx-Master -c 3 # ping only 3 times; otherwise press Ctrl+C to stop

ping fx-Primary -c 3

ping fx-Secondary -c 3

ping fx-Slave-01 -c 3

ping fx-Slave-02 -c 3
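The per-host checks above can also be done in one pass. A minimal sketch that verifies every cluster hostname appears in a hosts file — it writes the sample mapping from this guide to a temp file so it can be tried anywhere; on a real node, point `HOSTS_FILE` at `/etc/hosts` instead:

```shell
# Write the sample mapping from this guide to a temp file, then check that
# every expected hostname is present in it.
HOSTS_FILE=$(mktemp)
cat > "$HOSTS_FILE" <<'EOF'
127.0.0.1 localhost
192.168.17.129 fx-Master
192.168.17.138 fx-Primary
192.168.17.130 fx-Secondary
192.168.17.132 fx-Slave-01
192.168.17.133 fx-Slave-02
EOF
missing=""
for h in fx-Master fx-Primary fx-Secondary fx-Slave-01 fx-Slave-02; do
    grep -qw "$h" "$HOSTS_FILE" || missing="$missing $h"
done
[ -z "$missing" ] && echo "all hosts mapped" || echo "missing:$missing"
```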

4. Install SSH and configure passwordless SSH login to the local machine

sudo apt-get install openssh-server

Log in to the local machine over SSH:

ssh localhost

On first login SSH prints a prompt; type yes, then enter the password hadoop as prompted, and you are logged in to the local machine.


Logging in this way requires the password every time; configure passwordless SSH login instead, then verify again with ssh localhost:

exit # leave the ssh localhost session

cd ~/.ssh/ # if this directory does not exist, run ssh localhost once first

ssh-keygen -t rsa # press Enter at every prompt

cat ./id_rsa.pub >> ./authorized_keys # authorize the key

Passwordless SSH between nodes

cd ~/.ssh # if this directory does not exist, run ssh localhost once first

rm ./id_rsa* # remove previously generated keys (if any)

ssh-keygen -t rsa # press Enter at every prompt

First make the Master node able to SSH to itself without a password. On the Master node run:

cat ./id_rsa.pub >> ./authorized_keys

then verify with ssh localhost, and run exit to return to the original terminal.

Only the NameNode hosts (fx-Master and fx-Primary) need to send their public keys to the other nodes.

On fx-Master:

scp ~/.ssh/id_rsa.pub fx@fx-Primary:/home/fx/

scp ~/.ssh/id_rsa.pub fx@fx-Secondary:/home/fx/

scp ~/.ssh/id_rsa.pub fx@fx-Slave-01:/home/fx/

scp ~/.ssh/id_rsa.pub fx@fx-Slave-02:/home/fx/

On fx-Primary:

scp ~/.ssh/id_rsa.pub fx@fx-Secondary:/home/fx/

scp ~/.ssh/id_rsa.pub fx@fx-Slave-01:/home/fx/

scp ~/.ssh/id_rsa.pub fx@fx-Slave-02:/home/fx/
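The per-host scp commands above all follow one pattern, so they can be generated with a loop. A dry-run sketch — it only prints the commands (remove the echo to execute them); note that this guide does not copy from fx-Primary back to fx-Master, so trim the list accordingly when running there:

```shell
# Print the key-distribution commands a NameNode host would run; the fx user
# and hostnames are the ones used in this guide.
SELF=fx-Master          # set to fx-Primary when running there
count=0
for host in fx-Master fx-Primary fx-Secondary fx-Slave-01 fx-Slave-02; do
    [ "$host" = "$SELF" ] && continue   # never copy to yourself
    echo scp ~/.ssh/id_rsa.pub "fx@$host:/home/fx/"
    count=$((count + 1))
done
```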

On each receiving node:

cat ~/id_rsa.pub >> ~/.ssh/authorized_keys

rm ~/id_rsa.pub # can be deleted once appended

Verify:

ssh fx-Primary

三、ZooKeeper Installation and Configuration

1. Download

https://www.apache.org/dyn/closer.lua/zookeeper/

2. Install

sudo tar -zxf ~/下载/apache-zookeeper-3.8.1-bin.tar.gz -C /usr/local # extract into /usr/local

cd /usr/local/

sudo mv ./apache-zookeeper-3.8.1-bin/ ./zookeeper # rename the directory to zookeeper

sudo chown -R fx ./zookeeper # change ownership

3. In the conf directory, copy zoo_sample.cfg to zoo.cfg and edit it:

tickTime=2000

initLimit=10

syncLimit=5

dataDir=/usr/local/zookeeper/data

dataLogDir=/usr/local/zookeeper/logs

clientPort=2181

4. Edit the environment variables

vim ~/.bashrc

export ZOOKEEPER_HOME=/usr/local/zookeeper/

export PATH=$ZOOKEEPER_HOME/bin:$PATH

source ~/.bashrc

5. Common commands

Start and stop:

zkServer.sh start

zkServer.sh stop

After starting, connect to the ZooKeeper service:

zkCli.sh -server 127.0.0.1:2181

On success, the ZooKeeper CLI prompt appears.

6. ZooKeeper auto-start configuration

Create the service unit file:

sudo vim /etc/systemd/system/zookeeper.service

with the following content:

[Unit]

Description=Zookeeper Daemon

Documentation=http://zookeeper.apache.org

Requires=network.target

After=network.target

[Service]

Type=forking

WorkingDirectory=/usr/local/zookeeper

# run as the user that owns /usr/local/zookeeper (fx in this guide)
User=fx

Group=fx

ExecStart=/usr/local/zookeeper/bin/zkServer.sh start /usr/local/zookeeper/conf/zoo.cfg

ExecStop=/usr/local/zookeeper/bin/zkServer.sh stop /usr/local/zookeeper/conf/zoo.cfg

ExecReload=/usr/local/zookeeper/bin/zkServer.sh restart /usr/local/zookeeper/conf/zoo.cfg

TimeoutSec=30

Restart=on-failure

[Install]

WantedBy=default.target

Save and exit.

Reload the systemd daemon:

sudo systemctl daemon-reload

Start the ZooKeeper service and enable it at boot:

sudo systemctl start zookeeper

sudo systemctl enable zookeeper

Verify the service status:

systemctl status zookeeper

If you see active (running) highlighted, the service started successfully.

7. Configure ZooKeeper in cluster mode

cd /usr/local/zookeeper

mkdir -p data # the dataDir from zoo.cfg; create it if it does not exist

vim data/myid

The file's content is just this server's id:

1

Repeat the previous step so that every server gets a unique id, e.g. fx-Master: 1, fx-Primary: 2, ......
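Instead of editing myid by hand on every machine, the id can be derived from the hostname. A sketch using the hostname-to-id mapping chosen in this guide (the helper name myid_for is made up here):

```shell
# Map each hostname in this guide to its ZooKeeper server id.
myid_for() {
    case "$1" in
        fx-Master)    echo 1 ;;
        fx-Primary)   echo 2 ;;
        fx-Secondary) echo 3 ;;
        fx-Slave-01)  echo 4 ;;
        fx-Slave-02)  echo 5 ;;
        *)            echo 0 ;;    # unknown host
    esac
}
# On a real node: myid_for "$(hostname)" > /usr/local/zookeeper/data/myid
myid_for fx-Secondary    # prints 3
```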

Add the following to zoo.cfg:

server.1=fx-Master:2888:3888

server.2=fx-Primary:2888:3888

server.3=fx-Secondary:2888:3888

server.4=fx-Slave-01:2888:3888

server.5=fx-Slave-02:2888:3888
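Putting step 3 and this step together, the complete zoo.cfg that every node ends up with looks like:

```
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper/data
dataLogDir=/usr/local/zookeeper/logs
clientPort=2181
server.1=fx-Master:2888:3888
server.2=fx-Primary:2888:3888
server.3=fx-Secondary:2888:3888
server.4=fx-Slave-01:2888:3888
server.5=fx-Slave-02:2888:3888
```

The file is identical on all five nodes; only the myid file differs per node.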

8. Distribute ZooKeeper to the other nodes

cd /usr/local

tar -zcf ~/zookeeper.tar.gz ./zookeeper # compress first, then copy

cd ~

scp ./zookeeper.tar.gz fx-Primary:/home/fx # repeat for each of the other nodes

9. On each node, simply extract the archive, then set the myid file according to zoo.cfg:

sudo tar -zxf ~/zookeeper.tar.gz -C /usr/local # extract into /usr/local

cd /usr/local/

sudo chown -R fx ./zookeeper

10. Restart the service

systemctl restart zookeeper.service

If auto-start was not configured:

zkServer.sh restart

四、Hadoop Installation and Configuration

1. Download

https://hadoop.apache.org/releases.html

2. Install

sudo tar -zxf ~/下载/hadoop-2.10.1.tar.gz -C /usr/local # extract into /usr/local

cd /usr/local/

sudo mv ./hadoop-2.10.1/ ./hadoop # rename the directory to hadoop

sudo chown -R fx ./hadoop # change ownership

3. Verify

cd /usr/local/hadoop

./bin/hadoop version

4. Edit the environment variables (vim ~/.bashrc, then run source ~/.bashrc):

# Hadoop

export HADOOP_HOME=/usr/local/hadoop

export PATH=$PATH:$HADOOP_HOME/bin

export PATH=$PATH:$HADOOP_HOME/sbin

export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native

export CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath):$CLASSPATH

5. Edit the configuration files

cd /usr/local/hadoop/etc/hadoop/

1.core-site.xml

<configuration>

<!-- set the hdfs nameservice to ns -->

<property>

<name>fs.defaultFS</name>

<value>hdfs://ns</value>

</property>

<!-- temporary directory for hadoop data -->

<property>

<name>hadoop.tmp.dir</name>

<value>file:/usr/local/hadoop/tmp</value>

<description>Abase for other temporary directories.</description>

</property>

<property>

<name>io.file.buffer.size</name>

<value>131072</value>

</property>

<property>

<name>fs.checkpoint.period</name>

<value>60</value>

</property>

<property>

<name>fs.checkpoint.size</name>

<value>67108864</value>

</property>

<property>

<name>ha.zookeeper.quorum</name>

<value>fx-Master:2181,fx-Primary:2181,fx-Secondary:2181,fx-Slave-01:2181,fx-Slave-02:2181</value>

</property>

<!-- timeout for hadoop connections to zookeeper -->

<property>

<name>ha.zookeeper.session-timeout.ms</name>

<value>3000</value>

<description>ms</description>

</property>

</configuration>

2.hdfs-site.xml

<configuration>

<property>

<name>dfs.replication</name>

<value>3</value>

</property>

<!-- set the hdfs nameservice to ns; must match core-site.xml -->

<property>

<name>dfs.nameservices</name>

<value>ns</value>

</property>

<!-- ns has two NameNodes: fx-Master and fx-Primary -->

<property>

<name>dfs.ha.namenodes.ns</name>

<value>fx-Master,fx-Primary</value>

</property>

<property>

<name>dfs.namenode.name.dir</name>

<value>file:/usr/local/hadoop/tmp/dfs/name</value>

<final>true</final>

</property>

<property>

<name>dfs.datanode.data.dir</name>

<value>file:/usr/local/hadoop/tmp/dfs/data</value>

<final>true</final>

</property>

<!-- namenode-->

<!-- RPC addresses -->

<property>

<name>dfs.namenode.rpc-address.ns.fx-Master</name>

<value>fx-Master:9000</value>

</property>

<property>

<name>dfs.namenode.rpc-address.ns.fx-Primary</name>

<value>fx-Primary:9000</value>

</property>

<!-- HTTP addresses -->

<property>

<name>dfs.namenode.http-address.ns.fx-Master</name>

<value>fx-Master:50070</value>

</property>

<property>

<name>dfs.namenode.http-address.ns.fx-Primary</name>

<value>fx-Primary:50070</value>

</property>

<!-- the Secondary NameNode does not need to be configured in a Hadoop HA cluster; HA replaces it -->

<!--

<property>

<name>dfs.namenode.secondary.http-address.ns.fx-Master</name>

<value>fx-Master:50090</value>

</property>

<property>

<name>dfs.namenode.secondary.http-address.ns.fx-Primary</name>

<value>fx-Primary:50090</value>

</property>

<property>

<name>dfs.namenode.secondary.http-address.ns.fx-Secondary</name>

<value>fx-Secondary:50090</value>

</property>

-->

<!-- enable webhdfs -->

<property>

<name>dfs.webhdfs.enabled</name>

<value>true</value>

</property>

<property>

<name>dfs.permissions</name>

<value>false</value>

</property>

<property>

<name>dfs.support.append</name>

<value>true</value>

</property>

<!-- enable automatic failover when a NameNode fails -->

<property>

<name>dfs.ha.automatic-failover.enabled</name>

<value>true</value>

</property>

<!-- failover proxy provider used by clients -->

<property>

<name>dfs.client.failover.proxy.provider.ns</name>

<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>

</property>

<!-- fencing methods; separate multiple methods with newlines, i.e. one method per line -->

<property>

<name>dfs.ha.fencing.methods</name>

<value>shell(/bin/true)</value>

</property>

<!-- where NameNode metadata is stored on the JournalNodes -->

<property>

<name>dfs.namenode.shared.edits.dir</name>

<value>qjournal://fx-Master:8485;fx-Primary:8485;fx-Secondary:8485;fx-Slave-01:8485;fx-Slave-02:8485/ns</value>

</property>

<!-- where each JournalNode stores data on local disk -->

<property>

<name>dfs.journalnode.edits.dir</name>

<value>/usr/local/hadoop/tmp/dfs/journal</value>

</property>

<!-- the sshfence mechanism requires passwordless SSH -->

<property>

<name>dfs.ha.fencing.ssh.private-key-files</name>

<value>~/.ssh/id_rsa</value>

</property>

<property>

<name>dfs.ha.fencing.ssh.connect-timeout</name>

<value>30000</value>

</property>

<property>

<name>ha.failover-controller.cli-check.rpc-timeout.ms</name>

<value>300000</value>

</property>

</configuration>

3.mapred-site.xml

Copy mapred-site.xml.template to mapred-site.xml, then edit:

<configuration>

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>

<property>

<name>mapreduce.jobhistory.address</name>

<value>fx-Master:10020</value>

</property>

<property>

<name>mapreduce.jobhistory.webapp.address</name>

<value>fx-Master:19888</value>

</property>

</configuration>

4.yarn-site.xml

<configuration>

<!-- Site specific YARN configuration properties -->

<!-- enable ResourceManager high availability -->

<property>

<name>yarn.resourcemanager.ha.enabled</name>

<value>true</value>

</property>

<!-- cluster id identifying this RM cluster; used by the elector to ensure an RM does not take over the active state of another cluster -->

<property>

<name>yarn.resourcemanager.cluster-id</name>

<value>yrc</value>

</property>

<!-- logical ids of the RMs -->

<property>

<name>yarn.resourcemanager.ha.rm-ids</name>

<value>rm1,rm2,rm3</value>

</property>

<!-- host of each RM -->

<property>

<name>yarn.resourcemanager.hostname.rm1</name>

<value>fx-Master</value>

</property>

<property>

<name>yarn.resourcemanager.hostname.rm2</name>

<value>fx-Primary</value>

</property>

<property>

<name>yarn.resourcemanager.hostname.rm3</name>

<value>fx-Slave-01</value>

</property>

<!-- HTTP addresses -->

<property>

<name>yarn.resourcemanager.webapp.address.rm1</name>

<value>fx-Master:8088</value>

</property>

<property>

<name>yarn.resourcemanager.webapp.address.rm2</name>

<value>fx-Primary:8088</value>

</property>

<property>

<name>yarn.resourcemanager.webapp.address.rm3</name>

<value>fx-Slave-01:8088</value>

</property>

<!-- ZooKeeper ensemble address -->

<property>

<name>yarn.resourcemanager.zk-address</name>

<value>fx-Master:2181,fx-Primary:2181,fx-Slave-01:2181</value>

</property>

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

<property>

<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>

<value>org.apache.hadoop.mapred.ShuffleHandler</value>

</property>

<property>

<name>yarn.application.classpath</name>

<value>/usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/share/hadoop/yarn:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*</value>

</property>

<property>

<name>yarn.nodemanager.resource.memory-mb</name>

<value>16384</value>

</property>

<property>

<name>yarn.nodemanager.resource.cpu-vcores</name>

<value>4</value>

</property>

<property>

<name>yarn.scheduler.maximum-allocation-vcores</name>

<value>4</value>

</property>

<property>

<name>yarn.log-aggregation-enable</name>

<value>true</value>

</property>

<property>

<name>yarn.log-aggregation.retain-seconds</name>

<value>86400</value>

</property>

<!-- enable automatic recovery -->

<property>

<name>yarn.resourcemanager.recovery.enabled</name>

<value>true</value>

</property>

<!-- store the ResourceManager state in the ZooKeeper cluster -->

<property>

<name>yarn.resourcemanager.store.class</name>

<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>

</property>

<!--

<property>

<name>yarn.resourcemanager.hostname</name>

<value>fx-Master</value>

</property>

-->

<!-- if set, the following override the hostname set in yarn.resourcemanager.hostname -->

<!--

<property>

<name>yarn.resourcemanager.address</name>

<value>fx-Master:18040</value>

</property>

<property>

<name>yarn.resourcemanager.scheduler.address</name>

<value>fx-Master:18030</value>

</property>

<property>

<name>yarn.resourcemanager.webapp.address</name>

<value>fx-Master:18088</value>

</property>

<property>

<name>yarn.resourcemanager.resource-tracker.address</name>

<value>fx-Master:18025</value>

</property>

<property>

<name>yarn.resourcemanager.admin.address</name>

<value>fx-Master:18141</value>

</property>

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

<property>

<name>yarn.nodemanager.auxservices.mapreduce.shuffle.class</name>

<value>org.apache.hadoop.mapred.ShuffleHandler</value>

</property>

-->

</configuration>

5.hadoop-env.sh

export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_321
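The JDK path above is one particular install location. If your JDK 8 lives elsewhere, this sketch locates a candidate JAVA_HOME from the javac on the PATH (the variable name JAVA_HOME_CANDIDATE is just illustrative):

```shell
# Resolve javac through symlinks and strip the trailing /bin to get a
# plausible JAVA_HOME for hadoop-env.sh.
if command -v javac >/dev/null 2>&1; then
    JAVA_HOME_CANDIDATE=$(dirname "$(dirname "$(readlink -f "$(command -v javac)")")")
    echo "JAVA_HOME candidate: $JAVA_HOME_CANDIDATE"
else
    echo "no JDK (javac) found on PATH"
fi
```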

6.slaves

fx-Primary

fx-Secondary

fx-Slave-01

fx-Slave-02

6. Distribute to all nodes

cd /usr/local

tar -zcf ~/hadoop.master.tar.gz ./hadoop # compress first, then copy

cd ~

scp ./hadoop.master.tar.gz fx-Primary:/home/fx # repeat for each of the other nodes

7. On each node, receive, extract, and install

sudo tar -zxf ~/hadoop.master.tar.gz -C /usr/local # the archive already contains ./hadoop

cd /usr/local/

sudo chown -R fx ./hadoop

8. First-start commands

1. First start ZooKeeper; on every node run:

zkServer.sh start

2. Initialize ZooKeeper

On fx-Master, initialize ZooKeeper for HA; essentially this creates the corresponding znodes:

bin/hdfs zkfc -formatZK

3. Start the JournalNodes

Run the following on every JournalNode host (all five machines here, as listed in dfs.namenode.shared.edits.dir) to start the JournalNodes, which manage the shared NameNode metadata:

hadoop-daemon.sh start journalnode

4. Initialize the JournalNodes

On fx-Master, first format the NameNode:

hdfs namenode -format

Then go into /usr/local/hadoop/tmp/dfs/journal/ns and delete everything in it.

Next, on fx-Master, initialize the shared edits directory on the JournalNodes:

hdfs namenode -initializeSharedEdits -force

5. Start the NameNode (on fx-Master; it runs in the foreground):

bin/hdfs namenode

6. Sync data to the standby

On the standby NameNode (fx-Primary), run the following command. It formats the standby NameNode's directories and copies the metadata over from the active NameNode; it does not reformat the JournalNode directories:

hdfs namenode -bootstrapStandby

7. Finish

Once fx-Primary has finished syncing, press Ctrl+C on fx-Master to stop the NameNode process, then stop all the JournalNodes:

sbin/hadoop-daemon.sh stop journalnode
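The first-start steps above, condensed into one dry-run checklist. Each line is printed, not executed; the run_on helper exists only for display:

```shell
# Print the first-start sequence: which host(s) run which command, in order.
run_on() { echo "[$1] $2"; }
run_on "all nodes"  "zkServer.sh start"
run_on "fx-Master"  "hdfs zkfc -formatZK"
run_on "all nodes"  "hadoop-daemon.sh start journalnode"
run_on "fx-Master"  "hdfs namenode -format"
run_on "fx-Master"  "rm -rf /usr/local/hadoop/tmp/dfs/journal/ns/*"
run_on "fx-Master"  "hdfs namenode -initializeSharedEdits -force"
run_on "fx-Master"  "hdfs namenode"               # leave running in the foreground
run_on "fx-Primary" "hdfs namenode -bootstrapStandby"
run_on "fx-Master"  "Ctrl+C the foreground namenode"
run_on "all nodes"  "hadoop-daemon.sh stop journalnode"
```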

9. Day-to-day start and stop

Start ZooKeeper on all machines:

zkServer.sh start

On fx-Master, start Hadoop:

start-dfs.sh

start-yarn.sh

mr-jobhistory-daemon.sh start historyserver

On fx-Master, stop Hadoop:

stop-yarn.sh

stop-dfs.sh

mr-jobhistory-daemon.sh stop historyserver

Stop ZooKeeper on all machines:

zkServer.sh stop

五、Verification

View the cluster nodes:

hdfs dfsadmin -report

Check the related processes with jps:

jps
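A sketch of checking jps output against the daemons this guide would expect on fx-Master after a full start. The sample output below is hypothetical (made-up PIDs); pipe the real jps output through the same loop instead:

```shell
# Sample (made-up) jps output for fx-Master: QuorumPeerMain is ZooKeeper,
# DFSZKFailoverController is the HA failover controller.
jps_output="12001 NameNode
12100 DFSZKFailoverController
12200 ResourceManager
12300 QuorumPeerMain
12400 JournalNode
12500 JobHistoryServer
12600 Jps"
for daemon in NameNode DFSZKFailoverController ResourceManager QuorumPeerMain JournalNode JobHistoryServer; do
    echo "$jps_output" | grep -qw "$daemon" && echo "$daemon ok" || echo "$daemon MISSING"
done
```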

六、Basic Usage

hdfs dfs -mkdir -p /user/fx

hdfs dfs -mkdir input

hdfs dfs -put /usr/local/hadoop/etc/hadoop/*.xml input

hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep input output 'dfs[a-z.]+'

hdfs dfs -cat output/*

posted @ 2024-06-11 14:14 by 3088577529