Notes 4: Hadoop HDFS, the Distributed File System
Setting Up an HDFS HA Environment
1. Cluster planning

| | NN-1 | NN-2 | DN | ZK | ZKFC | JNN |
| --- | --- | --- | --- | --- | --- | --- |
| hadoop01 | * | | | | * | * |
| hadoop02 | | * | * | * | * | * |
| hadoop03 | | | * | * | | * |
| hadoop04 | | | * | * | | |
2. In HA mode, when the active node goes down, the standby node automatically takes over and switches to active, continuing to serve clients.
When the failed node comes back up, it does not become active again; it stays in the standby state.
3. When setting up passwordless SSH, also configure a key from hadoop02 to hadoop01; otherwise, once hadoop02 becomes active, it cannot connect to hadoop01 (this connection is needed for fencing the old active node):
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
ssh-copy-id -i ~/.ssh/id_dsa.pub root@hadoop01
4. Configure the JDK
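Hadoop picks up the JDK from JAVA_HOME in etc/hadoop/hadoop-env.sh. A minimal sketch; the JDK install path below is an assumption, so adjust it to wherever your JDK 8 actually lives:

```shell
# In $HADOOP_HOME/etc/hadoop/hadoop-env.sh
# NOTE: /usr/java/jdk1.8.0_171 is an assumed example path
export JAVA_HOME=/usr/java/jdk1.8.0_171
export PATH=$JAVA_HOME/bin:$PATH
```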
5. Configuration files
hdfs-site.xml
<configuration>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>hadoop01:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>hadoop02:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>hadoop01:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>hadoop02:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop01:8485;hadoop02:8485;hadoop03:8485/mycluster</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/var/hadoop/ha/jn</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_dsa</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
</configuration>
core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/var/hadoop/cluster</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop02:2181,hadoop03:2181,hadoop04:2181</value>
</property>
</configuration>
slaves (one DataNode host per line)
hadoop02
hadoop03
hadoop04
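A quick way to generate the slaves file; the target path is an assumption (on Hadoop 2.x it normally lives under $HADOOP_HOME/etc/hadoop/):

```shell
# Sketch: write the slaves file, one DataNode host per line
# (writes to ./slaves here; copy it to $HADOOP_HOME/etc/hadoop/slaves)
cat > slaves <<'EOF'
hadoop02
hadoop03
hadoop04
EOF
wc -l < slaves    # expect 3 lines
```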
6. Copy the configured Hadoop directory to hadoop02, hadoop03, and hadoop04 (repeat the scp for each node):
scp -r hadoop-ha/ root@hadoop02:/opt/
scp -r hadoop-ha/ root@hadoop03:/opt/
scp -r hadoop-ha/ root@hadoop04:/opt/
7. Set up the ZooKeeper cluster (steps omitted here), then start the ZooKeeper service on each ZK node
8. Start the JournalNodes, on each of hadoop01, hadoop02, and hadoop03:
./hadoop-daemon.sh start journalnode
9. Pick either NN node, format it, and start the NameNode (run the format on this one node only):
./hdfs namenode -format
./hadoop-daemon.sh start namenode
10. On the other NN node, pull the metadata over from the running NameNode:
./hdfs namenode -bootstrapStandby
11. On the first NN node, format the failover state in ZooKeeper:
./hdfs zkfc -formatZK
12. Stop the processes on all nodes:
./stop-dfs.sh
13. Start the HDFS service, then run jps on each node and confirm it shows the roles from the planning table in step 1 (NameNode, DataNode, JournalNode, DFSZKFailoverController, and QuorumPeerMain for ZooKeeper):
./start-dfs.sh
14. Open http://192.168.20.132:50070/ and check which NN node is active


15. Kill the active NN process and check whether the standby switches to active. You can also query each NameNode's state from the command line:
./hdfs haadmin -getServiceState nn1
(The psmisc package must be installed first: sudo yum install psmisc; the sshfence fencing method depends on the fuser command it provides.) Still fighting with this one......

16. Working with the HDFS file system
./hdfs dfs -mkdir -p /user/root #create a directory
./hdfs dfs -D dfs.blocksize=1048576 -put /opt/jdk-8u171-linux-x64.tar.gz #upload a file with a 1 MB block size
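The -D dfs.blocksize=1048576 override sets a 1 MB block size, far below the HDFS 2.x default of 128 MB, so the tarball is split into many blocks. A quick sketch of the resulting block count; the file size used below is an assumption for illustration:

```shell
# 1048576 bytes = 1 MB, matching -D dfs.blocksize=1048576
blocksize=$((1024 * 1024))
# Assumed size of jdk-8u171-linux-x64.tar.gz (~186 MB); check yours with: stat -c %s <file>
filesize=194990602
# Ceiling division: number of HDFS blocks the file occupies
blocks=$(( (filesize + blocksize - 1) / blocksize ))
echo "$blocks"    # 186 blocks for this assumed size
```

Each of these blocks is replicated across the DataNodes, which you can inspect per-file in the NameNode web UI.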
Setting up the development environment