1. Hadoop HDFS Single Node

Storage model
The following are transparent to the user:
 1 Files are linearly cut into blocks by byte; each block has an id and an offset
 2 Different files may have different sizes
 3 Within one file, every block except the last has the same size
 4 The block size is tuned to the hardware's I/O characteristics
 5 Blocks are scattered across the cluster's nodes; each block has a location
 6 Blocks have replicas; there is no master/slave relation among them, and replicas of a block must not land on the same node
 7 Replication is the key to both reliability and performance
 8 Block size and replica count can be specified at upload time; after upload only the replica count can be changed
 9 Write once, read many; modifying data at the block level is not supported
10 Appending data is supported
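To make the offset arithmetic concrete, here is a minimal sketch (illustrative only, not HDFS source code) of how a file is cut linearly into blocks:

def split_into_blocks(file_size, block_size=128 * 1024 * 1024):
    """Return (block_id, offset, length) tuples; only the last block may be smaller."""
    blocks = []
    offset, block_id = 0, 0
    while offset < file_size:
        length = min(block_size, file_size - offset)  # the last block gets the remainder
        blocks.append((block_id, offset, length))
        offset += length
        block_id += 1
    return blocks

# A 300 MB file with 128 MB blocks -> two full blocks plus one 44 MB tail block
print(split_into_blocks(300 * 1024 * 1024))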
Architecture
-  NameNode
-  DataNode
-  Secondary NameNode
The NameNode manages and stores file metadata and maintains a hierarchical file directory tree.
The mapping between file blocks and actual machines is actively reported by the DataNodes while they maintain their heartbeat with the NameNode.

NameNode data persistence

FsImage + EditLog

Snapshot + journal: the SNN periodically pulls the FsImage and EditLog from the NameNode, applies the logged edits to the snapshot, and pushes the merged snapshot back to the NameNode; edits then accumulate incrementally again.
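A toy model of this checkpoint cycle (all names here are hypothetical, not Hadoop classes): replay the journaled transactions onto the last snapshot, persist the result, and the old log can be discarded:

def checkpoint(fsimage, editlog):
    """Apply the EditLog transactions to the FsImage snapshot and return the merged image."""
    image = dict(fsimage)                  # start from the last snapshot
    for op, path, meta in editlog:         # replay the journal in order
        if op == "create":
            image[path] = meta
        elif op == "delete":
            image.pop(path, None)
    return image                           # persist this; the old EditLog can then be deleted

fsimage = {"/user/root/a.txt": "meta-a"}
editlog = [("create", "/user/root/b.txt", "meta-b"),
           ("delete", "/user/root/a.txt", None)]
print(checkpoint(fsimage, editlog))        # {'/user/root/b.txt': 'meta-b'}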


Block placement policy
1 First replica: on the DataNode doing the upload; if the write comes from outside the cluster, pick a random node whose disk is not too full and whose CPU is not too busy
2 Second replica: on a node on a different rack from the first replica
3 Third replica: on a node on the same rack as the second replica
4 Further replicas: random nodes
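A rough sketch of this policy (illustrative; it assumes at least two racks with at least two nodes each, while the real HDFS chooser also weighs load and free space):

import random

def place_replicas(cluster, writer_dn=None):
    """cluster maps rack -> list of DNs, e.g. {"rack1": ["dn1", "dn2"], ...}."""
    rack_of = {dn: rack for rack, nodes in cluster.items() for dn in nodes}
    # 1st replica: the uploading DN, or a random node for off-cluster clients
    first = writer_dn or random.choice(list(rack_of))
    # 2nd replica: a node on a different rack than the first
    other_rack = random.choice([r for r in cluster if r != rack_of[first]])
    second = random.choice(cluster[other_rack])
    # 3rd replica: another node on the same rack as the second
    third = random.choice([dn for dn in cluster[other_rack] if dn != second])
    return [first, second, third]

print(place_replicas({"rack1": ["dn1", "dn2"], "rack2": ["dn3", "dn4"]}, writer_dn="dn1"))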

 

Safe mode
 1 When HDFS is set up it is formatted; formatting produces an FsImage plus some information about the current cluster
 2 When the NameNode starts, it reads the EditLog and FsImage from disk
 3 It applies all transactions in the EditLog to the FsImage loaded in memory
 4 It saves this new version of the FsImage back to local disk
 5 It then deletes the old EditLog
 6 After starting, the NameNode enters a special state called safe mode
 7 A NameNode in safe mode does not replicate data blocks
 8 The NameNode receives heartbeats and block reports from all DataNodes
 9 Whenever the NameNode confirms that a block's replica count has reached the configured minimum, that block is considered safely replicated
10 Once a configured percentage of blocks has been confirmed (plus an extra 30 s), the NameNode exits safe mode
11 It then determines which blocks are still under-replicated and copies them to other DataNodes
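A toy check of this exit rule (the threshold and extension defaults match the format log later in this post: dfs.namenode.safemode.threshold-pct = 0.999, extension = 30 s):

def can_leave_safemode(safe_blocks, total_blocks, secs_since_threshold,
                       threshold_pct=0.999, extension_secs=30):
    """A block counts as safe once its replica count reaches the configured minimum."""
    if total_blocks == 0:
        return True
    reached = safe_blocks / total_blocks >= threshold_pct
    return reached and secs_since_threshold >= extension_secs

print(can_leave_safemode(9990, 10000, 31))   # True: 99.9% of blocks safe, 30 s elapsed
print(can_leave_safemode(9989, 10000, 120))  # False: still below the threshold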

 

HDFS read flow

 1 To reduce overall bandwidth consumption and read latency, HDFS tries to serve each read from the replica closest to the reader.
 2 If a replica exists on the same rack as the reading program, that replica is read.
 3 If an HDFS cluster spans multiple data centers, the client likewise prefers the replica in its local data center.
 4 Semantics: downloading a file
 5   1. The client asks the NN for the file's metadata and gets the FileBlockLocations
 6   2. The NN returns the locations sorted by distance
 7   3. The client downloads each block and verifies data integrity against the checksums
 8 Semantics: since downloading a file is really fetching the metadata of all its blocks, fetching only a subset of blocks must also work
 9   1. HDFS lets the client supply a file offset and choose which blocks' DNs to connect to, fetching data on its own terms; this is the core of the compute layer's divide-and-conquer, parallel computation.
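These ranged-read semantics can be tried directly with the Python hdfs client used later in this post; offset and length are documented parameters of Client.read (the file path below is hypothetical):

from hdfs.client import InsecureClient

client = InsecureClient("http://node01:50070", user="ace")

# Fetch only bytes [1024, 1024 + 4096) of the file: the NN resolves which block
# covers this range, and the client then talks directly to one of that block's DNs.
with client.read("/user/ace/data.txt", offset=1024, length=4096) as reader:
    chunk = reader.read()
print(len(chunk))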


HDFS write flow

 1 The client contacts the NN to create the file's metadata
 2 The NN checks whether the metadata is valid
 3 The NN runs the replica placement policy and returns an ordered list of DNs
 4 The client establishes a pipeline connection with the DNs
 5 The client cuts the block into packets (64 KB), built from 512 B chunks each followed by a 4 B checksum
 6 The client puts each packet into a send queue (dataqueue) and sends it to the first DN
 7 The first DN stores the packet locally and forwards it to the second DN
 8 The second DN stores the packet locally and forwards it to the third DN
 9 While this happens, the upstream node is already sending the next packet
10 With this style of transfer, the replica count is transparent to the client
11 When the block transfer completes, each DN reports to the NN on its own, while the client moves on to the next block
12 So the client's transfer and the DNs' block reports proceed in parallel
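The chunk/checksum arithmetic can be illustrated with a short sketch (real HDFS packets also carry header fields such as sequence numbers; this shows only the 512 B chunk + 4 B checksum layout):

import struct
import zlib

CHUNK = 512                 # bytes of data per chunk
PACKET_DATA = 64 * 1024     # ~64 KB of payload per packet

def build_packets(block_data):
    """Cut a block into packets, each prefixed with one 4-byte CRC32 per 512 B chunk."""
    packets = []
    for p in range(0, len(block_data), PACKET_DATA):
        payload = block_data[p:p + PACKET_DATA]
        checksums = b"".join(
            struct.pack(">I", zlib.crc32(payload[c:c + CHUNK]))
            for c in range(0, len(payload), CHUNK))
        packets.append(checksums + payload)
    return packets

packets = build_packets(b"x" * (200 * 1024))   # a 200 KB "block" -> 4 packets
print(len(packets), len(packets[0]))           # 4 66048 (64 KB payload + 512 B of checksums)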


Installation tutorial

Prerequisites:
CentOS-7-x86_64-DVD-1908.iso
Parallels Desktop 14.1.3


echo "node01" > /etc/hostname

echo  -e '
DEVICE="eth0"
IPV6INIT="yes"
BOOTPROTO="static"
ONBOOT="yes"
DNS1=114.114.114.114
IPADDR=192.168.0.201
GATEWAY=192.168.0.1 '> /etc/sysconfig/network-scripts/ifcfg-eth0

service iptables stop
systemctl stop firewalld
echo -e '
SELINUX=disabled
SELINUXTYPE=targeted
' > /etc/selinux/config
chkconfig iptables off

wget -O /etc/yum.repos.d/CentOS-7.repo http://mirrors.aliyun.com/repo/Centos-7.repo
yum makecache
yum install ntp -y
echo 'server ntp1.aliyun.com' >> /etc/ntp.conf
yum install rsync java-devel java -y

su ace
cd ~
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
chmod 700 -R ~/.ssh
chmod 600 ~/.ssh/authorized_keys
ssh-copy-id -i ~/.ssh/id_dsa.pub ace@node02


export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.252.b09-2.el7_8.x86_64/jre    # (installed by yum; find the exact path with ll)
export HADOOP_HOME=/var/bigdata/hadoop-3.1.3
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# Configure hadoop-env.sh
cp $HADOOP_HOME/etc/hadoop/hadoop-env.sh $HADOOP_HOME/etc/hadoop/hadoop-env.sh.bak

echo -e "export JAVA_HOME=$JAVA_HOME
export HADOOP_OS_TYPE=\${HADOOP_OS_TYPE:-\$(uname -s)}
" > $HADOOP_HOME/etc/hadoop/hadoop-env.sh    # the JAVA_HOME written here must be an absolute path

# Hadoop must have JAVA_HOME configured explicitly; otherwise it cannot be found when the scripts ssh into the node
cp $HADOOP_HOME/etc/hadoop/hdfs-site.xml $HADOOP_HOME/etc/hadoop/hdfs-site.xml.bak
echo -e '<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/var/bigdata/hadoop/local/dfs/name</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>node01:50090</value>
    </property>
    <property>
        <name>dfs.http.address</name>
        <value>0.0.0.0:50070</value>
    </property>
    <property>
        <name>dfs.namenode.checkpoint.dir</name>
        <value>/var/bigdata/hadoop/local/dfs/namesecondary</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/var/bigdata/hadoop/local/dfs/data</value>
    </property>
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>*</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
</configuration>

' > $HADOOP_HOME/etc/hadoop/hdfs-site.xml
# the directories are created automatically
cp $HADOOP_HOME/etc/hadoop/core-site.xml $HADOOP_HOME/etc/hadoop/core-site.xml.bak
echo -e '<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://node01:9000</value>
    </property>
</configuration>
' > $HADOOP_HOME/etc/hadoop/core-site.xml
# later, Spark fetches data through this address

echo -e 'node01
' > $HADOOP_HOME/etc/hadoop/workers    # note: Hadoop 3.x renamed this file from slaves to workers


From 2.x onward the default NameNode RPC port is 8020 (9000 is set explicitly above)


[root@node01 hadoop-3.1.3]# tree /var/bigdata/ -d -L 3
/var/bigdata/
└── hadoop-3.1.3
    ├── bin
    ├── etc
    │   └── hadoop
    ├── include
    ├── lib
    │   └── native
    ├── libexec
    │   ├── shellprofile.d
    │   └── tools
    ├── logs
    ├── sbin
    │   └── FederationStateStore
    └── share
        ├── doc
        └── hadoop

16 directories
[root@node01 hadoop-3.1.3]# hdfs namenode -format
2020-06-25 15:08:19,747 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = node01/192.168.0.201
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 3.1.3

STARTUP_MSG:   java = 1.8.0_252
************************************************************/
2020-06-25 15:08:19,754 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
2020-06-25 15:08:19,822 INFO namenode.NameNode: createNameNode [-format]
2020-06-25 15:08:20,078 INFO common.Util: Assuming 'file' scheme for path /var/bigdata/hadoop/local/dfs/name in configuration.
2020-06-25 15:08:20,078 INFO common.Util: Assuming 'file' scheme for path /var/bigdata/hadoop/local/dfs/name in configuration.
Formatting using clusterid: CID-5e758c42-c5b0-45a4-8180-7bae30092eef
2020-06-25 15:08:20,106 INFO namenode.FSEditLog: Edit logging is async:true
2020-06-25 15:08:20,116 INFO namenode.FSNamesystem: KeyProvider: null
2020-06-25 15:08:20,117 INFO namenode.FSNamesystem: fsLock is fair: true
2020-06-25 15:08:20,117 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
2020-06-25 15:08:20,125 INFO namenode.FSNamesystem: fsOwner             = root (auth:SIMPLE)
2020-06-25 15:08:20,125 INFO namenode.FSNamesystem: supergroup          = supergroup
2020-06-25 15:08:20,125 INFO namenode.FSNamesystem: isPermissionEnabled = true
2020-06-25 15:08:20,125 INFO namenode.FSNamesystem: HA Enabled: false
2020-06-25 15:08:20,154 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2020-06-25 15:08:20,164 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=1000, counted=60, effected=1000
2020-06-25 15:08:20,164 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
2020-06-25 15:08:20,167 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
2020-06-25 15:08:20,167 INFO blockmanagement.BlockManager: The block deletion will start around 2020 Jun 25 15:08:20
2020-06-25 15:08:20,168 INFO util.GSet: Computing capacity for map BlocksMap
2020-06-25 15:08:20,168 INFO util.GSet: VM type       = 64-bit
2020-06-25 15:08:20,169 INFO util.GSet: 2.0% max memory 409 MB = 8.2 MB
2020-06-25 15:08:20,169 INFO util.GSet: capacity      = 2^20 = 1048576 entries
2020-06-25 15:08:20,173 INFO blockmanagement.BlockManager: dfs.block.access.token.enable = false
2020-06-25 15:08:20,177 INFO Configuration.deprecation: No unit for dfs.namenode.safemode.extension(30000) assuming MILLISECONDS
2020-06-25 15:08:20,178 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2020-06-25 15:08:20,178 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes = 0
2020-06-25 15:08:20,178 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension = 30000
2020-06-25 15:08:20,178 INFO blockmanagement.BlockManager: defaultReplication         = 3
2020-06-25 15:08:20,178 INFO blockmanagement.BlockManager: maxReplication             = 512
2020-06-25 15:08:20,178 INFO blockmanagement.BlockManager: minReplication             = 1
2020-06-25 15:08:20,178 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
2020-06-25 15:08:20,178 INFO blockmanagement.BlockManager: redundancyRecheckInterval  = 3000ms
2020-06-25 15:08:20,178 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
2020-06-25 15:08:20,178 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
2020-06-25 15:08:20,193 INFO namenode.FSDirectory: GLOBAL serial map: bits=24 maxEntries=16777215
2020-06-25 15:08:20,206 INFO util.GSet: Computing capacity for map INodeMap
2020-06-25 15:08:20,206 INFO util.GSet: VM type       = 64-bit
2020-06-25 15:08:20,206 INFO util.GSet: 1.0% max memory 409 MB = 4.1 MB
2020-06-25 15:08:20,206 INFO util.GSet: capacity      = 2^19 = 524288 entries
2020-06-25 15:08:20,207 INFO namenode.FSDirectory: ACLs enabled? false
2020-06-25 15:08:20,207 INFO namenode.FSDirectory: POSIX ACL inheritance enabled? true
2020-06-25 15:08:20,207 INFO namenode.FSDirectory: XAttrs enabled? true
2020-06-25 15:08:20,207 INFO namenode.NameNode: Caching file names occurring more than 10 times
2020-06-25 15:08:20,210 INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: false, skipCaptureAccessTimeOnlyChange: false, snapshotDiffAllowSnapRootDescendant: true, maxSnapshotLimit: 65536
2020-06-25 15:08:20,212 INFO snapshot.SnapshotManager: SkipList is disabled
2020-06-25 15:08:20,214 INFO util.GSet: Computing capacity for map cachedBlocks
2020-06-25 15:08:20,214 INFO util.GSet: VM type       = 64-bit
2020-06-25 15:08:20,214 INFO util.GSet: 0.25% max memory 409 MB = 1.0 MB
2020-06-25 15:08:20,214 INFO util.GSet: capacity      = 2^17 = 131072 entries
2020-06-25 15:08:20,220 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
2020-06-25 15:08:20,220 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
2020-06-25 15:08:20,220 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
2020-06-25 15:08:20,224 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
2020-06-25 15:08:20,224 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2020-06-25 15:08:20,225 INFO util.GSet: Computing capacity for map NameNodeRetryCache
2020-06-25 15:08:20,225 INFO util.GSet: VM type       = 64-bit
2020-06-25 15:08:20,225 INFO util.GSet: 0.029999999329447746% max memory 409 MB = 125.6 KB
2020-06-25 15:08:20,225 INFO util.GSet: capacity      = 2^14 = 16384 entries
2020-06-25 15:08:20,250 INFO namenode.FSImage: Allocated new BlockPoolId: BP-906661162-192.168.0.201-1593068900245
2020-06-25 15:08:20,266 INFO common.Storage: Storage directory /var/bigdata/hadoop/local/dfs/name has been successfully formatted.
2020-06-25 15:08:20,284 INFO namenode.FSImageFormatProtobuf: Saving image file /var/bigdata/hadoop/local/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
2020-06-25 15:08:20,339 INFO namenode.FSImageFormatProtobuf: Image file /var/bigdata/hadoop/local/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 391 bytes saved in 0 seconds .
2020-06-25 15:08:20,351 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
2020-06-25 15:08:20,355 INFO namenode.FSImage: FSImageSaver clean checkpoint: txid = 0 when meet shutdown.
2020-06-25 15:08:20,355 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at node01/192.168.0.201
************************************************************/
[root@node01 hadoop-3.1.3]# tree /var/bigdata/ -d -L 3
/var/bigdata/
├── hadoop
│   └── local
│       └── dfs
└── hadoop-3.1.3
    ├── bin
    ├── etc
    │   └── hadoop
    ├── include
    ├── lib
    │   └── native
    ├── libexec
    │   ├── shellprofile.d
    │   └── tools
    ├── logs
    ├── sbin
    │   └── FederationStateStore
    └── share
        ├── doc
        └── hadoop

19 directories
[root@node01 hadoop-3.1.3]# cat /var/bigdata/hadoop/local/dfs/name/current/VERSION
#Thu Jun 25 15:08:20 CST 2020
namespaceID=868789271
clusterID=CID-5e758c42-c5b0-45a4-8180-7bae30092eef
cTime=1593068900245
storageType=NAME_NODE
blockpoolID=BP-906661162-192.168.0.201-1593068900245
layoutVersion=-64



[ace@node01 ~]$ start-dfs.sh
Starting namenodes on [node01]
Starting datanodes
Starting secondary namenodes [node01]


http://node01:50070 serves the cluster web UI


Configuration files
hadoop-env.sh       sets JAVA_HOME
core-site.xml       fs.defaultFS, the cluster's entry address
hdfs-site.xml       HDFS-specific settings
workers             the worker nodes; determines which machines get a data directory (named slaves in 2.x)


Basic commands
Run hdfs dfs by itself to see the available syntax
hdfs dfs -mkdir /user/root
hdfs dfs -put 123.txt /user/root

hdfs dfs -D dfs.blocksize=1048576 -put data.txt       [path; defaults to the user's home directory — note the -D value must be a literal byte count, not an expression like 1024*1024]
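To confirm the block size a file was actually written with, one option is a WebHDFS status call via the same Python hdfs library used in the script below (the path is hypothetical); hdfs dfs -stat '%o %r' <path> should print the same information on the command line:

from hdfs.client import InsecureClient

client = InsecureClient("http://node01:50070", user="ace")
status = client.status("/user/root/data.txt")   # WebHDFS GETFILESTATUS
print(status["blockSize"], status["length"], status["replication"])

The full client script follows.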
#!coding:utf-8
from hdfs.client import Client, InsecureClient


# HADOOP_USER_NAME=ACE

# The Python API for operating on HDFS is documented at:
# https://hdfscli.readthedocs.io/en/latest/api.html
# Read an HDFS file's contents, returning its lines in a list
def read_hdfs_file(client, filename):
    # with client.read('samples.csv', encoding='utf-8', delimiter='\n') as reader:
    #  for line in reader:
    # pass
    lines = []
    with client.read(filename, encoding='utf-8', delimiter='\n') as reader:
        for line in reader:
            lines.append(line.strip())
    return lines


# Create a directory
def mkdirs(client, hdfs_path):
    client.makedirs(hdfs_path)


# Delete an HDFS file or directory
def delete_hdfs_file(client, hdfs_path):
    client.delete(hdfs_path, recursive=True)


# Upload a local file to HDFS
def put_to_hdfs(client, local_path, hdfs_path):
    client.upload(hdfs_path, local_path, cleanup=True)


# Download an HDFS file to the local filesystem
def get_from_hdfs(client, hdfs_path, local_path):
    client.download(hdfs_path, local_path, overwrite=False)


# Append data to an HDFS file
def append_to_hdfs(client, hdfs_path, data):
    client.write(hdfs_path, data, overwrite=False, append=True)


# Overwrite an HDFS file with data
def write_to_hdfs(client, hdfs_path, data):
    client.write(hdfs_path, data, overwrite=True, append=False)


# Move or rename a file
def move_or_rename(client, hdfs_src_path, hdfs_dst_path):
    client.rename(hdfs_src_path, hdfs_dst_path)


# Return the files under a directory
def list(client, hdfs_path):
    return client.list(hdfs_path, status=False)


url = "http://node01:50070"
client = InsecureClient(url, user='ace')
# client = Client(url)

# move_or_rename(client,'/input/2.csv', '/input/emp.csv')
# read_hdfs_file(client,'/input/emp.csv')
# put_to_hdfs(client,'/home/shutong/hdfs/1.csv','/input/')
# append_to_hdfs(client,'/input/emp.csv','我爱你'+'\n')
# write_to_hdfs(client,'/input/emp.csv','我爱你'+'\n')
# read_hdfs_file(client,'/input/emp.csv')
# move_or_rename(client,'/input/emp.csv', '/input/2.csv')
# mkdirs(client,'/input/python')
delete_hdfs_file(client, '/tmp')
print(list(client, '/'))
# chown(client,'/input/1.csv', 'root')


Questions to think about:
1. Do DistributedFileSystem and FSDataOutputStream have to be installed on the client node?
2. Who consumes the DN list the NameNode returns, and could it be maliciously tampered with?
3. How does HDFS interact with programs?
4. What is the difference between the client node and a DataNode?
   Business programs generally run on DataNodes, i.e. the client node is a DataNode.

5. As which ______?______ identity does it log in to the other machines? That is decided by the identity that starts HDFS.

 

[root@node01 ~]# start-dfs.sh
Starting namenodes on [node01]
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
Starting datanodes
ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
Starting secondary namenodes [node01]
ERROR: Attempting to operate on hdfs secondarynamenode as root
ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.
The problem when starting as root

 

If start-dfs.sh is run by root, extra configuration is needed: define the HDFS_NAMENODE_USER, HDFS_DATANODE_USER and HDFS_SECONDARYNAMENODE_USER variables named in the errors above (e.g. in hadoop-env.sh).
This tutorial therefore uses the ace user ---> ace logs in to the ace user on the other machines,
so passwordless SSH must be configured for it.

6. The NN is a single point of failure and is under memory pressure.


Problems encountered during installation
hadoop3: mkdir: cannot create directory `/usr/local/hadoop/bin/../logs': Permission denied
Fix: chown -R ace:ace hadoop-3.1.3/


1. When some DataNodes go down, HDFS enters safe mode and performs some self-repair (re-replicating the lost blocks)
2. When the proportion of lost data blocks exceeds a certain threshold, it likewise enters safe mode

 

hdfs dfsadmin -safemode leave    force an exit from safe mode

hdfs dfsadmin -safemode enter    enter safe mode manually

hdfs dfsadmin -safemode get      query the current safe mode state

hdfs dfsadmin -safemode wait     block until safe mode is exited

 

timeout = 10 × heartbeat interval + 2 × heartbeat recheck interval
heartbeat interval: 3 seconds
heartbeat recheck interval: 5 minutes

timeout = 10 × 3 s + 2 × 300 s = 630 s, i.e. a DataNode is declared dead after 630 seconds without a heartbeat


Continued in the next post.
