[Tech] HDFS Architecture

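HDFS-NameNode: name node
(1) Responsibilities: manages the nodes of the HDFS cluster, acting as the master node.
        Receives client requests (command line, Java): creating directories, uploading data, downloading data, and deleting data.
        Manages and maintains the HDFS edit logs and metadata.
(2) Contents of the dfs/name folder
    a. current: mainly stores the edit logs and metadata
    (storage path: /opt/moudle/hadoop-2.7.3/tmp/dfs/name/current)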


      edits file: a binary file that records the most recent operations, so it reflects the latest state of HDFS
      [root@bigdata121 current]# cd /opt/moudle/hadoop-2.7.3/tmp/dfs/name/current/
      [root@bigdata121 current]# ll
                                total 3116
                                -rw-r--r--. 1 root root      42 May  7 11:51 edits_0000000000000000001-0000000000000000002
                                -rw-r--r--. 1 root root 1048576 May  7 11:51 edits_0000000000000000003-0000000000000000003
                                -rw-r--r--. 1 root root      42 May  7 13:13 edits_0000000000000000004-0000000000000000005
                                -rw-r--r--. 1 root root      42 May  7 15:20 edits_0000000000000000006-0000000000000000007
                                -rw-r--r--. 1 root root      42 May  7 16:20 edits_0000000000000000008-0000000000000000009
                                -rw-r--r--. 1 root root      42 May  7 17:20 edits_0000000000000000010-0000000000000000011
                                -rw-r--r--. 1 root root 1048576 May  7 17:20 edits_0000000000000000012-0000000000000000012
                                -rw-r--r--. 1 root root 1048576 May  9 09:56 edits_inprogress_0000000000000000013
                                -rw-r--r--. 1 root root     351 May  7 17:20 fsimage_0000000000000000011
                                -rw-r--r--. 1 root root      62 May  7 17:20 fsimage_0000000000000000011.md5
                                -rw-r--r--. 1 root root     351 May  9 09:56 fsimage_0000000000000000012
                                -rw-r--r--. 1 root root      62 May  9 09:56 fsimage_0000000000000000012.md5
                                -rw-r--r--. 1 root root       3 May  9 09:56 seen_txid
                                -rw-r--r--. 1 root root     205 May  9 09:56 VERSION
      [root@bigdata121 current]# hdfs oev -i edits_inprogress_0000000000000000013 -o ~/a.xml
                                (oev: Offline Edits Viewer; inprogress marks the edits segment that is currently being written, i.e. the newest one.)
      [root@bigdata121 current]# cat ~/a.xml
                                <?xml version="1.0" encoding="UTF-8"?>
                                <EDITS>
                                  <EDITS_VERSION>-63</EDITS_VERSION>
                                  <RECORD>
                                    <OPCODE>OP_START_LOG_SEGMENT</OPCODE>
                                    <DATA>
                                      <TXID>13</TXID>
                                    </DATA>
                                  </RECORD>
                                </EDITS>
      [root@bigdata121 current]# hdfs dfs -mkdir /input
      [root@bigdata121 current]# hdfs oev -i edits_inprogress_0000000000000000013 -o ~/a.xml
      [root@bigdata121 current]# cat ~/a.xml
                                <?xml version="1.0" encoding="UTF-8"?>
                                <EDITS>
                                  <EDITS_VERSION>-63</EDITS_VERSION>
                                  <RECORD>
                                    <OPCODE>OP_START_LOG_SEGMENT</OPCODE>
                                    <DATA>
                                      <TXID>13</TXID>
                                    </DATA>
                                  </RECORD>
                                  <RECORD>
                                    <OPCODE>OP_MKDIR</OPCODE>
                                    <DATA>
                                      <TXID>14</TXID>
                                      <LENGTH>0</LENGTH>
                                      <INODEID>16386</INODEID>
                                      <PATH>/input</PATH>
                                      <TIMESTAMP>1557368168847</TIMESTAMP>
                                      <PERMISSION_STATUS>
                                        <USERNAME>root</USERNAME>
                                        <GROUPNAME>supergroup</GROUPNAME>
                                        <MODE>493</MODE>
                                      </PERMISSION_STATUS>
                                    </DATA>
                                  </RECORD>
                                </EDITS>
      [root@bigdata121 current]# ll
                                total 3120
                                -rw-r--r--. 1 root root      42 May  7 11:51 edits_0000000000000000001-0000000000000000002
                                -rw-r--r--. 1 root root 1048576 May  7 11:51 edits_0000000000000000003-0000000000000000003
                                -rw-r--r--. 1 root root      42 May  7 13:13 edits_0000000000000000004-0000000000000000005
                                -rw-r--r--. 1 root root      42 May  7 15:20 edits_0000000000000000006-0000000000000000007
                                -rw-r--r--. 1 root root      42 May  7 16:20 edits_0000000000000000008-0000000000000000009
                                -rw-r--r--. 1 root root      42 May  7 17:20 edits_0000000000000000010-0000000000000000011
                                -rw-r--r--. 1 root root 1048576 May  7 17:20 edits_0000000000000000012-0000000000000000012
                                -rw-r--r--. 1 root root     114 May  9 10:21 edits_0000000000000000013-0000000000000000015
                                -rw-r--r--. 1 root root 1048576 May  9 10:21 edits_inprogress_0000000000000000016
                                -rw-r--r--. 1 root root     351 May  9 09:56 fsimage_0000000000000000012
                                -rw-r--r--. 1 root root      62 May  9 09:56 fsimage_0000000000000000012.md5
                                -rw-r--r--. 1 root root     428 May  9 10:21 fsimage_0000000000000000015
                                -rw-r--r--. 1 root root      62 May  9 10:21 fsimage_0000000000000000015.md5
                                -rw-r--r--. 1 root root       3 May  9 10:21 seen_txid
                                -rw-r--r--. 1 root root     205 May  9 09:56 VERSION
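
The XML produced by `hdfs oev` can also be read programmatically. The following is only a minimal sketch using Python's standard library: the sample is copied from the output above (XML declaration omitted), and the function name `summarize_edits` is made up for this illustration. It also shows that the decimal MODE value 493 is the familiar octal permission 755.

```python
import xml.etree.ElementTree as ET

# Copy of the `hdfs oev` output shown above (XML declaration omitted).
EDITS_XML = """<EDITS>
  <EDITS_VERSION>-63</EDITS_VERSION>
  <RECORD>
    <OPCODE>OP_START_LOG_SEGMENT</OPCODE>
    <DATA>
      <TXID>13</TXID>
    </DATA>
  </RECORD>
  <RECORD>
    <OPCODE>OP_MKDIR</OPCODE>
    <DATA>
      <TXID>14</TXID>
      <LENGTH>0</LENGTH>
      <INODEID>16386</INODEID>
      <PATH>/input</PATH>
      <TIMESTAMP>1557368168847</TIMESTAMP>
      <PERMISSION_STATUS>
        <USERNAME>root</USERNAME>
        <GROUPNAME>supergroup</GROUPNAME>
        <MODE>493</MODE>
      </PERMISSION_STATUS>
    </DATA>
  </RECORD>
</EDITS>
"""


def summarize_edits(xml_text):
    """Return one dict per RECORD: txid, opcode, and (if present) path and octal mode."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter("RECORD"):
        data = record.find("DATA")
        entry = {
            "txid": int(data.findtext("TXID")),
            "opcode": record.findtext("OPCODE"),
        }
        path = data.findtext("PATH")
        if path is not None:
            entry["path"] = path
        mode = data.findtext("PERMISSION_STATUS/MODE")
        if mode is not None:
            # MODE is written in decimal; permissions are usually read in octal.
            entry["mode"] = format(int(mode), "o")
        records.append(entry)
    return records


records = summarize_edits(EDITS_XML)
for entry in records:
    print(entry)
```

Each transaction in the edits file becomes one entry, so the mkdir of /input shows up as transaction 14 with mode 755 (rwxr-xr-x).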
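    b. Metadata file fsimage (also under current): a binary file holding a snapshot of the namespace metadata (the directory tree, the blocks that make up each file, and each file's replication factor). It does not reflect the latest state of HDFS. Block locations are not persisted in it; the DataNodes report them to the NameNode at runtime.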
      [root@bigdata121 current]# hdfs oiv -i fsimage_0000000000000000015 -o ~/b.xml -p XML
      [root@bigdata121 current]# cat ~/b.xml
                                <?xml version="1.0"?>
                                <fsimage><NameSection>
                                <genstampV1>1000</genstampV1><genstampV2>1000</genstampV2><genstampV1Limit>0</genstampV1Limit><lastAllocatedBlockId>1073741824</lastAllocatedBlockId><txid>15</txid></NameSection>
                                <INodeSection><lastInodeId>16386</lastInodeId><inode><id>16385</id><type>DIRECTORY</type><name></name><mtime>1557368168847</mtime><permission>root:supergroup:rwxr-xr-x</permission><nsquota>9223372036854775807</nsquota><dsquota>-1</dsquota></inode>
                                <inode><id>16386</id><type>DIRECTORY</type><name>input</name><mtime>1557368168847</mtime><permission>root:supergroup:rwxr-xr-x</permission><nsquota>-1</nsquota><dsquota>-1</dsquota></inode>
                                </INodeSection>
                                <INodeReferenceSection></INodeReferenceSection><SnapshotSection><snapshotCounter>0</snapshotCounter></SnapshotSection>
                                <INodeDirectorySection><directory><parent>16385</parent><inode>16386</inode></directory>
                                </INodeDirectorySection>
                                <FileUnderConstructionSection></FileUnderConstructionSection>
                                <SnapshotDiffSection><diff><inodeid>16385</inodeid></diff></SnapshotDiffSection>
                                <SecretManagerSection><currentId>0</currentId><tokenSequenceNumber>0</tokenSequenceNumber></SecretManagerSection><CacheManagerSection><nextDirectiveId>1</nextDirectiveId></CacheManagerSection>
                                </fsimage>
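
To see how the fsimage encodes the directory tree, the INodeSection (inode id to name) and the INodeDirectorySection (parent to child links) can be joined back into full paths. This is only a sketch against a trimmed copy of the XML above; the helper name `inode_paths` is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the `hdfs oiv -p XML` output above: only the fields used here.
FSIMAGE_XML = """<fsimage>
<INodeSection><lastInodeId>16386</lastInodeId>
<inode><id>16385</id><type>DIRECTORY</type><name></name></inode>
<inode><id>16386</id><type>DIRECTORY</type><name>input</name></inode>
</INodeSection>
<INodeDirectorySection><directory><parent>16385</parent><inode>16386</inode></directory>
</INodeDirectorySection>
</fsimage>
"""


def inode_paths(xml_text):
    """Map every inode id to its absolute path by following parent links."""
    root = ET.fromstring(xml_text)

    # inode id -> name (the root directory has an empty name)
    names = {}
    for inode in root.find("INodeSection").findall("inode"):
        names[int(inode.findtext("id"))] = inode.findtext("name") or ""

    # child inode id -> parent inode id
    parent_of = {}
    for directory in root.find("INodeDirectorySection").findall("directory"):
        parent = int(directory.findtext("parent"))
        for child in directory.findall("inode"):
            parent_of[int(child.text)] = parent

    def path_of(inode_id):
        parts = []
        while inode_id in parent_of:
            parts.append(names[inode_id])
            inode_id = parent_of[inode_id]
        return "/" + "/".join(reversed(parts))

    return {inode_id: path_of(inode_id) for inode_id in names}


paths = inode_paths(FSIMAGE_XML)
print(paths)
```

Inode 16385 has no parent entry, so it resolves to the root directory, and inode 16386 resolves to /input, the directory created by the mkdir earlier.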
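    c. in_use.lock: prevents the same storage directory from being used by more than one process, so only one NameNode can be started on it
      (the in_use.lock file is only created after the cluster has been started with start-all.sh)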

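HDFS-DataNode: data node
(1) Mainly used to store data, which is split into fixed-size blocks. Default block size:
    Hadoop 1.x: 64 MB
    Hadoop 2.x: 128 MB (can be changed through dfs.blocksize in hdfs-site.xml)
(2) On disk, each data block appears as a blk file
(location: /opt/moudle/hadoop-2.7.3/tmp/dfs/data/current/BP-933765109-10.1.255.124-1546784436341/current/finalized/subdir0/subdir0)
Hadoop 3.x adds erasure coding, which saves storage space.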
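
The block sizes and the storage saving from erasure coding can be made concrete with a little arithmetic. The sketch below is illustrative only: the 300 MB file, the replication factor of 3, and the Reed-Solomon layout with 6 data units and 3 parity units are assumptions chosen for the example, not values taken from this cluster, and the function names are made up.

```python
MB = 1024 * 1024


def block_count(file_size, block_size=128 * MB):
    """Number of blocks a file occupies; the last block may be only partly filled."""
    return -(-file_size // block_size)  # integer ceiling division


def raw_storage_replicated(file_size, replication=3):
    """Raw bytes stored when every block is kept `replication` times (assumed factor: 3)."""
    return file_size * replication


def raw_storage_erasure_coded(file_size, data_units=6, parity_units=3):
    """Approximate raw bytes with erasure coding (assumed layout: 6 data + 3 parity).

    Ignores padding of the last stripe, so it is a lower bound.
    """
    return file_size * (data_units + parity_units) // data_units


file_size = 300 * MB  # hypothetical file
print(block_count(file_size, 64 * MB))               # blocks with the 1.x default
print(block_count(file_size, 128 * MB))              # blocks with the 2.x default
print(raw_storage_replicated(file_size) // MB)       # MB on disk with 3 replicas
print(raw_storage_erasure_coded(file_size) // MB)    # MB on disk with 6+3 erasure coding
```

Under these assumptions the same file needs half the raw space with erasure coding compared with three full replicas, which is the saving the note above refers to.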

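HDFS-SecondaryNameNode: secondary name node
(1) Merges the edit logs into the fsimage (a checkpoint). A checkpoint is triggered either by a time interval (3600 s by default) or by the amount of accumulated edits: Hadoop 1.x used an edits size of 64 MB, while Hadoop 2.x uses a transaction count (dfs.namenode.checkpoint.txns, 1,000,000 by default).
(2) Once the edits files have been merged into the fsimage, they are no longer needed and can be purged.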
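
The checkpoint idea can be sketched as a toy model. This is not the SecondaryNameNode's actual code: the namespace is reduced to a set of paths, the function names are made up, the thresholds are modeled on the Hadoop 2.x settings dfs.namenode.checkpoint.period and dfs.namenode.checkpoint.txns, and the OP_END_LOG_SEGMENT record at transaction 15 is an assumption (the listing above only shows that the segment ends at 15).

```python
def should_checkpoint(seconds_since_last, uncheckpointed_txns,
                      period_seconds=3600, txn_threshold=1_000_000):
    """Checkpoint when either the time interval or the transaction count is reached."""
    return (seconds_since_last >= period_seconds
            or uncheckpointed_txns >= txn_threshold)


def merge_edits(fsimage_paths, fsimage_txid, edits):
    """Replay edits newer than the fsimage; return (new_paths, new_txid).

    Toy model: the namespace is just a set of paths, and only two opcodes
    change it. Everything else (e.g. segment markers) only advances the txid.
    """
    paths = set(fsimage_paths)
    last_txid = fsimage_txid
    for txid, opcode, path in sorted(edits, key=lambda e: e[0]):
        if txid <= fsimage_txid:
            continue  # already contained in the fsimage
        if opcode == "OP_MKDIR":
            paths.add(path)
        elif opcode == "OP_DELETE":
            paths.discard(path)
        last_txid = txid
    return paths, last_txid


# fsimage_...012 contained only the root directory; edits 13-15 follow it.
edits = [
    (13, "OP_START_LOG_SEGMENT", None),
    (14, "OP_MKDIR", "/input"),
    (15, "OP_END_LOG_SEGMENT", None),  # assumed, not shown in the transcript
]
new_paths, new_txid = merge_edits({"/"}, 12, edits)
print(sorted(new_paths), new_txid)
```

In this model the merged result corresponds to fsimage_0000000000000000015 from the listing: it contains /input, and the edits up to transaction 15 are no longer needed to reconstruct the namespace.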

posted @ 2019-05-10 10:26  南简