HBase Study Notes
Official site: http://hbase.apache.org/
I. Overview
HBase is an open-source, distributed, versioned, non-relational database: a distributed, column-oriented storage system built on top of HDFS. Use HBase when you need random, real-time read/write access to very large datasets. HBase originated from Bigtable, one of Google's three foundational papers.
Features
- Huge capacity: a single table can hold billions of rows and millions of columns; the supported data volume scales elastically in both the row and column dimensions.
- Column-oriented: storage and access control are organized per column family, and each column family can be scanned independently. Because data is laid out by column (family), a query that touches only a few fields reads far less data.
- Multi-versioned data: every cell can hold several versions of its value; by default the version number is assigned automatically and is the timestamp at which the cell was inserted.
- Sparse structure: NULL columns occupy no storage at all, so tables can be designed to be extremely sparse.
- Scalability: the storage layer is HDFS, which scales out easily.
- High reliability: the WAL mechanism ensures that writes are not lost when the cluster misbehaves, and the replication mechanism protects data from loss or corruption even under severe cluster failures; on top of that, HBase stores its files in HDFS, which keeps its own replicas.
- High performance: the underlying LSM-tree structure and the sorted RowKey layout give HBase very high write throughput, while region splitting, the primary-key index, and caching provide solid random-read performance over massive datasets; lookups by RowKey can reach millisecond latency (see the Java sketch after this list).
- Schema-free: every row has a sortable primary key and any number of columns; columns can be added on the fly, and different rows of the same table may have completely different columns.
- Single data type: all values stored in HBase are plain byte strings; there are no typed columns.
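To make the rowkey-ordered access pattern concrete, here is a minimal Java sketch; the table name 'user', family 'cf1', and row keys are hypothetical, and the connection setup matches the Java API section later in these notes.
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class RowKeyAccessSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("user"))) { // hypothetical table
            // Point read: the RowKey is the primary index, so this is a millisecond-level lookup.
            Result one = table.get(new Get(Bytes.toBytes("row-0042")));
            System.out.println("point get empty? " + one.isEmpty());
            // Range read: RowKeys are stored sorted, so [startRow, stopRow) is a sequential scan.
            Scan scan = new Scan().withStartRow(Bytes.toBytes("row-0000"))
                                  .withStopRow(Bytes.toBytes("row-0100"));
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result r : scanner) {
                    System.out.println(Bytes.toString(r.getRow()));
                }
            }
        }
    }
}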
Terminology
- Row Key: as in other NoSQL databases, the Row Key is the primary key used to retrieve records.
- Column Family: a group of columns. Column families are part of the table schema (individual columns are not) and must be defined before the table is used.
- Timestamp: the storage unit identified in HBase by a Row and a Column is called a Cell, and each Cell keeps multiple versions of the same piece of data. Versions are indexed by timestamp, a 64-bit integer. The timestamp can be assigned by HBase automatically at write time, in which case it is the current system time with millisecond precision, or it can be set explicitly by the client; an application that wants to avoid version conflicts must then generate unique timestamps itself. Within a Cell, versions are sorted by timestamp in descending order, newest first (see the sketch after this list).
- Cell: the unit uniquely identified by {row key, column (= <family> + <label>), version}. Cell data has no type; it is stored entirely as raw bytes.
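A minimal Java sketch of this version model; it assumes a table 't1' already exists with VERSIONS >= 3 on family 'cf1'. Two writes go to the same cell, then a single Get reads back up to three timestamped versions, newest first.
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class CellVersionsSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("t1"))) { // assumed: VERSIONS >= 3 on 'cf1'
            byte[] row = Bytes.toBytes("r1"), cf = Bytes.toBytes("cf1"), col = Bytes.toBytes("name");
            table.put(new Put(row).addColumn(cf, col, Bytes.toBytes("v1"))); // older version
            table.put(new Put(row).addColumn(cf, col, Bytes.toBytes("v2"))); // newer version
            // Ask for up to 3 versions; cells come back ordered by timestamp, newest first.
            Result result = table.get(new Get(row).readVersions(3));
            for (Cell cell : result.getColumnCells(cf, col)) {
                System.out.println(cell.getTimestamp() + " -> " + Bytes.toString(CellUtil.cloneValue(cell)));
            }
        }
    }
}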
Differences Between HBase and Relational Databases
Data types: everything stored in HBase is a string (bytes), whereas an RDBMS has typed columns.
Data operations: HBase offers only simple create/read/update/delete operations; there are no joins across tables.
Storage model: HBase storage is column-oriented, whereas an RDBMS stores data row by row.
Use cases: HBase is suited to very large datasets that still need highly efficient (RowKey-based) queries.
II. HBase Architecture
Architecture Components
HBase runs as a Master/Slave cluster inside the Hadoop ecosystem and consists of the following node types: HMaster nodes, HRegionServer nodes, and a ZooKeeper ensemble. Underneath, it stores its data in HDFS, so the HDFS NameNode and DataNodes are involved as well. The responsibilities break down as follows:

- The HMaster node:
  - Manages the HRegionServers and balances load across them.
  - Manages and assigns HRegions, e.g. assigning new HRegions after a split, or migrating the HRegions of a departed HRegionServer to other HRegionServers.
  - Executes DDL operations (Data Definition Language: creating/altering/dropping namespaces and tables, adding/removing column families, and so on).
  - Manages namespace and table metadata (the data itself is stored in HDFS).
  - Access control (ACLs).
- The HRegionServer node:
  - Hosts and manages its local HRegions.
  - Reads and writes HDFS and manages the data of its tables.
  - Serves client reads and writes directly (the client first looks up, in the hbase:meta table, the HRegion/HRegionServer that owns the RowKey, then talks to that HRegionServer).
- The ZooKeeper ensemble:
  - Provides failover between the HMaster nodes.
  - Stores the cluster-wide metadata and cluster state of HBase.
ZooKeeper
ZooKeeper provides coordination services for the HBase cluster. It tracks the state (available/alive) of the HMaster and the HRegionServers and notifies the HMaster when one of them goes down; this is what enables failover between HMasters and lets the HMaster repair the HRegion set of a crashed HRegionServer by reassigning those HRegions to other HRegionServers. The ZooKeeper ensemble itself keeps the state of its nodes consistent with an atomic-broadcast consensus protocol (Zab, a Paxos-style protocol).

ZooKeeper coordinates the shared state of all cluster nodes. When the HMaster and the HRegionServers connect to ZooKeeper they create ephemeral nodes and keep them alive through heartbeats; if an ephemeral node expires, the HMaster is notified and reacts accordingly.

In addition, the HMaster watches the ephemeral nodes in ZooKeeper (by default under /hbase/rs/*) to detect HRegionServers joining or dying. The first HMaster to connect to ZooKeeper creates an ephemeral node (by default /hbase/master) marking itself as the active HMaster; HMasters that join later watch that node. If the active HMaster dies, the node disappears, the other HMasters are notified, and one of them turns itself into the active HMaster. Before becoming active, each standby HMaster creates its own ephemeral node under /hbase/backup-masters/. These znodes can be inspected directly, as the sketch below shows.
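A minimal sketch using the plain ZooKeeper Java client; the quorum address centos:2181 and the default /hbase parent znode are assumptions matching the setup later in these notes, and the org.apache.zookeeper client dependency is required.
import java.util.List;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class HBaseZnodeSketch {
    public static void main(String[] args) throws Exception {
        // Assumption: ZooKeeper runs on centos:2181 and HBase uses the default /hbase parent znode.
        ZooKeeper zk = new ZooKeeper("centos:2181", 30000, new Watcher() {
            public void process(WatchedEvent event) { /* no-op watcher */ }
        });
        // Ephemeral znodes registered by the live HRegionServers.
        List<String> regionServers = zk.getChildren("/hbase/rs", false);
        System.out.println("live regionservers: " + regionServers);
        // Ephemeral znode held by the active HMaster (present only while a master is active).
        System.out.println("active master znode present: " + (zk.exists("/hbase/master", false) != null));
        zk.close();
    }
}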
HMaster
There is no single point of failure in the HMaster role: several HMasters can be started, and ZooKeeper's master-election mechanism guarantees that only one of them is active at a time while the rest stay in hot standby. Normally two HMasters are started; the standby HMaster periodically talks to the active one to keep its state up to date, so starting many HMasters only adds load on the active HMaster. The HMaster has two main responsibilities:
- Managing and coordinating the HRegionServers
  - Monitoring the state of every HRegionServer in the cluster (through heartbeats and ZooKeeper node watches); the sketch after this list shows how that state looks from the client API.
  - Managing HRegion assignment, including load balancing and reassignment during recovery.
- Admin duties
  - Creating, deleting, and altering table definitions.
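The cluster state that the active HMaster maintains is also observable from the client side. A hedged sketch using Admin.getClusterMetrics() from the HBase 2.x API; the connection configuration is assumed to come from an hbase-site.xml on the classpath.
import org.apache.hadoop.hbase.ClusterMetrics;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class ClusterStateSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
            ClusterMetrics metrics = admin.getClusterMetrics();
            System.out.println("active master : " + metrics.getMasterName());
            System.out.println("backup masters: " + metrics.getBackupMasterNames());
            System.out.println("live servers  : " + metrics.getLiveServerMetrics().keySet());
            System.out.println("dead servers  : " + metrics.getDeadServerNames());
        }
    }
}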
HRegionServer in Detail
An HRegionServer usually runs on the same machine as a DataNode to achieve data locality. It hosts multiple HRegions and is built from the WAL (HLog), MemStores, a BlockCache, and HFiles.

WAL stands for Write Ahead Log (called HLog in early versions). It is a file on HDFS, and, as the name suggests, every write is guaranteed to reach this log before the MemStore is actually updated; if the HRegionServer crashes, we can still replay all operations from the log, so no data is lost. The log is rolled into new files periodically, and old files whose edits have already been persisted to HFiles are deleted. WAL files live under /hbase/WALs/${HRegionServer_Name} (before 0.94, under /hbase/.logs/). An HRegionServer normally has a single WAL instance, so all WAL writes on that server are serialized (much like log4j's log writes), which can become a performance bottleneck; since HBase 1.0, HBASE-5699 added parallel WAL writing (MultiWAL), implemented with multiple HDFS pipelines and partitioned per HRegion.

The BlockCache is a read cache. Following the principle of locality of reference (also used in CPUs, split into spatial locality — if the CPU needs a datum now, it will very likely need nearby data next — and temporal locality — data accessed once is likely to be accessed again soon), it keeps recently read data in memory to speed up reads. HBase ships two BlockCache implementations: the default LruBlockCache (on-heap) and BucketCache (usually off-heap). BucketCache's raw performance is generally below LruBlockCache's, but LruBlockCache's latency becomes unstable under GC pressure, whereas BucketCache manages its own memory and needs no GC, so its latency is usually stable; this is why BucketCache is sometimes the better choice.

An HRegion is the materialization, inside one HRegionServer, of one Region of a Table. A Table has one or more Regions, which may sit on the same HRegionServer or be spread over several; conversely, one HRegionServer hosts many HRegions belonging to different Tables. An HRegion consists of Stores (HStore), one per Column Family of the Table in that HRegion, so each Column Family is a self-contained storage unit; columns with similar IO characteristics are therefore best placed in the same Column Family for efficient reads (data locality improves cache hit rates). HStore is the core of HBase storage: it implements reading and writing against HDFS, and one HStore consists of one MemStore and zero or more StoreFiles.
- The MemStore is a write cache (an in-memory sorted buffer): once a write has been recorded in the WAL, it goes into the MemStore, which flushes data to the underlying HDFS files (HFiles) according to its flush policy. Each Column Family (HStore) of each HRegion normally has its own MemStore.
- HFile (StoreFile) stores the actual HBase data (Cells/KeyValues). Inside an HFile, data is sorted by RowKey, Column Family, and Column; cells identical in all three are ordered by timestamp, newest first. The WAL-first write path is illustrated in the sketch below.
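The WAL-first write path can be tuned per mutation. A minimal sketch, assuming a table 't1' with family 'cf1' exists: the default durability appends to the WAL before the write is acknowledged, while Durability.SKIP_WAL trades that crash safety for write speed.
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class WalDurabilitySketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("t1"))) { // hypothetical table
            byte[] cf = Bytes.toBytes("cf1");
            // Default path: append to the WAL first, then update the MemStore.
            Put safe = new Put(Bytes.toBytes("r1")).addColumn(cf, Bytes.toBytes("c"), Bytes.toBytes("v"));
            safe.setDurability(Durability.SYNC_WAL);
            // Opt out of the WAL: faster, but the edit is lost if the HRegionServer
            // crashes before the MemStore is flushed to an HFile.
            Put fast = new Put(Bytes.toBytes("r2")).addColumn(cf, Bytes.toBytes("c"), Bytes.toBytes("v"));
            fast.setDurability(Durability.SKIP_WAL);
            table.put(safe);
            table.put(fast);
        }
    }
}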
HRegion
HBase splits a table horizontally into HRegions by RowKey. Each HRegion records its StartKey and EndKey (the first HRegion has an empty StartKey and the last an empty EndKey); because RowKeys are sorted, a client can quickly locate the HRegion that holds a given RowKey through the hbase:meta table. HRegions are assigned to HRegionServers by the HMaster; the HRegionServer is then responsible for starting and managing its HRegions, communicating with clients, and reading the data (from HDFS). The sketch below shows how client code can look up the region and server for a RowKey.
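A hedged sketch of that lookup, assuming a table named 'user' exists: RegionLocator resolves the HRegion (with its StartKey/EndKey) and the HRegionServer that owns a given RowKey.
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionLocation;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.util.Bytes;

public class RegionLookupSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             RegionLocator locator = conn.getRegionLocator(TableName.valueOf("user"))) { // hypothetical table
            HRegionLocation loc = locator.getRegionLocation(Bytes.toBytes("row-0042"));
            // StartKey is inclusive and EndKey exclusive; an empty byte[] marks the first/last region.
            System.out.println("start key: " + Bytes.toStringBinary(loc.getRegion().getStartKey()));
            System.out.println("end key  : " + Bytes.toStringBinary(loc.getRegion().getEndKey()));
            System.out.println("served by: " + loc.getServerName());
        }
    }
}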

III. HBase Environment Setup
Prerequisites
- Make sure HDFS is running
  HDFS setup guide: https://blog.csdn.net/qq_43225978/article/details/86483486
- Make sure ZooKeeper is running
  ZooKeeper setup guide: https://blog.csdn.net/qq_43225978/article/details/89028841
[root@centos bin]# jps
17394 QuorumPeerMain
4618 SecondaryNameNode
4315 NameNode
17517 Jps
4447 DataNode
Installation, Configuration, and Startup
1. Extract the HBase tarball
[root@centos hadoop]# tar -zxvf hbase-2.1.3-bin.tar.gz -C /usr/myapp/dev/hadoop/
2. Edit the configuration files
- Configure hbase-env.sh:
  # JDK location
  export JAVA_HOME=/opt/modules/jdk1.7.0_67
  # use an external ZooKeeper instead of the bundled one
  export HBASE_MANAGES_ZK=false
- Configure hbase-site.xml:
  <configuration>
    <property>
      <name>hbase.tmp.dir</name>
      <value>/usr/myapp/dev/hadoop/hbase-2.1.3/data/tem</value>
    </property>
    <property>
      <name>hbase.rootdir</name>
      <value>hdfs://centos:9000/hbase</value>
    </property>
    <property>
      <name>hbase.cluster.distributed</name>
      <value>true</value>
    </property>
    <property>
      <!-- separate multiple hosts with commas -->
      <name>hbase.zookeeper.quorum</name>
      <value>centos</value>
    </property>
    <property>
      <name>hbase.zookeeper.property.clientPort</name>
      <value>2181</value>
    </property>
    <property>
      <name>hbase.wal.provider</name>
      <value>filesystem</value>
    </property>
  </configuration>
- Configure the regionservers file:
  centos
3. Configure the environment variables (in /root/.bashrc, then source it)
HBASE_HOME=/usr/myapp/dev/hadoop/hbase-2.1.3
HADOOP_HOME=/usr/myapp/dev/hadoop/hadoop-2.8.5
JAVA_HOME=/usr/myapp/dev/java/jdk1.8.0_171
CLASSPATH=.
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HBASE_HOME/bin
export JAVA_HOME
export CLASSPATH
export HADOOP_HOME
export HBASE_HOME
export PATH
[root@centos ~]# source /root/.bashrc
4. Start HBase
[root@centos ~]# start-hbase.sh
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/myapp/dev/hadoop/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/myapp/dev/hadoop/hbase-2.1.3/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
running master, logging to /usr/myapp/dev/hadoop/hbase-2.1.3/logs/hbase-root-master-centos.out
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/myapp/dev/hadoop/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/myapp/dev/hadoop/hbase-2.1.3/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
centos: running regionserver, logging to /usr/myapp/dev/hadoop/hbase-2.1.3/logs/hbase-root-regionserver-centos.out
[root@centos ~]# jps
20161 HRegionServer // handles the actual reads and writes of table data
17394 QuorumPeerMain
20039 HMaster // like the NameNode: manages table metadata and the RegionServers
4618 SecondaryNameNode
4315 NameNode
20254 Jps
4447 DataNode
On the first attempt HBase failed to start. The logs under hbase-2.1.3/logs showed the following error:
java.lang.NoClassDefFoundError: org/apache/htrace/SamplerBuilder
2019-07-18 16:15:15,215 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster.
    at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:3100)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:236)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:140)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:149)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:3111)
Caused by: java.lang.NoClassDefFoundError: org/apache/htrace/SamplerBuilder
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:644)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:628)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2701)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2683)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:372)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at org.apache.hadoop.hbase.util.CommonFSUtils.getRootDir(CommonFSUtils.java:362)
    at org.apache.hadoop.hbase.util.CommonFSUtils.isValidWALRootDir(CommonFSUtils.java:411)
    at org.apache.hadoop.hbase.util.CommonFSUtils.getWALRootDir(CommonFSUtils.java:387)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeFileSystem(HRegionServer.java:704)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:613)
    at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:489)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:3093)
    ... 5 more
Caused by: java.lang.ClassNotFoundException: org.apache.htrace.SamplerBuilder
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 25 more
My versions are HBase 2.1.3 and Hadoop 2.8.5, which the official site lists as a compatible pair. The fix, found in another blogger's post, is:
cp $HBASE_HOME/lib/client-facing-thirdparty/htrace-core-3.1.0-incubating.jar $HBASE_HOME/lib/
Running the command above resolves the problem.
Reference posts:
https://blog.csdn.net/woloqun/article/details/81350323
https://blog.csdn.net/dongdong9223/article/details/86508330
5. Visit the web UI (in HBase 2.x the HMaster web UI listens on port 16010 by default, e.g. http://centos:16010)
IV. HBase Shell Operations
1. Connect to the HBase server
[root@centos ~]# hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/myapp/dev/hadoop/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/myapp/dev/hadoop/hbase-2.1.3/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.1.3, rda5ec9e4c06c537213883cca8f3cc9a7c19daf67, Mon Feb 11 15:45:33 CST 2019
Took 0.0031 seconds
2. Check the cluster status
hbase(main):002:0> status
1 active master, 0 backup masters, 1 servers, 0 dead, 2.0000 average load
3. Get help
hbase(main):005:0* help
HBase Shell, version 2.1.3, rda5ec9e4c06c537213883cca8f3cc9a7c19daf67, Mon Feb 11 15:45:33 CST 2019
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.
COMMAND GROUPS:
Group name: general
Commands: processlist, status, table_help, version, whoami
Group name: ddl
Commands: alter, alter_async, alter_status, clone_table_schema, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, list_regions, locate_region, show_filters
Group name: namespace
Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables
Group name: dml
Commands: append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, truncate, truncate_preserve
Group name: tools
Commands: assign, balance_switch, balancer, balancer_enabled, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, cleaner_chore_enabled, cleaner_chore_run, cleaner_chore_switch, clear_block_cache, clear_compaction_queues, clear_deadservers, close_region, compact, compact_rs, compaction_state, flush, is_in_maintenance_mode, list_deadservers, major_compact, merge_region, move, normalize, normalizer_enabled, normalizer_switch, split, splitormerge_enabled, splitormerge_switch, stop_master, stop_regionserver, trace, unassign, wal_roll, zk_dump
Group name: replication
Commands: add_peer, append_peer_exclude_namespaces, append_peer_exclude_tableCFs, append_peer_namespaces, append_peer_tableCFs, disable_peer, disable_table_replication, enable_peer, enable_table_replication, get_peer_config, list_peer_configs, list_peers, list_replicated_tables, remove_peer, remove_peer_exclude_namespaces, remove_peer_exclude_tableCFs, remove_peer_namespaces, remove_peer_tableCFs, set_peer_bandwidth, set_peer_exclude_namespaces, set_peer_exclude_tableCFs, set_peer_namespaces, set_peer_replicate_all, set_peer_serial, set_peer_tableCFs, show_peer_tableCFs, update_peer_config
Group name: snapshots
Commands: clone_snapshot, delete_all_snapshot, delete_snapshot, delete_table_snapshots, list_snapshots, list_table_snapshots, restore_snapshot, snapshot
Group name: configuration
Commands: update_all_config, update_config
Group name: quotas
Commands: list_quota_snapshots, list_quota_table_sizes, list_quotas, list_snapshot_sizes, set_quota
Group name: security
Commands: grant, list_security_capabilities, revoke, user_permission
Group name: procedures
Commands: list_locks, list_procedures
Group name: visibility labels
Commands: add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility
Group name: rsgroup
Commands: add_rsgroup, balance_rsgroup, get_rsgroup, get_server_rsgroup, get_table_rsgroup, list_rsgroups, move_namespaces_rsgroup, move_servers_namespaces_rsgroup, move_servers_rsgroup, move_servers_tables_rsgroup, move_tables_rsgroup, remove_rsgroup, remove_servers_rsgroup
SHELL USAGE:
Quote all names in HBase Shell such as table and column names. Commas delimit
command parameters. Type <RETURN> after entering a command to run it.
Dictionaries of configuration used in the creation and alteration of tables are
Ruby Hashes. They look like this:
{'key1' => 'value1', 'key2' => 'value2', ...}
and are opened and closed with curley-braces. Key/values are delimited by the
'=>' character combination. Usually keys are predefined constants such as
NAME, VERSIONS, COMPRESSION, etc. Constants do not need to be quoted. Type
'Object.constants' to see a (messy) list of all constants in the environment.
If you are using binary keys or values and need to enter them in the shell, use
double-quote'd hexadecimal representation. For example:
hbase> get 't1', "key\x03\x3f\xcd"
hbase> get 't1', "key\003\023\011"
hbase> put 't1', "test\xef\xff", 'f1:', "\x01\x33\x40"
The HBase shell is the (J)Ruby IRB with the above HBase-specific commands added.
For more on the HBase Shell, see http://hbase.apache.org/book.html
4. Namespace Operations
A namespace is analogous to a database in an RDBMS; it is used to organize and manage HBase tables.
- List the namespaces in HBase:
hbase(main):025:0> list_namespace
NAMESPACE
default
hbase
- Create a namespace:
hbase(main):026:0> create_namespace 'xmd'
Took 0.2415 seconds
hbase(main):028:0> list_namespace
NAMESPACE
default
hbase
xmd
3 row(s)
Took 0.0167 seconds
- List the tables in a namespace:
hbase(main):033:0> list_namespace_tables 'xmd'
TABLE
0 row(s)
Took 0.0137 seconds
=> []
hbase(main):034:0> list_namespace_tables 'hbase'
TABLE
meta
namespace
2 row(s)
Took 0.0261 seconds
=> ["meta", "namespace"]
- Drop a namespace:
hbase(main):035:0> drop_namespace 'xmd'
Took 0.2415 seconds
hbase(main):036:0> list_namespace
NAMESPACE
default
hbase
2 row(s)
Took 0.0193 seconds
Note: HBase will not drop a namespace that still contains tables:
hbase(main):003:0> list_namespace_tables 'xmd'
TABLE
t_user
1 row(s)
Took 0.0305 seconds
=> ["t_user"]
...
...
hbase(main):004:0> drop_namespace 'xmd'
ERROR: org.apache.hadoop.hbase.constraint.ConstraintException: Only empty namespaces can be removed. Namespace xmd has 1 tables
at org.apache.hadoop.hbase.master.procedure.DeleteNamespaceProcedure.prepareDelete(DeleteNamespaceProcedure.java:217)
at org.apache.hadoop.hbase.master.procedure.DeleteNamespaceProcedure.executeFromState(DeleteNamespaceProcedure.java:78)
at org.apache.hadoop.hbase.master.procedure.DeleteNamespaceProcedure.executeFromState(DeleteNamespaceProcedure.java:45)
at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:189)
at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:965)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1723)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1462)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1200(ProcedureExecutor.java:78)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2039)
For usage try 'help "drop_namespace"'
Took 0.8934 seconds
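The same namespace operations are available from the Java client. A minimal sketch mirroring the shell commands above (namespace 'xmd' as in the examples); deleteNamespace fails with the same ConstraintException if the namespace still contains tables.
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.NamespaceDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class NamespaceSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
            admin.createNamespace(NamespaceDescriptor.create("xmd").build()); // create_namespace 'xmd'
            for (NamespaceDescriptor ns : admin.listNamespaceDescriptors()) { // list_namespace
                System.out.println(ns.getName());
            }
            for (TableName t : admin.listTableNamesByNamespace("xmd")) {      // list_namespace_tables 'xmd'
                System.out.println(t.getNameAsString());
            }
            admin.deleteNamespace("xmd"); // drop_namespace 'xmd' -- only succeeds if the namespace is empty
        }
    }
}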
5. Table Operations
- Create a table
================== # Method 1 ==================
##### the new table belongs to the namespace named xmd
hbase(main):005:0> create 'xmd:t_table','cf1'
Created table xmd:t_table
Took 1.2581 seconds
=> Hbase::Table - xmd:t_table
================== # Method 2 ==================
##### create the table in namespace xmd and set the number of versions kept per column family
hbase(main):007:0> create 'xmd:t_user',{NAME=>'cf1',VERSIONS=>7},{NAME=>'cf2',VERSIONS=>5}
Created table xmd:t_user
Took 2.2388 seconds
=> Hbase::Table - xmd:t_user
hbase(main):006:0> list_namespace_tables 'xmd'
TABLE
t_table
t_user
2 row(s)
Took 0.0128 seconds
=> ["t_table", "t_user"]
================== # Method 3 ==================
##### without a namespace prefix the table goes into the default namespace
hbase(main):002:0> create 'user','cf1'
Created table user
Took 1.2789 seconds
=> Hbase::Table - user
hbase(main):004:0> list_namespace_tables 'default'
TABLE
user
1 row(s)
Took 0.0187 seconds
=> ["user"]
- Describe a table
hbase(main):001:0> describe 'xmd:t_user'
Table xmd:t_user is ENABLED
xmd:t_user
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', VERSIONS => '7', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'cf2', VERSIONS => '5', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
2 row(s)
Took 0.7620 seconds
Here cf1 and cf2 are the two column families; the remaining attributes are each family's configuration.
- List all tables
hbase(main):002:0> list
TABLE
user
xmd:t_table
xmd:t_user
3 row(s)
Took 0.0212 seconds
=> ["user", "xmd:t_table", "xmd:t_user"]
This command lists the tables of every namespace except the one named hbase:
xmd:t_user is table t_user in the namespace xmd;
user is table user in the default namespace.
- Alter a table
hbase(main):010:0> alter 'xmd:t_user',{NAME=>'cf2',TTL=>1800}
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 2.1569 seconds
hbase(main):011:0> describe 'xmd:t_user'
Table xmd:t_user is ENABLED
xmd:t_user
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'cf2', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => '1800 SECONDS (30 MINUTES)', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
2 row(s)
Took 0.0247 seconds
- Drop a table
hbase(main):017:0> list
TABLE
user
xmd:t_table
xmd:t_user
3 row(s)
Took 0.0039 seconds
=> ["user", "xmd:t_table", "xmd:t_user"]
hbase(main):018:0> drop 'user'
ERROR: Table user is enabled. Disable it first.
For usage try 'help "drop"'
Took 0.0142 seconds
hbase(main):019:0> disable 'user'
Took 0.4448 seconds
hbase(main):020:0> drop 'user'
Took 0.2282 seconds
hbase(main):021:0> list
TABLE
xmd:t_table
xmd:t_user
2 row(s)
Took 0.0041 seconds
=> ["xmd:t_table", "xmd:t_user"]
A table must be disabled before it can be dropped; otherwise the drop fails.
6. Data Operations
- Put values
hbase(main):002:0> put 'xmd:t_user',1,'cf1:name','zhangsan'
Took 0.1437 seconds
hbase(main):016:0> put 'xmd:t_user',1,'cf1:name','lisi'
Took 0.0062 seconds
hbase(main):008:0> put 'xmd:t_user',2,'cf1:name','wangwu'
Took 0.0175 seconds
- Scan a table (a Java equivalent of these scan options follows this list)
hbase(main):019:0> scan 'xmd:t_user'
ROW COLUMN+CELL
1 column=cf1:age, timestamp=1563764845852, value=18
1 column=cf1:name, timestamp=1563765815393, value=lisi
1 column=cf2:sex, timestamp=1563764938762, value=man
2 column=cf1:name, timestamp=1563765106103, value=wangwu
2 row(s)
Took 0.0097 seconds
hbase(main):020:0> scan 'xmd:t_user',{COLUMNS=>'cf1'}
ROW COLUMN+CELL
1 column=cf1:age, timestamp=1563764845852, value=18
1 column=cf1:name, timestamp=1563765815393, value=lisi
2 column=cf1:name, timestamp=1563765106103, value=wangwu
2 row(s)
Took 0.0062 seconds
hbase(main):021:0> scan 'xmd:t_user',{COLUMNS=>'cf1',LIMIT=>1}
ROW COLUMN+CELL
1 column=cf1:age, timestamp=1563764845852, value=18
1 column=cf1:name, timestamp=1563765815393, value=lisi
1 row(s)
Took 0.0079 seconds
hbase(main):022:0> scan 'xmd:t_user',{COLUMNS=>'cf1',LIMIT=>2}
ROW COLUMN+CELL
1 column=cf1:age, timestamp=1563764845852, value=18
1 column=cf1:name, timestamp=1563765815393, value=lisi
2 column=cf1:name, timestamp=1563765106103, value=wangwu
2 row(s)
Took 0.0107 seconds
hbase(main):023:0> scan 'xmd:t_user',{COLUMNS=>'cf1',LIMIT=>1,VERSIONS=>3}
ROW COLUMN+CELL
1 column=cf1:age, timestamp=1563764845852, value=18
1 column=cf1:name, timestamp=1563765815393, value=lisi
1 column=cf1:name, timestamp=1563764596429, value=zhangsan
1 row(s)
Took 0.0306 seconds
hbase(main):024:0> scan 'xmd:t_user',{COLUMNS=>'cf1',LIMIT=>1,VERSIONS=>1}
ROW COLUMN+CELL
1 column=cf1:age, timestamp=1563764845852, value=18
1 column=cf1:name, timestamp=1563765815393, value=lisi
1 row(s)
Took 0.0112 seconds
- Get values
hbase(main):026:0> get 'xmd:t_user',1
COLUMN CELL
cf1:age timestamp=1563764845852, value=18
cf1:name timestamp=1563765815393, value=lisi
cf2:sex timestamp=1563764938762, value=man
1 row(s)
Took 0.0065 seconds
hbase(main):027:0> get 'xmd:t_user',1,{COLUMNS=>'cf1',VERSIONS=>10}
COLUMN CELL
cf1:age timestamp=1563764845852, value=18
cf1:name timestamp=1563765815393, value=lisi
cf1:name timestamp=1563764596429, value=zhangsan
1 row(s)
Took 0.0147 seconds
hbase(main):028:0> get 'xmd:t_user',1,{COLUMNS=>'cf1:name',VERSIONS=>10}
COLUMN CELL
cf1:name timestamp=1563765815393, value=lisi
cf1:name timestamp=1563764596429, value=zhangsan
1 row(s)
Took 0.0104 seconds
hbase(main):029:0> get 'xmd:t_user',1,{COLUMNS=>'cf1:name',TIMESTAMP=>1563765815393}
COLUMN CELL
cf1:name timestamp=1563765815393, value=lisi
1 row(s)
Took 0.0117 seconds
hbase(main):030:0> get 'xmd:t_user',1,{COLUMNS=>'cf1',TIMESTAMP=>1563765815393}
COLUMN CELL
cf1:name timestamp=1563765815393, value=lisi
1 row(s)
Took 0.0057 seconds
- Delete values
hbase(main):031:0> delete 'xmd:t_user',1,'cf1:name'
Took 0.0195 seconds
hbase(main):032:0> scan 'xmd:t_user'
ROW COLUMN+CELL
1 column=cf1:age, timestamp=1563764845852, value=18
1 column=cf1:name, timestamp=1563764596429, value=zhangsan
1 column=cf2:sex, timestamp=1563764938762, value=man
2 column=cf1:name, timestamp=1563765106103, value=wangwu
2 row(s)
Took 0.0074 seconds
hbase(main):033:0> get 'xmd:t_user',1,{COLUMNS=>'cf1:name',VERSIONS=>10}
COLUMN CELL
cf1:name timestamp=1563764596429, value=zhangsan
1 row(s)
Took 0.0118 seconds
hbase(main):034:0> deleteall 'xmd:t_user',1
Took 0.0088 seconds
hbase(main):035:0> scan 'xmd:t_user'
ROW COLUMN+CELL
2 column=cf1:name, timestamp=1563765106103, value=wangwu
1 row(s)
Took 0.0079 seconds
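The scan options used above (COLUMNS, LIMIT, VERSIONS) map one-to-one onto the Java Scan API. A hedged sketch equivalent to scan 'xmd:t_user',{COLUMNS=>'cf1',LIMIT=>1,VERSIONS=>3}:
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class ScanOptionsSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("xmd:t_user"))) {
            Scan scan = new Scan()
                    .addFamily(Bytes.toBytes("cf1")) // COLUMNS => 'cf1'
                    .setLimit(1)                     // LIMIT => 1 (rows, not cells)
                    .readVersions(3);                // VERSIONS => 3
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result row : scanner) {
                    for (Cell cell : row.rawCells()) {
                        System.out.println(Bytes.toString(CellUtil.cloneRow(cell)) + " "
                                + Bytes.toString(CellUtil.cloneQualifier(cell)) + "="
                                + Bytes.toString(CellUtil.cloneValue(cell)));
                    }
                }
            }
        }
    }
}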
V. Java API Operations
1. Maven dependencies
<!-- https://mvnrepository.com/artifact/org.apache.hbase/hbase-client -->
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-client</artifactId>
<version>2.1.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.hbase/hbase-common -->
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-common</artifactId>
<version>2.1.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.hbase/hbase-protocol -->
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-protocol</artifactId>
<version>2.1.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.hbase/hbase-server -->
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-server</artifactId>
<version>2.1.0</version>
</dependency>
2. Java API utility class
package com.example.hbasedemo.util;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class HbaseConnectUtil {
private final static String QUORUM = "42.157.128.206";
private final static String CLIENTPORT = "2181";
private static Connection connection;
private static Admin admin;
/**
* Get a connection to HBase
*
* @return
*/
private static Connection getConnection() {
try {
if (connection == null || connection.isClosed()) {
Configuration configuration = HBaseConfiguration.create();
configuration.set("hbase.zookeeper.property.clientPort", CLIENTPORT);
configuration.set("hbase.zookeeper.quorum", QUORUM);
connection = ConnectionFactory.createConnection(configuration);
}
return connection;
} catch (IOException e) {
e.printStackTrace();
return connection;
}
}
/**
* Release the admin and connection resources
*/
public static void close() {
try {
if (admin != null) {
admin.close();
}
if (connection != null) {
connection.close();
}
} catch (IOException e) {
e.printStackTrace();
}
}
/**
* Create a namespace
*
* @param namespace namespace name
* @param map       namespace configuration properties
* @return
*/
public static boolean createNamespace(String namespace, Map<String, String> map) {
try {
// obtain the HBase Admin object
admin = getConnection().getAdmin();
NamespaceDescriptor descriptor = NamespaceDescriptor.create(namespace).build();
if (map != null) {
for (Map.Entry<String, String> entry : map.entrySet()) {
descriptor.setConfiguration(entry.getKey(), entry.getValue());
}
}
admin.createNamespace(descriptor);
return true;
} catch (IOException e) {
e.printStackTrace();
return false;
}
}
/**
* Create a table in the given namespace
*
* @param namespace    namespace to create the table in (null for the default namespace)
* @param table        table name
* @param columnFamily column families to create
* @return
*/
public static boolean createTable(String namespace, String table, String... columnFamily) {
try {
admin = getConnection().getAdmin();
TableName tableName;
if (namespace != null) {
tableName = TableName.valueOf(namespace + ":" + table);
} else {
tableName = TableName.valueOf(table);
}
TableDescriptorBuilder tableDescriptorBuilder = TableDescriptorBuilder.newBuilder(tableName);
// add the column families
for (String family : columnFamily) {
ColumnFamilyDescriptor columnFamilyDescriptor = ColumnFamilyDescriptorBuilder.of(family);
tableDescriptorBuilder.setColumnFamily(columnFamilyDescriptor);
}
TableDescriptor tableDescriptor = tableDescriptorBuilder.build();
admin.createTable(tableDescriptor);
return true;
} catch (IOException e) {
e.printStackTrace();
return false;
}
}
/**
* Add column families to an existing table
*
* @param namespace    namespace of the table (null for the default namespace)
* @param table        table name
* @param columnFamily column families to add
* @return
*/
public static boolean addColumnFaimly(String namespace, String table, String... columnFamily) {
try {
admin = getConnection().getAdmin();
TableName tableName;
if (namespace != null) {
tableName = TableName.valueOf(namespace + ":" + table);
} else {
tableName = TableName.valueOf(table);
}
for (String family : columnFamily) {
ColumnFamilyDescriptor columnFamilyDescriptor = ColumnFamilyDescriptorBuilder.of(family);
admin.addColumnFamily(tableName, columnFamilyDescriptor);
}
return true;
} catch (IOException e) {
e.printStackTrace();
return false;
}
}
/**
* Delete a table
*
* @param namespace namespace of the table (null for the default namespace)
* @param table     table name
* @return
*/
public static boolean deleteTable(String namespace, String table) {
try {
admin = getConnection().getAdmin();
TableName tableName;
if (namespace != null) {
tableName = TableName.valueOf(namespace + ":" + table);
} else {
tableName = TableName.valueOf(table);
}
// only proceed if the table exists; it must be disabled before deletion
if (admin.tableExists(tableName)) {
admin.disableTable(tableName);
admin.deleteTable(tableName);
return true;
}
return false;
} catch (IOException e) {
e.printStackTrace();
return false;
}
}
/**
* Put a single value into the given table
*
* @param namespace    target namespace (null for the default namespace)
* @param table        target table
* @param rowKey       row key to write to
* @param columnFamily target column family
* @param column       target column (qualifier)
* @param value        value to store
* @return
*/
public static boolean put(String namespace, String table, String rowKey, String columnFamily, String column, String value) {
try {
TableName tableName;
if (namespace != null) {
tableName = TableName.valueOf(namespace + ":" + table);
} else {
tableName = TableName.valueOf(table);
}
Put put = new Put(Bytes.toBytes(rowKey));
put.addColumn(Bytes.toBytes(columnFamily), Bytes.toBytes(column), Bytes.toBytes(value));
Table tab = getConnection().getTable(tableName);
tab.put(put);
return true;
} catch (IOException e) {
e.printStackTrace();
return false;
}
}
/**
* Put one row with multiple columns in a single call
*
* @param namespace
* @param table
* @param put
* @return
*/
public static boolean put(String namespace, String table, Put put) {
try {
TableName tableName;
if (namespace != null) {
tableName = TableName.valueOf(namespace + ":" + table);
} else {
tableName = TableName.valueOf(table);
}
Table tab = getConnection().getTable(tableName);
tab.put(put);
return true;
} catch (IOException e) {
e.printStackTrace();
return false;
}
}
/**
* Put multiple rows/columns in a single batch
*
* @param namespace
* @param table
* @param puts
* @return
*/
public static boolean put(String namespace, String table, List<Put> puts) {
try {
TableName tableName;
if (namespace != null) {
tableName = TableName.valueOf(namespace + ":" + table);
} else {
tableName = TableName.valueOf(table);
}
Table tab = getConnection().getTable(tableName);
tab.put(puts);
return true;
} catch (IOException e) {
e.printStackTrace();
return false;
}
}
/**
* Scan all contents of a table
*
* @param namespace
* @param table
* @return
*/
public static List<Object> scanTable(String namespace, String table) {
try {
TableName tableName;
if (namespace != null) {
tableName = TableName.valueOf(namespace + ":" + table);
} else {
tableName = TableName.valueOf(table);
}
Table tab = getConnection().getTable(tableName);
Scan scan = new Scan();
ResultScanner resultScanner = tab.getScanner(scan);
List<Object> resultList = new ArrayList<Object>();
for (Result result : resultScanner) {
for (Cell cell : result.rawCells()) {
Map<String, Object> cellMap = new HashMap<String, Object>();
cellMap.put("RowKey", Bytes.toString(CellUtil.cloneRow(cell)));
cellMap.put("ColumFamily", Bytes.toString(CellUtil.cloneFamily(cell)));
cellMap.put("Colum", Bytes.toString(CellUtil.cloneQualifier(cell)));
cellMap.put("Value", Bytes.toString(CellUtil.cloneValue(cell)));
cellMap.put("TimeStamp", cell.getTimestamp());
resultList.add(cellMap);
}
}
return resultList;
} catch (IOException e) {
e.printStackTrace();
return null;
}
}
/**
* Get one row by rowKey
*
* @param namespace namespace (null for the default namespace)
* @param table     table name
* @param rowKey    row key to fetch
* @return
*/
public static List<Object> get(String namespace, String table, String rowKey) {
try {
admin = getConnection().getAdmin();
TableName tableName;
if (namespace != null) {
tableName = TableName.valueOf(namespace + ":" + table);
} else {
tableName = TableName.valueOf(table);
}
Table tab = getConnection().getTable(tableName);
Get get = new Get(Bytes.toBytes(rowKey));
Result result = tab.get(get);
List<Object> resultList = new ArrayList<Object>();
for (Cell cell : result.rawCells()) {
Map<String, Object> cellMap = new HashMap<String, Object>();
cellMap.put("RowKey", Bytes.toString(CellUtil.cloneRow(cell)));
cellMap.put("ColumFamily", Bytes.toString(CellUtil.cloneFamily(cell)));
cellMap.put("Colum", Bytes.toString(CellUtil.cloneQualifier(cell)));
cellMap.put("Value", Bytes.toString(CellUtil.cloneValue(cell)));
cellMap.put("TimeStamp", cell.getTimestamp());
resultList.add(cellMap);
}
return resultList;
} catch (IOException e) {
e.printStackTrace();
return null;
}
}
/**
* Batch-get multiple rows by rowKey
*
* @param namespace namespace (null for the default namespace)
* @param table     table name
* @param rowKeys   row keys to fetch
* @return
*/
public static List<Object> get(String namespace, String table, List<String> rowKeys) {
try {
admin = getConnection().getAdmin();
TableName tableName;
if (namespace != null) {
tableName = TableName.valueOf(namespace + ":" + table);
} else {
tableName = TableName.valueOf(table);
}
Table tab = getConnection().getTable(tableName);
List<Get> getList = new ArrayList<Get>();
for (String rowKey : rowKeys) {
Get get = new Get(Bytes.toBytes(rowKey));
getList.add(get);
}
Result[] results = tab.get(getList);
List<Object> resultList = new ArrayList<Object>();
for (Result result : results) {
for (Cell cell : result.rawCells()) {
Map<String, Object> cellMap = new HashMap<String, Object>();
cellMap.put("RowKey", Bytes.toString(CellUtil.cloneRow(cell)));
cellMap.put("ColumFamily", Bytes.toString(CellUtil.cloneFamily(cell)));
cellMap.put("Colum", Bytes.toString(CellUtil.cloneQualifier(cell)));
cellMap.put("Value", Bytes.toString(CellUtil.cloneValue(cell)));
cellMap.put("TimeStamp", cell.getTimestamp());
resultList.add(cellMap);
}
}
return resultList;
} catch (IOException e) {
e.printStackTrace();
return null;
}
}
/**
* Delete the row with the given rowKey
*
* @param namespace
* @param table
* @param rowKey
* @return
*/
public static boolean delete(String namespace, String table, String rowKey) {
try {
admin = getConnection().getAdmin();
TableName tableName;
if (namespace != null) {
tableName = TableName.valueOf(namespace + ":" + table);
} else {
tableName = TableName.valueOf(table);
}
Table tab = getConnection().getTable(tableName);
Delete delete = new Delete(Bytes.toBytes(rowKey));
tab.delete(delete);
return true;
} catch (IOException e) {
e.printStackTrace();
return false;
}
}
/**
* Batch-delete the rows with the given rowKeys
*
* @param namespace
* @param table
* @param rowKeys
* @return
*/
public static boolean delete(String namespace, String table, List<String> rowKeys) {
try {
admin = getConnection().getAdmin();
TableName tableName;
if (namespace != null) {
tableName = TableName.valueOf(namespace + ":" + table);
} else {
tableName = TableName.valueOf(table);
}
Table tab = getConnection().getTable(tableName);
List<Delete> deleteList = new ArrayList<Delete>();
for (String rowKey : rowKeys) {
Delete delete = new Delete(Bytes.toBytes(rowKey));
deleteList.add(delete);
}
tab.delete(deleteList);
return true;
} catch (IOException e) {
e.printStackTrace();
return false;
}
}
}
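Finally, a short usage sketch for the utility class above; the namespace and table names follow the earlier shell examples, and error handling is reduced to the methods' boolean/null return values.
package com.example.hbasedemo.util; // same package as HbaseConnectUtil

import java.util.Arrays;
import java.util.List;

public class HbaseConnectUtilDemo {
    public static void main(String[] args) {
        HbaseConnectUtil.createNamespace("xmd", null);                         // like: create_namespace 'xmd'
        HbaseConnectUtil.createTable("xmd", "t_user", "cf1", "cf2");           // like: create 'xmd:t_user','cf1','cf2'
        HbaseConnectUtil.put("xmd", "t_user", "1", "cf1", "name", "zhangsan"); // like: put 'xmd:t_user',1,'cf1:name','zhangsan'
        List<Object> rows = HbaseConnectUtil.scanTable("xmd", "t_user");       // like: scan 'xmd:t_user'
        System.out.println(rows);
        HbaseConnectUtil.delete("xmd", "t_user", Arrays.asList("1"));          // like: deleteall 'xmd:t_user',1
        HbaseConnectUtil.close();                                              // release the shared connection
    }
}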
