5. Integrating Hive with HBase

0. System version information

OS:Debian-8.2
Jdk:1.8.0_181
hadoop:2.8.4
zookeeper:3.4.10
hbase:1.4.6
hive:2.3.3

Host information

192.168.74.131  master
192.168.74.133  slave1
192.168.74.134  slave2
192.168.74.135  slave3

1. Edit the Hive configuration file and create a test table (Hadoop, ZooKeeper, and HBase must all be running first)

A: Add the following property to hive-site.xml
<property>  
    <name>hbase.zookeeper.quorum</name>  
    <value>master,slave1,slave2,slave3</value>  
    <description></description>  
</property>
B: Start the Hive CLI by running ./hive from the bin directory of the Hive installation
cd /home/hadoop/opt/hive-2.3.3/bin
./hive

# or run it by its full path
/home/hadoop/opt/hive-2.3.3/bin/hive
C: Create a Hive table backed by HBase
hive>
CREATE TABLE hbase_table_1(key int, value string)      
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'    
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")     
TBLPROPERTIES ("hbase.table.name" = "table_1");  

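D: In the HBase shell, run list; the newly created table_1 should appear

2. The HBase table does not exist yet: create the mapped table through Hive, then insert data to test

There are two directions to test:

  1: Data inserted through Hive can be read from HBase

  2: Data inserted through HBase can be queried from Hive

A: Create a staging table in Hive to hold the imported data (LOAD DATA cannot write directly into an HBase-backed table, so the file is loaded into a plain text table first)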
hive> create table ccc(foo int,bar string) row format delimited fields terminated by '\t' lines terminated by '\n' stored as textfile; 
 
B: Create the data file data.txt
touch /home/hadoop/software/data.txt
vim /home/hadoop/software/data.txt
1   zhangsan
2   lisi
3   wangwu
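Typing the file by hand makes it easy to end up with spaces where the loader expects tabs. As a minimal sketch, the same three rows can be written with printf, which emits real tab characters. DATA_FILE is a hypothetical variable introduced here; set it to /home/hadoop/software/data.txt to match the path used in this post.

```shell
# Assumption: DATA_FILE is an illustrative variable, not part of the original steps.
DATA_FILE="${DATA_FILE:-./data.txt}"

# printf expands \t to a real tab and \n to a newline
printf '1\tzhangsan\n2\tlisi\n3\twangwu\n' > "$DATA_FILE"

# Count the lines that contain a tab; every row should be counted
grep -c "$(printf '\t')" "$DATA_FILE"
```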
C: Load the data into the Hive staging table. Fields must be separated by a tab character and records by a newline.
hive>load data local inpath '/home/hadoop/software/data.txt' overwrite into table ccc; 

D: Copy data from the staging table into HBase (the where clause below copies only the row with foo=1)
hive> insert overwrite table hbase_table_1 select * from ccc where foo=1;

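E: In the HBase shell, check that the data was inserted

This shows that data inserted through Hive is actually stored in the HBase table.

Next, show that a record inserted through HBase can also be queried from Hive. Insert a row in the HBase shell: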

hbase>put 'table_1','4','cf1:val','zhaoliu'
hbase>scan 'table_1'

hive>select * from hbase_table_1;
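
Both directions can also be checked from the operating-system shell in one go. This is only a sketch: it assumes the hive and hbase launchers are on PATH, and the variable names are made up for illustration. If the earlier steps ran as written, both commands should show row keys 1 and 4.

```shell
# Assumption: hive and hbase are on PATH; otherwise use the full paths
# under the install directories (e.g. /home/hadoop/opt/hive-2.3.3/bin/hive).
HBASE_SCAN="scan 'table_1'"
HIVE_QUERY="select * from hbase_table_1;"

if command -v hbase >/dev/null 2>&1 && command -v hive >/dev/null 2>&1; then
  # HBase side: -n runs the shell non-interactively on commands read from stdin
  echo "$HBASE_SCAN" | hbase shell -n
  # Hive side: -e runs the quoted statement and exits
  hive -e "$HIVE_QUERY"
else
  echo "hive/hbase not found on PATH; run these manually:"
  echo "  hbase> $HBASE_SCAN"
  echo "  hive>  $HIVE_QUERY"
fi
```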

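3. The HBase table already exists: create the mapped table through Hive, then test with inserted data

A: In the HBase shell, create a new table and insert some rows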
hbase>create 'student','info'
hbase>put "student",'1','info:name','tom'
hbase>put "student",'2','info:name','lily'
hbase>put "student",'3','info:name','wwn'
hbase>scan 'student'

B: Create a Hive external table mapped to the existing HBase table to read its data
hive>CREATE EXTERNAL TABLE hbase_table_2(key int, value string)      
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'    
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name")
TBLPROPERTIES("hbase.table.name" = "student"); 
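
To confirm that the mapping works, the rows written in step A can be read back through Hive. A minimal sketch, assuming hive is on PATH (HIVE_BIN and QUERY are names introduced here, not part of the original steps):

```shell
# Assumption: HIVE_BIN is a hypothetical override; the install path used in
# this post would be /home/hadoop/opt/hive-2.3.3/bin/hive.
HIVE_BIN="${HIVE_BIN:-hive}"
QUERY="select key, value from hbase_table_2;"

if command -v "$HIVE_BIN" >/dev/null 2>&1; then
  # -e runs the quoted statement and exits
  "$HIVE_BIN" -e "$QUERY"
else
  echo "$HIVE_BIN not found; run in the hive CLI: $QUERY"
fi
```

Because hbase_table_2 is declared EXTERNAL, dropping it in Hive removes only the Hive metadata; the student table and its rows stay in HBase.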

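4. Multiple columns and multiple column families

Create a table named customer in HBase with the following layout (three column families: address, info, contact):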

rowkey   | address                       | info         | contact
         | province   city       country | age  company | phone
zhangsan | hubei      wuhan      china   | 24   douyu   | 101
lisi     | guangdong  guangzhou  china   | 24   netease | 102

hbase>create 'customer','address','info', 'contact'

hbase>put 'customer','zhangsan','contact:phone','101'
hbase>put 'customer','zhangsan','address:province','hubei'
hbase>put 'customer','zhangsan','address:city','wuhan' 
hbase>put 'customer','zhangsan','address:country','china' 
hbase>put 'customer','zhangsan','info:age','24' 
hbase>put 'customer','zhangsan','info:company','douyu' 

hbase>put 'customer','lisi','contact:phone','102'
hbase>put 'customer','lisi','address:province','guangdong' 
hbase>put 'customer','lisi','address:city','guangzhou' 
hbase>put 'customer','lisi','address:country','china' 
hbase>put 'customer','lisi','info:age','24' 
hbase>put 'customer','lisi','info:company','netease'

CREATE EXTERNAL TABLE `hbase_customer`(
  key string, 
  province string, 
  city string, 
  country string, 
  age int, 
  company string, 
  phone string )      
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'    
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,address:province,address:city,address:country,info:age,info:company,contact:phone")
TBLPROPERTIES(
  "hbase.table.name" = "customer"); 

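The Hive HBase integration documentation warns that whitespace inside hbase.columns.mapping is treated as part of the column name, and the entries are matched to the Hive columns by position, so the counts should agree. A small sanity check along these lines can catch both problems before running the DDL. MAPPING and COLUMNS are illustrative variables holding the customer mapping (written without spaces) and the Hive column names.

```shell
# Assumption: MAPPING and COLUMNS are hand-copied from the CREATE TABLE statement.
MAPPING=":key,address:province,address:city,address:country,info:age,info:company,contact:phone"
COLUMNS="key province city country age company phone"

# Flag any space inside the mapping string
case "$MAPPING" in
  *" "*) whitespace_check="contains whitespace" ;;
  *)     whitespace_check="no whitespace" ;;
esac
echo "mapping: $whitespace_check"

# Count comma-separated mapping entries and space-separated column names
mapping_count=$(printf '%s\n' "$MAPPING" | tr ',' '\n' | wc -l | tr -d ' ')
column_count=$(printf '%s\n' "$COLUMNS" | wc -w | tr -d ' ')
echo "$mapping_count mapping entries, $column_count Hive columns"
```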
 

posted @ 2017-09-11 18:00  桃源仙居