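5. Integrating Hive with HBase
0. System version information
OS: Debian 8.2
JDK: 1.8.0_181
Hadoop: 2.8.4
ZooKeeper: 3.4.10
HBase: 1.4.6
Hive: 2.3.3
Host information
192.168.74.131 master
192.168.74.133 slave1
192.168.74.134 slave2
192.168.74.135 slave3
1. Modify the Hive configuration file and create the test table (Hadoop, ZooKeeper, and HBase must all be running)
A: Add the following property to hive-site.xml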
<property>  
    <name>hbase.zookeeper.quorum</name>  
    <value>master,slave1,slave2,slave3</value>  
    <description></description>  
</property>
B: Start the Hive CLI from the bin directory of the Hive installation
cd /home/hadoop/opt/hive-2.3.3/bin
./hive
# or run it directly:
/home/hadoop/opt/hive-2.3.3/bin/hive
C: Create a table that is backed by HBase
hive>
CREATE TABLE hbase_table_1(key int, value string)      
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'    
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")     
TBLPROPERTIES ("hbase.table.name" = "table_1");  

D: Run list in the hbase shell; the newly created table_1 now appears
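The "hbase.columns.mapping" string in step C pairs each Hive column, in order, with either the row key (:key) or one family:qualifier cell. The Python sketch below only illustrates that pairing rule; parse_mapping is a hypothetical helper and not part of Hive, and it assumes the mapping names :key explicitly and contains no whitespace between entries.

```python
def parse_mapping(hive_columns, mapping):
    """Pair Hive column names with the entries of an hbase.columns.mapping string.

    Returns a dict: Hive column -> None for the row key,
    or a (family, qualifier) tuple for a regular cell.
    """
    entries = mapping.split(",")
    if len(entries) != len(hive_columns):
        raise ValueError(
            f"{len(hive_columns)} Hive columns but {len(entries)} mapping entries"
        )
    if entries.count(":key") != 1:
        raise ValueError("expected exactly one :key entry")

    result = {}
    for column, entry in zip(hive_columns, entries):
        if entry == ":key":
            result[column] = None  # this Hive column holds the HBase row key
        else:
            family, _, qualifier = entry.partition(":")
            result[column] = (family, qualifier)
    return result


# The table from step C: key -> row key, value -> cell cf1:val
print(parse_mapping(["key", "value"], ":key,cf1:val"))
```

Reading the mapping this way makes it easy to spot a Hive column list and a mapping string that have drifted out of step before running the CREATE TABLE statement.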

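2. The HBase table does not exist yet: create the mapped table through Hive, then insert data to test it
There are two things to verify:
1: Data inserted through Hive can be found in HBase
2: Data inserted through HBase can be queried from Hive
A: Create a staging table in Hive to load the data into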
hive> create table ccc(foo int,bar string) row format delimited fields terminated by '\t' lines terminated by '\n' stored as textfile;
B: Create the data file data.txt
touch /home/hadoop/software/data.txt
vim /home/hadoop/software/data.txt
1	zhangsan
2	lisi
3	wangwu
C: Load the data into the staging Hive table. Fields are separated by a tab and records by a newline
hive>load data local inpath '/home/hadoop/software/data.txt' overwrite into table ccc;
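Tabs are easy to lose when typing the file in an editor, so the same three records can also be generated by a script. The sketch below is one way to do that, assuming Python is available on the machine; write_tsv is a hypothetical helper, and it writes to data.txt in the current directory rather than to /home/hadoop/software/data.txt, so copy or adjust the path as needed.

```python
ROWS = [(1, "zhangsan"), (2, "lisi"), (3, "wangwu")]


def write_tsv(path, rows):
    """Write one record per line, fields separated by a single tab."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        for foo, bar in rows:
            # Matches the table definition: fields terminated by '\t',
            # lines terminated by '\n'.
            handle.write(f"{foo}\t{bar}\n")


write_tsv("data.txt", ROWS)
```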

D: Import data from the staging table into HBase
hive> insert overwrite table hbase_table_1 select * from ccc where foo=1;

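E: Check in the hbase shell that the row was inserted, for example with scan 'table_1'

This shows that data inserted through Hive ends up stored in the HBase table.
Next, verify that a record inserted through HBase can also be queried from Hive. Insert a row in the hbase shell:
hbase>put 'table_1','4','cf1:val','zhaoliu'
hbase>scan 'table_1'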

hive>select * from hbase_table_1;

3. The HBase table already exists: create the mapped table through Hive, then test it
A: Create a new table in the hbase shell and insert some data
hbase>create 'student','info'
hbase>put 'student','1','info:name','tom'
hbase>put 'student','2','info:name','lily'
hbase>put 'student','3','info:name','wwn'
hbase>scan 'student'

B: Create the Hive mapping table to read the data in HBase
hive>CREATE EXTERNAL TABLE hbase_table_2(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name")
TBLPROPERTIES("hbase.table.name" = "student");

4. Mapping multiple columns and column families between Hive and HBase
Create the following table, customer, in HBase. It has three column families: address, info, and contact.
| rowkey | address:province | address:city | address:country | info:age | info:company | contact:phone |
| zhangsan | hubei | wuhan | china | 24 | douyu | 101 |
| lisi | guangdong | guangzhou | china | 24 | netease | 102 |
hbase>create 'customer','address','info','contact'
hbase>put 'customer','zhangsan','contact:phone','101'
hbase>put 'customer','zhangsan','address:province','hubei'
hbase>put 'customer','zhangsan','address:city','wuhan'
hbase>put 'customer','zhangsan','address:country','china'
hbase>put 'customer','zhangsan','info:age','24'
hbase>put 'customer','zhangsan','info:company','douyu'
hbase>put 'customer','lisi','contact:phone','102'
hbase>put 'customer','lisi','address:province','guangdong'
hbase>put 'customer','lisi','address:city','guangzhou'
hbase>put 'customer','lisi','address:country','china'
hbase>put 'customer','lisi','info:age','24'
hbase>put 'customer','lisi','info:company','netease'

CREATE EXTERNAL TABLE `hbase_customer`(
  key string,
  province string,
  city string,
  country string,
  age int,
  company string,
  phone string
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,address:province,address:city,address:country,info:age,info:company,contact:phone")
TBLPROPERTIES("hbase.table.name" = "customer");
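To see how cells from three column families line up as a single Hive row, the following Python sketch models the customer table as plain dictionaries and projects each HBase row through the mapping used for hbase_customer (written without whitespace between entries). It is an illustration only: it does not connect to HBase or Hive, project_row is a hypothetical helper, and the cell values are taken from the table above.

```python
MAPPING = (
    ":key,address:province,address:city,address:country,"
    "info:age,info:company,contact:phone"
)
HIVE_COLUMNS = ["key", "province", "city", "country", "age", "company", "phone"]

# rowkey -> {"family:qualifier": value}, as stored by the put commands.
CELLS = {
    "zhangsan": {
        "address:province": "hubei", "address:city": "wuhan",
        "address:country": "china", "info:age": "24",
        "info:company": "douyu", "contact:phone": "101",
    },
    "lisi": {
        "address:province": "guangdong", "address:city": "guangzhou",
        "address:country": "china", "info:age": "24",
        "info:company": "netease", "contact:phone": "102",
    },
}


def project_row(rowkey, row_cells):
    """Build one Hive-style row from the cells of a single HBase row."""
    row = {}
    for column, entry in zip(HIVE_COLUMNS, MAPPING.split(",")):
        if entry == ":key":
            row[column] = rowkey
        else:
            # A cell that was never written shows up as NULL (None here).
            row[column] = row_cells.get(entry)
    return row


# HBase returns rows ordered by row key, so lisi comes before zhangsan.
for rowkey in sorted(CELLS):
    print(project_row(rowkey, CELLS[rowkey]))
```

The family names disappear on the Hive side: each Hive column is just a name bound to one family:qualifier pair, so columns from different families can sit next to each other in the same row.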

    http://www.cnblogs.com/makexu/


 
                
            
         