1. Creating an Index

For example, suppose we have the following table:

create table employees(
  name string,
  salary float,
  subordinates array<string>,
  deductions map<string, float>,
  address struct<street:string, city:string, state:string, zip:int>
)
partitioned by (country string, state string);

Create an index on the partition column country:

create index employees_index
on table employees (country)
as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
with deferred rebuild
idxproperties('creator' = 'me', 'created_at' = 'some_time')
in table employees_index_table
partitioned by (country, name)
comment 'Employees indexed by country and name.';

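If the partitioned by clause is omitted entirely, the index spans all partitions of the original table.
The as ... clause specifies the index handler, a Java class that implements the indexing interface.
An index handler is not required to store its index data in a new table, but when it does, the in table ... clause names that table. This clause supports many of the options available when creating other kinds of tables: the row format, stored as, stored by, and location clauses can all be added before the comment clause.
Currently, indexes can be built on external tables and views, except when the data resides in S3.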
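
As a rough illustration of the optional storage clauses, the sketch below declares a second index whose index table is stored as RCFile. The names employees_index_rc and employees_index_rc_table are hypothetical, and the partitioned by clause is left out so that the index covers every partition of employees. Which clauses are accepted may vary by Hive version, so treat this as a starting point, not a definitive statement.

```sql
-- Hypothetical names: employees_index_rc, employees_index_rc_table.
-- The storage clause (stored as rcfile) sits before the final comment clause.
create index employees_index_rc
on table employees (country)
as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
with deferred rebuild
in table employees_index_rc_table
stored as rcfile
comment 'Employees indexed by country, stored as RCFile.';
```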

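Bitmap Indexes
Hive v0.8.0 added a built-in bitmap index handler. Bitmap indexes are commonly used for columns with few distinct values. Here is the previous example rewritten to use the bitmap index handler: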

create index employees_index
on table employees (country)
as 'BITMAP'
with deferred rebuild
idxproperties('creator' = 'me', 'created_at' = 'some_time')
in table employees_index_table
partitioned by (country, name)
comment 'Employees indexed by country and name.';

2. Rebuilding an Index

If with deferred rebuild is specified, the new index starts out empty. At any time, the index can be built for the first time, or rebuilt, with alter index:

alter index employees_index
on employees
partition (country = 'US')
rebuild;

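If the partition clause is omitted, the index is rebuilt for all partitions.
If a rebuild fails, the index is left in the state it was in before the rebuild started.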
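
Rebuilding every partition at once would look roughly like the following, which is the earlier statement with the partition clause dropped. It is a sketch that follows the documented ALTER INDEX index_name ON table_name REBUILD form and assumes the employees_index index from section 1 already exists.

```sql
-- Rebuild employees_index for all partitions of employees.
alter index employees_index
on employees
rebuild;
```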

3. Showing Indexes

show formatted index on employees;

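The formatted keyword is optional; adding it includes column titles in the output. The keyword index can also be written as indexes, since the output may list more than one index.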
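
For comparison, a minimal variant without formatted and with the plural keyword might be written as below. It assumes the same employees table; the output columns are not shown here because they depend on the Hive version.

```sql
-- Same listing without column titles, using the plural keyword.
show indexes on employees;
```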

4. Dropping an Index

Dropping an index also drops its index table, if one exists:

drop index if exists employees_index on employees;

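If the indexed table is dropped, its index and index table are dropped as well. Likewise, if a partition of the original table is dropped, the corresponding index partition is dropped too.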
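
To make the partition case concrete, dropping a base-table partition could be sketched as follows. The partition values 'US' and 'CA' are hypothetical; per the behavior described above, the matching index partition would be removed along with it.

```sql
-- Hypothetical partition values; drops one partition of the base table.
alter table employees
drop if exists partition (country = 'US', state = 'CA');
```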

Posted on 2019-11-08 15:48 by xibuhaohao
