ODPS零记录分区覆盖
在日常的工作中,我们会碰到insert overwrite未成功覆盖分区的问题,今天我们来想详细探讨一下这个问题。
这种情况通常是由于后面的查询语句记录数为0造成,首先我们来看两个例子。
1.单分区
先建表
create table s_aaronbu_test1( id bigint ) partitioned by (pt string) ;
插入一条记录
insert overwrite table s_aaronbu_test1 partition(pt='20160101') select 1 as id from dual;
查询记录
select * from s_aaronbu_test1;
+------------+----+
| id | pt |
+------------+----+
| 1 | 20160101 |
+------------+----+
表中现在有一条记录,现在往该分区插入0条记录动态分区进行覆盖
insert overwrite table s_aaronbu_test1 partition(pt) select 2 as id,'20160101' as pt from dual where 1=0;
......
R2_1_Stg1:
instance count: 1
run time: 0.000
instance time:
min: 0.000, max: 0.000, avg: 0.000
input records:
input: 0 (min: 0, max: 0, avg: 0)
output records:
OK
查看表中记录
select * from s_aaronbu_test1;
+------------+----+
| id | pt |
+------------+----+
| 1 | 20160101 |
+------------+----+
记录未改变,分区未变化,接下来换成指定分区的方式
insert overwrite table s_aaronbu_test1 partition(pt='20160101') select 2 as id from dual where 1=0;
......
M1_Stg1:
instance count: 1
run time: 0.000
instance time:
min: 0.000, max: 0.000, avg: 0.000
input records:
input: 1 (min: 1, max: 1, avg: 1)
output records:
M1_Stg1FS_12375901: 0 (min: 0, max: 0, avg: 0)
OK
查看表中记录
select * from s_aaronbu_test1;
+------------+----+
| id | pt |
+------------+----+
+------------+----+
表中无数据,分区成功覆盖。
由此可以看出,单分区,动态分区不会被零记录分区覆盖,输出记录为空;指定分区会被零记录分区覆盖,输出记录为0.
2.多分区
先建表
create table s_aaronbu_test2( id bigint ) partitioned by (pt string,type bigint) ;
插入一条记录
insert overwrite table s_aaronbu_test2 partition(pt='20160101',type) select 1 as id,1 as type from dual;
查询记录
select * from s_aaronbu_test2;
+------------+----+------+
| id | pt | type |
+------------+----+------+
| 1 | 20160101 | 1 |
+------------+----+------+
表中现在有一条记录,现在往该分区插入0条记录动态分区进行覆盖
insert overwrite table s_aaronbu_test2 partition(pt='20160101',type) select 2 as id,1 as type from dual where 1=0;
......
R2_1_Stg1:
instance count: 1
run time: 0.000
instance time:
min: 0.000, max: 0.000, avg: 0.000
input records:
input: 0 (min: 0, max: 0, avg: 0)
output records:
OK
查看表中记录
select * from s_aaronbu_test2;
+------------+----+------+
| id | pt | type |
+------------+----+------+
| 1 | 20160101 | 1 |
+------------+----+------+
记录未改变,分区未变化,接下来换成指定分区的方式
insert overwrite table s_aaronbu_test2 partition(pt='20160101',type=1) select 2 as id from dual where 1=0;
......
M1_Stg1:
instance count: 1
run time: 1.000
instance time:
min: 1.000, max: 1.000, avg: 1.000
input records:
input: 1 (min: 1, max: 1, avg: 1)
output records:
M1_Stg1FS_12382646: 0 (min: 0, max: 0, avg: 0)
OK
查看表中记录
select * from s_aaronbu_test2;
+------------+----+------+
| id | pt | type |
+------------+----+------+
+------------+----+------+
表中无数据,分区成功覆盖。
由此可以看出,多分区的覆盖规则和单分区一致。
综上所述,无论单分区还是多分区,动态分区不会被零记录分区覆盖,输出记录为空;指定分区会被零记录分区覆盖,输出记录为0.

浙公网安备 33010602011771号