continue Interpreting Hive SQL Execution Plans
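Background
When SQL statements run on Hive, it is often necessary to inspect the execution plan of a specific query to understand how Hive runs it internally. The author uses the following query as a worked example:
Example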
hive> explain
    > select count(1) from (
    >   select s_age
    >   from student_tb_txt
    >   group by s_age
    > ) b;
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-2 depends on stages: Stage-1
  Stage-0 depends on stages: Stage-2

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: student_tb_txt
            Statistics: Num rows: 4798993 Data size: 3359295744 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: s_age (type: bigint)
              outputColumnNames: s_age
              Statistics: Num rows: 4798993 Data size: 3359295744 Basic stats: COMPLETE Column stats: NONE
              Group By Operator
                keys: s_age (type: bigint)
                mode: hash
                outputColumnNames: _col0
                Statistics: Num rows: 4798993 Data size: 3359295744 Basic stats: COMPLETE Column stats: NONE
                Reduce Output Operator
                  key expressions: _col0 (type: bigint)
                  sort order: +
                  Map-reduce partition columns: _col0 (type: bigint)
                  Statistics: Num rows: 4798993 Data size: 3359295744 Basic stats: COMPLETE Column stats: NONE
      Execution mode: vectorized
      Reduce Operator Tree:
        Group By Operator
          keys: KEY._col0 (type: bigint)
          mode: mergepartial
          outputColumnNames: _col0
          Statistics: Num rows: 2399496 Data size: 1679647521 Basic stats: COMPLETE Column stats: NONE
          Select Operator
            Statistics: Num rows: 2399496 Data size: 1679647521 Basic stats: COMPLETE Column stats: NONE
            Group By Operator
              aggregations: count()
              mode: hash
              outputColumnNames: _col0
              Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
              File Output Operator
                compressed: false
                table:
                    input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                    serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe

  Stage: Stage-2
    Map Reduce
      Map Operator Tree:
          TableScan
            Reduce Output Operator
              sort order:
              Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
              value expressions: _col0 (type: bigint)
      Execution mode: vectorized
      Reduce Operator Tree:
        Group By Operator
          aggregations: count(VALUE._col0)
          mode: mergepartial
          outputColumnNames: _col0
          Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
          File Output Operator
            compressed: false
            Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
            table:
                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink

Time taken: 0.364 seconds, Fetched: 76 row(s)
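
To make the data flow in this plan easier to follow, here is a small Python sketch that imitates the two MapReduce stages on in-memory data. It is only an illustration and not Hive code: the function names (`stage1_map`, `shuffle`, `stage1_reduce`, `stage2_reduce`, `run_plan`), the sample rows, the number of reducers, and the `key % num_reducers` partitioner are all assumptions made for this example. Stage-1 deduplicates `s_age` on the map side (Group By, mode: hash), merges the keys on the reduce side (mode: mergepartial), and has each reducer emit a partial `count()`. Stage-2 sends those partial counts to a single reducer, which merges them into the final result.

```python
def stage1_map(rows):
    """Stage-1 map side: Select s_age, then Group By Operator (mode: hash).

    Each mapper keeps a hash set of the keys it has seen, so every distinct
    s_age value leaves the mapper at most once.
    """
    seen = set()
    for row in rows:
        seen.add(row["s_age"])
    # Reduce Output Operator: keys are emitted sorted ascending (sort order: +).
    return sorted(seen)


def shuffle(mapper_outputs, num_reducers):
    """Route every key to one reducer (Map-reduce partition columns: _col0).

    Assumption: a simple modulo partitioner stands in for the real one.
    """
    partitions = [[] for _ in range(num_reducers)]
    for keys in mapper_outputs:
        for key in keys:
            partitions[key % num_reducers].append(key)
    return [sorted(p) for p in partitions]


def stage1_reduce(keys):
    """Stage-1 reduce side.

    Group By (mode: mergepartial) merges duplicate keys coming from different
    mappers; the following Group By (count(), mode: hash) produces one partial
    count per reducer, which is written to an intermediate file.
    """
    merged = set(keys)
    return len(merged)


def stage2_reduce(partial_counts):
    """Stage-2: every partial count goes to a single reducer (empty sort key),
    where count(VALUE._col0) in mode mergepartial adds them up."""
    return sum(partial_counts)


def run_plan(splits, num_reducers=2):
    """Run Stage-1 and Stage-2 in dependency order; Stage-0 just fetches the result."""
    mapper_outputs = [stage1_map(split) for split in splits]
    partitions = shuffle(mapper_outputs, num_reducers)
    partial_counts = [stage1_reduce(p) for p in partitions]
    return stage2_reduce(partial_counts)


# Two hypothetical input splits of student_tb_txt, one per mapper.
splits = [
    [{"s_age": 18}, {"s_age": 19}, {"s_age": 18}],
    [{"s_age": 19}, {"s_age": 20}],
]
result = run_plan(splits)
print(result)  # 3 groups: 18, 19, 20
```

The sketch shows why the plan needs two jobs: the first job can only count groups per reducer, because each reducer sees just its own partition of keys, so a second job with a single reducer is required to add up the partial counts.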
