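Sorting
1. Global sort: order by
order by sorts the entire input globally, so the final sort runs in a single reducer. On large inputs this can be slow.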
select ymd,symbol,price_close from stocks order by symbol DESC;
select ymd,symbol,2*price_close as salary from stocks order by salary DESC;
select ymd,symbol,price_close from stocks order by ymd,symbol DESC;
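In the third query, DESC applies only to symbol; ymd keeps the default ascending order. The following is a minimal Python sketch of that ordering, not Hive code. The sample rows and the function name `order_by_ymd_symbol_desc` are made up for illustration.

```python
# Made-up sample rows: (ymd, symbol, price_close)
rows = [
    ("2019-01-02", "AAPL", 2.0),
    ("2019-01-01", "IBM", 3.0),
    ("2019-01-02", "IBM", 4.0),
    ("2019-01-01", "AAPL", 1.0),
]

def order_by_ymd_symbol_desc(rows):
    """Mimic `order by ymd, symbol DESC`: ymd ascending, symbol descending."""
    # Python's sort is stable, so sort on the secondary key first
    # and then on the primary key.
    by_symbol_desc = sorted(rows, key=lambda r: r[1], reverse=True)
    return sorted(by_symbol_desc, key=lambda r: r[0])

result = order_by_ymd_symbol_desc(rows)
for row in result:
    print(row)
```

Within each date, IBM comes before AAPL, while the dates themselves run from oldest to newest.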
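2. Per-reducer sort: sort by
sort by sorts the rows inside each reducer; the overall output is not guaranteed to be globally ordered.
Because no partition column is specified, rows are distributed across reducers effectively at random, which spreads the load and helps avoid data skew.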
(1) Set the number of reducers: set mapreduce.job.reduces=5;
(2) Show the current number of reducers: set mapreduce.job.reduces;
select ymd,symbol,price_close from stocks sort by symbol DESC;
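To see why sort by does not produce a global order, the behaviour can be mimicked in plain Python. This is only a sketch under assumptions: `sort_by`, the seeded `random.Random` used to spread rows, and the sample symbols are all hypothetical stand-ins, not Hive's actual implementation.

```python
import random

def sort_by(rows, key, num_reducers, seed=0, reverse=False):
    """Mimic `sort by`: spread rows over reducers, then sort each reducer."""
    rng = random.Random(seed)  # stand-in for Hive's row distribution
    reducers = [[] for _ in range(num_reducers)]
    for row in rows:
        reducers[rng.randrange(num_reducers)].append(row)
    # Each reducer sorts only the rows it received.
    return [sorted(part, key=key, reverse=reverse) for part in reducers]

# Made-up sample data: just the symbol column.
symbols = ["IBM", "AAPL", "MSFT", "GOOG", "AAPL", "IBM", "ORCL", "MSFT"]
parts = sort_by(symbols, key=lambda s: s, num_reducers=3, reverse=True)
for i, part in enumerate(parts):
    print(f"reducer {i}: {part}")
```

Each printed list is in descending order, but concatenating the lists generally is not, which is the difference from order by.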
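3. Partitioning: distribute by
distribute by is used together with sort by. It names the column used to partition rows across reducers, so rows with the same value in that column go to the same reducer.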
select ymd,symbol,price_close from stocks distribute by symbol sort by symbol;
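The effect of the query above can be sketched in Python: rows are first routed to a reducer by a hash of the partition column, then each reducer sorts its own rows. The helper `distribute_by`, the use of `zlib.crc32` as the hash, and the sample rows are assumptions made for this illustration; Hive's real hash function may differ.

```python
import zlib

def distribute_by(rows, key, num_reducers):
    """Mimic `distribute by`: equal key values land in the same reducer."""
    reducers = [[] for _ in range(num_reducers)]
    for row in rows:
        # crc32 stands in for Hive's hash function (assumption).
        idx = zlib.crc32(key(row).encode("utf-8")) % num_reducers
        reducers[idx].append(row)
    return reducers

# Made-up sample rows: (ymd, symbol)
rows = [
    ("2019-01-03", "IBM"), ("2019-01-02", "AAPL"), ("2019-01-02", "IBM"),
    ("2019-01-03", "AAPL"), ("2019-01-02", "MSFT"), ("2019-01-03", "MSFT"),
]

def by_symbol(row):
    return row[1]

# distribute by symbol, then sort by symbol inside each reducer
parts = [sorted(part, key=by_symbol) for part in distribute_by(rows, by_symbol, 2)]
for i, part in enumerate(parts):
    print(f"reducer {i}: {part}")
```

All rows for a given symbol end up in one reducer, and each reducer's rows are ordered by symbol.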
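4. Cluster By: partition and sort on the same column
When distribute by and sort by use the same column, that is, the partition column and the sort column are identical, cluster by can be used as shorthand. Like sort by, it orders rows within each reducer and does not guarantee a global order.
cluster by sorts in ascending order only; ASC or DESC cannot be specified.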
select ymd,symbol,price_close from stocks cluster by symbol;
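As a rough Python sketch of the shorthand, cluster by can be modelled as a partition-then-sort step in which one key plays both roles and the sort is always ascending. The function names, the `zlib.crc32` hash, and the sample symbols are hypothetical and only illustrate the idea.

```python
import zlib

def distribute_and_sort(rows, part_key, sort_key, num_reducers):
    """Mimic `distribute by part_key sort by sort_key` (ascending)."""
    reducers = [[] for _ in range(num_reducers)]
    for row in rows:
        # crc32 stands in for Hive's hash function (assumption).
        idx = zlib.crc32(part_key(row).encode("utf-8")) % num_reducers
        reducers[idx].append(row)
    return [sorted(part, key=sort_key) for part in reducers]

def cluster_by(rows, key, num_reducers):
    """Mimic `cluster by key`: the same key partitions and sorts."""
    return distribute_and_sort(rows, key, key, num_reducers)

def identity(s):
    return s

# Made-up sample data: just the symbol column.
symbols = ["IBM", "AAPL", "MSFT", "AAPL", "GOOG", "IBM"]
clustered = cluster_by(symbols, identity, 2)
explicit = distribute_and_sort(symbols, identity, identity, 2)
print(clustered)
```

In this model `clustered` and `explicit` are the same, mirroring how the cluster by query matches the distribute by symbol sort by symbol query from section 3.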
posted on 2019-01-09 10:02 happygril3