Commonly Used Hive Built-in Functions

1. concat function: concatenates strings; if any argument is NULL, the result is NULL
        hive> select concat('a','b');
        ab
        hive> select concat('a','b',null);
        NULL
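2. concat_ws function: concatenates strings with a separator, which must be given as the first argument. NULL strings are skipped, so as long as at least one string is not NULL the result is not NULL.
        hive> select concat_ws('-','a','b');
        a-b
        hive> select concat_ws('-','a','b',null);
        a-b
        hive> select concat_ws('','a','b',null);
        ab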
3. collect_set function
    1) Create the source table
        hive (gmall)>
        drop table if exists stud;
        create table stud (name string, area string, course string, score int);
    2) Insert data into the source table
        hive (gmall)>
        insert into table stud values('zhang3','bj','math',88);
        insert into table stud values('li4','bj','math',99);
        insert into table stud values('wang5','sh','chinese',92);
        insert into table stud values('zhao6','sh','chinese',54);
        insert into table stud values('tian7','bj','chinese',91);
    3) Query the data in the table
        hive (gmall)> select * from stud;
        stud.name       stud.area       stud.course     stud.score
        zhang3              bj              math            88
        li4                 bj              math            99
        wang5               sh              chinese         92
        zhao6               sh              chinese         54
        tian7               bj              chinese         91
    4) Aggregate the values from different rows of the same group into a deduplicated set
         hive (gmall)> select course, collect_set(area), avg(score) from stud group by course;
            chinese     ["sh","bj"]     79.0
            math        ["bj"]  93.5
    5) Use an index to pick a single element of the set
        hive (gmall)> select course, collect_set(area)[0], avg(score) from stud group by course;
            chinese     sh      79.0
            math        bj      93.5
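
    collect_set is often paired with concat_ws to flatten the set into one delimited string. The following is a minimal sketch, assuming the stud table created above; the alias names is introduced here for illustration only. The order of elements inside the set is not guaranteed, so no expected output is shown.

```sql
-- For each course, list the distinct student names as one comma-separated string.
-- collect_set(name) builds an array<string>; concat_ws joins its elements.
select course,
       concat_ws(',', collect_set(name)) as names
from stud
group by course;
```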
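4. str_to_map function
    1) Syntax
        str_to_map(VARCHAR text, VARCHAR listDelimiter, VARCHAR keyValueDelimiter)
    2) Description
        Splits text into key-value pairs using listDelimiter, then splits each pair into key and value using keyValueDelimiter, and returns the result as a MAP. In Hive the default listDelimiter is ',' and the default keyValueDelimiter is ':'.
    3) Example
        hive> select str_to_map('1001=2020-06-14,1002=2020-06-14', ',', '=');
        Output:
        {"1001":"2020-06-14","1002":"2020-06-14"}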
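
    Once the string is a MAP, a single value can be looked up by key with bracket syntax. A short sketch based on the example string above; the second query relies on Hive's default delimiters (',' between pairs, ':' between key and value).

```sql
-- Look up one key in the map built from the example string.
select str_to_map('1001=2020-06-14,1002=2020-06-14', ',', '=')['1001'];
-- 2020-06-14

-- With both delimiters omitted, Hive splits pairs on ',' and key/value on ':'.
select str_to_map('a:1,b:2');
-- {"a":"1","b":"2"}
```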

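5. nvl function
    1) Basic syntax (equivalent to an if expression)
        NVL(expr1, expr2)
        If expr1 is NULL, NVL returns the value of expr2; otherwise it returns the value of expr1.
        The purpose of the function is to turn a NULL into an actual value. The expressions may be numeric, character, or date values, but expr1 and expr2 must have the same data type.
    2) Example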
        hive (gmall)> select nvl(1,0);
            1
        hive (gmall)> select nvl(null,"hello");
            hello
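
    To make the equivalence with if concrete, the sketch below writes the same default-value logic both ways. It assumes the stud table from the collect_set section; the aliases score_nvl and score_if are illustrative only. The two columns hold the same value on every row.

```sql
-- nvl(score, 0) and if(score is null, 0, score) express the same rule:
-- use 0 when score is NULL, otherwise keep score.
select name,
       nvl(score, 0)              as score_nvl,
       if(score is null, 0, score) as score_if
from stud;
```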
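6. coalesce function
    1) Basic syntax (an enhanced nvl: it accepts any number of arguments, checks them in order, and returns the first one that is not NULL)
        coalesce(expr1, expr2, expr3, ...)
    2) Example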
        select  *
            from tmp_login
            full outer join tmp_cart on tmp_login.user_id=tmp_cart.user_id
            full outer join tmp_order on nvl(tmp_login.user_id,tmp_cart.user_id)=tmp_order.user_id
            full outer join tmp_payment on coalesce(tmp_login.user_id,tmp_cart.user_id,tmp_order.user_id) = tmp_payment.user_id
            full outer join tmp_detail on coalesce(tmp_login.user_id,tmp_cart.user_id,tmp_order.user_id,tmp_payment.user_id) = tmp_detail.user_id ;
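
    The same idea on a smaller scale, as a sketch: the first query shows the left-to-right evaluation on literals, and the second merges the user_id columns of a two-table full outer join into a single column. The table names are taken from the example above; the alias user_id is an assumption.

```sql
-- Returns the first non-NULL argument, scanning left to right.
select coalesce(null, null, 'c', 'd');
-- c

-- After a full outer join, either side's user_id may be NULL;
-- coalesce picks whichever one is present.
select coalesce(tmp_login.user_id, tmp_cart.user_id) as user_id
from tmp_login
full outer join tmp_cart on tmp_login.user_id = tmp_cart.user_id;
```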

7. Date functions
    1) date_format (formats a date according to a pattern)
         hive (gmall)> select date_format('2020-06-14','yyyy-MM');
         2020-06
    2) date_add (adds or subtracts a number of days)
        hive (gmall)> select date_add('2020-06-14',-1);
        2020-06-13
        hive (gmall)> select date_add('2020-06-14',1);
        2020-06-15
    3) next_day
        (1) Get the first Monday after the given date
            hive (gmall)> select next_day('2020-06-14','MO');
            2020-06-15
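            Note: the second argument is the English name of the day of the week (Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday); it may be given as the first two letters, the first three letters, or the full name.
        (2) Get the Monday of the current week
            hive (gmall)> select date_add(next_day('2020-06-14','MO'),-7);
            2020-06-08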
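
        The same trick gives the Sunday that ends the current week (taking weeks as Monday through Sunday): step to the next Monday, then go back one day. A small sketch using the same date as above, which is itself a Sunday.

```sql
-- next_day('2020-06-14','MO') is 2020-06-15; one day earlier is the week's Sunday.
select date_add(next_day('2020-06-14','MO'), -1);
-- 2020-06-14
```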
    4) last_day (returns the last day of the month)
        hive (gmall)> select last_day('2020-06-14');
        2020-06-30
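    5) unix_timestamp: get the Unix timestamp of a given date (the result depends on the time zone in effect; the outputs below correspond to UTC)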
        hive (gmall)> select unix_timestamp('2021-04-15 12:45:00');
            1618490700
        hive (gmall)> select unix_timestamp('2021-04-15','yyyy-MM-dd');
            1618444800
        hive (gmall)> select unix_timestamp('20210415','yyyyMMdd');
            1618444800
    6) from_unixtime: get the date from a Unix timestamp
        hive (gmall)> select from_unixtime(1618444800);
            2021-04-15 00:00:00
        hive (gmall)> select from_unixtime(1618444800,'yyyy-MM-dd');
            2021-04-15
        hive (gmall)> select from_unixtime(1618444800,'yyyyMMdd');
            20210415
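
        Chaining the two functions is a common way to convert a date string from one format to another. A minimal sketch: because parsing and formatting use the same time zone, the result does not depend on it.

```sql
-- Parse a compact yyyyMMdd string, then print it as yyyy-MM-dd.
select from_unixtime(unix_timestamp('20210415','yyyyMMdd'), 'yyyy-MM-dd');
-- 2021-04-15
```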
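    7) datediff: returns the number of days between two dates. After a GROUP BY, one-in-one-out (scalar) functions such as this can still be used in the select list, provided their arguments are constants or grouping columns.
        select datediff('2020-06-15','2020-06-13');
        -- 2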
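
        The remark about grouped queries can be sketched as follows, assuming the stud table from the collect_set section; the aliases are illustrative. The last query shows that the argument order is (end date, start date), so swapping the arguments flips the sign.

```sql
-- Scalar functions in a grouped query: arguments are either constants
-- or the grouping column itself.
select area,
       upper(area)                          as area_upper,  -- grouping column
       datediff('2020-06-15', '2020-06-13') as day_diff,    -- constants
       count(*)                             as cnt
from stud
group by area;

-- Argument order matters: datediff(end_date, start_date).
select datediff('2020-06-13', '2020-06-15');
-- -2
```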
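    8) add_months: adds or subtracts a number of months
        Example: get the first and last day of the month before a given month
            select unix_timestamp('202106','yyyyMM');
            -- 1622476800 (2021-06-01 00:00:00 in a UTC+8 time zone; under UTC the value is 1622505600)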

            select from_unixtime(unix_timestamp('202106','yyyyMM'),'yyyy-MM-dd HH:mm:ss');
            -- 2021-06-01 00:00:00

            select add_months(from_unixtime(unix_timestamp('202106','yyyyMM'),'yyyy-MM-dd HH:mm:ss'),-1);
            -- 2021-05-01

            SELECT last_day(from_unixtime(unix_timestamp('202106','yyyyMM'),'yyyy-MM-dd HH:mm:ss'));
            -- 2021-06-30

            select add_months(last_day(from_unixtime(unix_timestamp('202106','yyyyMM'),'yyyy-MM-dd HH:mm:ss')),-1);
            -- 2021-05-31
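
        When the input is already a yyyy-MM-dd date string, the round trip through unix_timestamp can be skipped. The sketch below swaps in Hive's trunc(date, 'MM') function, which is not used in the example above, to reach the first day of the month; the date 2021-06-15 is an arbitrary illustration.

```sql
-- First day of the previous month: truncate to the month start, then go back one month.
select add_months(trunc('2021-06-15', 'MM'), -1);
-- 2021-05-01

-- Last day of the previous month: go back one month, then take that month's last day.
select last_day(add_months('2021-06-15', -1));
-- 2021-05-31
```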
posted @ 2021-06-28 16:25 521pingguo1314