Post category - 2.3--Hive

Summary: row_number, rank, dense_rank, percent_rank
posted @ 2016-11-25 18:34 智能先行者 Views (8075) Comments (0) Recommendations (0)
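The four ranking functions named above differ only in how they treat ties. The following is a minimal plain-Python sketch of the standard SQL semantics, not Hive or Spark code; the helper name `rank_rows` and the sample values are hypothetical, and a single ascending ordering with no partitioning is assumed.

```python
def rank_rows(values):
    """Return (value, row_number, rank, dense_rank, percent_rank) per row.

    Rows are ordered ascending by value, as with ORDER BY value in one partition.
    """
    ordered = sorted(values)
    n = len(ordered)
    rows = []
    rank = 0          # position of the first row in the current tie group
    dense = 0         # number of distinct values seen so far
    prev = object()   # sentinel that never equals a real value
    for row_number, value in enumerate(ordered, start=1):
        if value != prev:
            rank = row_number   # rank leaves gaps after a tie group
            dense += 1          # dense_rank never leaves gaps
            prev = value
        # percent_rank is (rank - 1) / (rows in partition - 1)
        percent = (rank - 1) / (n - 1) if n > 1 else 0.0
        rows.append((value, row_number, rank, dense, percent))
    return rows


ranked = rank_rows([10, 20, 20, 30])
for row in ranked:
    print(row)
```

With the tied value 20, row_number keeps counting (2, 3), rank repeats and then skips (2, 2, 4), and dense_rank repeats without skipping (2, 2, 3).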
Summary: val df6 = spark.sql("select gender,children,max(age),avg(age),count(age) from Affairs group by Cube(gender,children) order by 1,2") df6.show +------+--------+--------+--------+----------+ ...
posted @ 2016-11-25 18:23 智能先行者 Views (3324) Comments (1) Recommendations (0)
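GROUP BY CUBE(gender, children) aggregates over every subset of the grouping columns: (gender, children), (gender), (children), and the grand total. The sketch below illustrates that expansion in plain Python with a simple row count. It is not the Spark implementation, and the helper `cube_count` and the sample rows are hypothetical.

```python
from itertools import combinations


def cube_count(rows, num_columns):
    """Count rows for every subset of grouping columns.

    A column that is rolled up in a given grouping is shown as None,
    mirroring the NULL that CUBE output uses for subtotal rows.
    """
    results = {}
    for size in range(num_columns, -1, -1):
        for kept in combinations(range(num_columns), size):
            for row in rows:
                key = tuple(row[i] if i in kept else None
                            for i in range(num_columns))
                results[key] = results.get(key, 0) + 1
    return results


# Hypothetical (gender, children) rows; not taken from the Affairs data set.
sample = [("male", 0), ("male", 1), ("female", 0)]
cube = cube_count(sample, 2)
for key, count in cube.items():
    print(key, count)
```

Two grouping columns yield four grouping sets, so the output mixes detail rows, per-gender subtotals, per-children subtotals, and one grand total.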
Summary: mean, variance, stddev (standard deviation), corr (Pearson correlation coefficient), skewness, kurtosis
posted @ 2016-11-25 17:55 智能先行者 Views (9411) Comments (0) Recommendations (0)
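For reference, the statistics listed above can be written out directly from their definitions. This is a plain-Python sketch under stated assumptions: it uses population moments and excess kurtosis, and reports both population and sample variance, because whether a given engine's `variance` or `stddev` means the sample or the population form depends on the engine and version and should be checked there. The function names and data are hypothetical.

```python
import math


def describe(xs):
    """Moment-based summary statistics (population moments, excess kurtosis)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in xs) / n   # third central moment
    m4 = sum((x - mean) ** 4 for x in xs) / n   # fourth central moment
    return {
        "mean": mean,
        "variance_pop": m2,
        "variance_samp": m2 * n / (n - 1),
        "stddev_pop": math.sqrt(m2),
        "stddev_samp": math.sqrt(m2 * n / (n - 1)),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis_excess": m4 / m2 ** 2 - 3,
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


stats = describe([1, 2, 3, 4, 5])
corr = pearson([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(stats)
print(corr)
```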
Summary: collect_set removes duplicate elements; collect_list keeps duplicates. select gender, concat_ws(',', collect_set(children)), concat_ws(',', collect_list(children)) from Affairs group b
posted @ 2016-11-25 17:19 智能先行者 Views (14452) Comments (0) Recommendations (2)
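The difference between the two aggregates can be sketched in plain Python as follows. This only illustrates the semantics and is not Hive or Spark code; the sample rows and the helper `collect_by_gender` are hypothetical. The sketch keeps first-seen order for the de-duplicated values so its output is deterministic, whereas the element order of collect_set in a real engine should not be relied on.

```python
from collections import defaultdict


def collect_by_gender(rows):
    """Group (gender, children) rows and join the children values with commas.

    Returns {gender: (set_style_string, list_style_string)}.
    """
    as_list = defaultdict(list)   # keeps every value, like collect_list
    as_set = defaultdict(list)    # keeps each value once, like collect_set
    for gender, children in rows:
        as_list[gender].append(children)
        if children not in as_set[gender]:
            as_set[gender].append(children)
    return {
        gender: (
            ",".join(str(v) for v in as_set[gender]),    # concat_ws(',', collect_set(...))
            ",".join(str(v) for v in as_list[gender]),   # concat_ws(',', collect_list(...))
        )
        for gender in as_list
    }


# Hypothetical rows; not taken from the Affairs data set.
sample_rows = [("male", 0), ("male", 2), ("male", 0), ("female", 1), ("female", 1)]
collected = collect_by_gender(sample_rows)
print(collected)
```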
Summary: Describe Database Describe Table/View/Column hive> DESCRIBE user_info_bucketed; user_id bigint firstname string lastname string ds string # Partition
posted @ 2015-07-25 21:12 智能先行者 Views (4542) Comments (0) Recommendations (1)
Summary: hive> SHOW FUNCTIONS; ! != % & * + - / = == > >= ^ abs acos add_months and array array_contains ascii asin assert_true atan avg base64 between bin case cbrt ceil ceiling coalesce coll...
posted @ 2015-07-25 20:45 智能先行者
Summary: Show Tables Show Partitions Show Table Properties Show Create Table Show Indexes Show Columns Show Functions Show Conf
posted @ 2015-07-25 10:54 智能先行者
Summary: mysql> select TBL_ID,CREATE_TIME,LAST_ACCESS_TIME,TBL_NAME,TBL_TYPE from TBLS; +--------+-------------+------------------+----------------------+---------------+ | TBL_ID | CREATE_TIME | LAST_ACCESS_...
posted @ 2015-07-20 23:52 智能先行者
Summary: mysql> select concat('Hadoop:','Hive:','Spark#','HBase;',TBL_TYPE,'{}',SD_ID) from TBLS; | Hadoop:Hive:Spark#HBase;MANAGED_TABLE{}6 | | Hadoop:Hive:Sp
posted @ 2015-07-20 23:37 智能先行者 Views (496) Comments (0) Recommendations (0)
Summary: The CLUSTERED BY and SORTED BY creation commands do not affect how data is inserted into a table – only how it is read. This means that users must be ...
posted @ 2015-07-20 22:54 智能先行者
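Summary: Reposted from http://shiyanjun.cn/archives/588.html Hive is built on the Hadoop platform and provides HQL, a SQL-like query language. With Hive, if you have used SQL but do not understand how Hadoop MapReduce works, and therefore cannot implement MR jobs programmatically, you can still easily write specific...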
posted @ 2015-07-17 22:29 智能先行者
Summary: [hadoop@master hadoop]$ hive -S -e 'set -v'|grep querylog|grep -E -v 'CLASSPATH|class' hive.querylog.enable.plan.progress=true hive.querylog.location=/h...
posted @ 2015-07-12 20:26 智能先行者
Summary: hive> desc database extended wx_test; OK wx_test hdfs://ns1/user/hive/warehouse/wx_test.db hadoop USER {t_date=2015-06-21, creator=wx} Time taken: 0.027 seconds, Fetched: 1 row(s) hive> desc form...
posted @ 2015-07-12 20:17 智能先行者
Summary: LATERAL VIEW explode
posted @ 2015-06-24 22:50 智能先行者
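LATERAL VIEW explode turns each element of an array column into its own output row, joined back to the other columns of the source row. A row whose array is empty produces no output unless LATERAL VIEW OUTER is used. The plain-Python sketch below illustrates only this row expansion; the helper `lateral_view_explode`, its `outer` flag, and the sample rows are hypothetical.

```python
def lateral_view_explode(rows, outer=False):
    """Expand (key, items) rows into one (key, item) row per array element.

    With outer=True, a row with an empty array is kept once with item None,
    similar in spirit to LATERAL VIEW OUTER.
    """
    result = []
    for key, items in rows:
        if not items and outer:
            result.append((key, None))
        for item in items:
            result.append((key, item))
    return result


# Hypothetical rows: a key plus an array column.
source_rows = [("a", [1, 2]), ("b", []), ("c", [3])]
exploded = lateral_view_explode(source_rows)
exploded_outer = lateral_view_explode(source_rows, outer=True)
print(exploded)
print(exploded_outer)
```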
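Summary: Hive is a data warehouse analysis system built on Hadoop. It provides rich SQL querying for analyzing data stored in the Hadoop distributed file system: it can map structured data files to database tables and offer full SQL query functionality, translate SQL statements into MapReduce jobs for execution, and let you use your own SQL to query and analyze the needed...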
posted @ 2015-03-29 16:09 智能先行者
Summary: SELECT E.DEPARTMENT_ID DID, E.JOB_ID JOB, E.MANAGER_ID MID, SUM(E.SALARY) SUM_SAL, COUNT(E.EMPLOYEE_ID) CNT, GROUP_ID() GG FROM EMPLOYEES E WHERE E.JOB_ID IN ('S...
posted @ 2014-12-21 15:59 智能先行者 Views (1545) Comments (0) Recommendations (0)