Post category - 2.3--Hive
Summary: row_number, rank, dense_rank, percent_rank
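To make the difference between these four ranking functions concrete, here is a minimal pure-Python sketch of their standard SQL semantics over a single partition ordered ascending. The helper name `ranking` and the sample scores are hypothetical and not taken from the post; this is an illustration, not Hive's implementation.

```python
# Pure-Python sketch of row_number, rank, dense_rank and percent_rank
# over one partition. Sample data is hypothetical.

def ranking(values):
    """Return (value, row_number, rank, dense_rank, percent_rank) tuples
    for the values ordered ascending."""
    ordered = sorted(values)
    n = len(ordered)
    rows = []
    rank = dense = 0
    prev = object()  # sentinel that compares unequal to every real value
    for i, v in enumerate(ordered, start=1):
        if v != prev:
            rank = i      # rank jumps to the current row position after ties
            dense += 1    # dense_rank advances by exactly one per distinct value
            prev = v
        # percent_rank = (rank - 1) / (rows in partition - 1)
        pct = (rank - 1) / (n - 1) if n > 1 else 0.0
        rows.append((v, i, rank, dense, pct))
    return rows


if __name__ == "__main__":
    for row in ranking([90, 85, 85, 70]):
        print(row)
```

Tied values share the same rank and dense_rank but always get distinct row_number values; rank leaves a gap after a tie, dense_rank does not.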
Summary: val df6 = spark.sql("select gender,children,max(age),avg(age),count(age) from Affairs group by Cube(gender,children) order by 1,2") df6.show +------+--------+--------+--------+----------+ ...
Summary: mean, variance, stddev (standard deviation), corr (Pearson correlation coefficient), skewness, kurtosis
Summary: collect_set removes duplicate elements; collect_list keeps them. select gender, concat_ws(',', collect_set(children)), concat_ws(',', collect_list(children)) from Affairs group b
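The contrast between the two aggregates can be sketched in plain Python. This is only an illustration under stated assumptions: the helper name `collect` and the sample rows are hypothetical, the `gender` and `children` column names come from the query above, and the deduplicated output here keeps first-seen order for readability, whereas Hive does not promise any particular element order.

```python
from collections import defaultdict

# Pure-Python sketch of collect_set vs collect_list per group.
# Sample rows are hypothetical (gender, children) pairs.

def collect(rows):
    """Group (key, value) pairs and return
    {key: (deduplicated values, all values)}."""
    grouped = defaultdict(list)
    for key, value in rows:
        grouped[key].append(value)
    # dict.fromkeys drops duplicates while keeping first-seen order
    return {k: (list(dict.fromkeys(v)), v) for k, v in grouped.items()}


if __name__ == "__main__":
    sample = [("male", 0), ("male", 2), ("male", 2),
              ("female", 1), ("female", 1)]
    for gender, (as_set, as_list) in collect(sample).items():
        # Mirrors concat_ws(',', ...) by joining with commas
        print(gender, ",".join(map(str, as_set)), ",".join(map(str, as_list)))
```

For the `male` group the set-style result holds each value once, while the list-style result repeats `2` because it appears in two input rows.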
Summary: Describe Database Describe Table/View/Column hive> DESCRIBE user_info_bucketed; user_id bigint firstname string lastname string ds string # Partition
Summary: hive> SHOW FUNCTIONS; ! != % & * + - / = == > >= ^ abs acos add_months and array array_contains ascii asin assert_true atan avg base64 between bin case cbrt ceil ceiling coalesce coll...
posted @ 2015-07-25 20:45
智能先行者
Summary: Show Tables, Show Partitions, Show Table Properties, Show Create Table, Show Indexes, Show Columns, Show Functions, Show Conf
posted @ 2015-07-25 10:54
智能先行者
Summary: mysql> select TBL_ID,CREATE_TIME,LAST_ACCESS_TIME,TBL_NAME,TBL_TYPE from TBLS; +--------+-------------+------------------+----------------------+---------------+ | TBL_ID | CREATE_TIME | LAST_ACCESS_...
posted @ 2015-07-20 23:52
智能先行者
Summary: mysql> select concat('Hadoop:','Hive:','Spark#','HBase;',TBL_TYPE,'{}',SD_ID) from TBLS; | Hadoop:Hive:Spark#HBase;MANAGED_TABLE{}6 | | Hadoop:Hive:Sp
Summary: The CLUSTERED BY and SORTED BY creation commands do not affect how data is inserted into a table – only how it is read. This means that users must be ...
posted @ 2015-07-20 22:54
智能先行者
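Summary: Reposted from http://shiyanjun.cn/archives/588.html. Hive is built on the Hadoop platform and provides HQL, a SQL-like query language. With Hive, if you have used SQL but do not understand how Hadoop MapReduce works, and therefore cannot implement MR jobs programmatically, you can still easily write specific...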
posted @ 2015-07-17 22:29
智能先行者
Summary: [hadoop@master hadoop]$ hive -S -e 'set -v'|grep querylog|grep -E -v 'CLASSPATH|class' hive.querylog.enable.plan.progress=true hive.querylog.location=/h...
posted @ 2015-07-12 20:26
智能先行者
Summary: hive> desc database extended wx_test; OK wx_test hdfs://ns1/user/hive/warehouse/wx_test.db hadoop USER {t_date=2015-06-21, creator=wx} Time taken: 0.027 seconds, Fetched: 1 row(s) hive> desc form...
posted @ 2015-07-12 20:17
智能先行者
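Summary: Hive is a data warehouse analysis system built on Hadoop. It offers rich SQL query capabilities for analyzing data stored in the Hadoop Distributed File System: it can map structured data files to database tables, provides full SQL query functionality, and translates SQL statements into MapReduce jobs for execution, so that you can use your own SQL to query and analyze the needed...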
posted @ 2015-03-29 16:09
智能先行者
Summary: SELECT E.DEPARTMENT_ID DID, E.JOB_ID JOB, E.MANAGER_ID MID, SUM(E.SALARY) SUM_SAL, COUNT(E.EMPLOYEE_ID) CNT, GROUP_ID() GG FROM EMPLOYEES E WHERE E.JOB_ID IN ('S...