Essay Archive "October 2019" - newtest00

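Big Data -- Sqoop Incremental Data Import

Summary: 1. Create a new table stu in MySQL and insert some data; 2. Import the data from the MySQL table stu into Hive [root@bigdata113 ~]# sqoop import --connect jdbc:mysql://bigdata113:3306/mysqlhdfs --username root -- Read more

posted @ 2019-10-06 14:39 newtest00 Views (1515) Comments (0) Recommendations (0)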

Big Data -- Sqoop Data Import and Export

Summary: 1. Create the table student in MySQL; 2. Create the table student4 in MySQL; 3. Create the table student3 in Hive hive (default)> create table student3(id int,name string,sex string) > row format delimit Read more

posted @ 2019-10-05 17:18 newtest00 Views (348) Comments (0) Recommendations (0)

Big Data -- Hive Dynamic Partition Tuning

Summary: 1. Create an ordinary table and load data into it hive (default)> create table person(id int,name string,location string) > row format delimited fields terminated by '\t'; OK Time taken: 0 Read more

posted @ 2019-10-04 12:24 newtest00 Views (1275) Comments (0) Recommendations (0)

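Big Data -- Hive File Storage Formats

Summary: I. Hive file storage formats. The main data storage formats Hive supports are TEXTFILE, SEQUENCEFILE, ORC, and PARQUET. In the figure above, the left side is the logical table; on the right, the first is row-oriented storage and the second is column-oriented storage. Characteristics of row storage: when querying an entire row that satisfies a condition, column storage has to go to each clustered field to find the corresponding value of each column, whereas row storage only needs ... Read more

posted @ 2019-10-03 13:13 newtest00 Views (644) Comments (0) Recommendations (0)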
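
The row versus column storage contrast in the summary above can be sketched in a few lines of Python. This is only an illustrative in-memory model, not how Hive, ORC, or PARQUET lay out files; the table contents and the helper names (`fetch_row_from_row_store`, `fetch_row_from_column_store`, `scan_column`) are hypothetical and chosen for the example.

```python
# Hypothetical three-column table, used only for illustration.
column_names = ("id", "name", "location")
rows = [
    (1, "alice", "beijing"),
    (2, "bob", "shanghai"),
    (3, "carol", "beijing"),
]

# Row-oriented layout: all fields of one record are stored together.
row_store = list(rows)

# Column-oriented layout: all values of one column are stored together.
column_store = {
    name: [record[i] for record in rows]
    for i, name in enumerate(column_names)
}


def fetch_row_from_row_store(index):
    """Return a whole record with a single lookup."""
    return row_store[index]


def fetch_row_from_column_store(index):
    """Reassemble a record by visiting every column in turn."""
    return tuple(column_store[name][index] for name in column_names)


def scan_column(name):
    """Read one column without touching the others."""
    return column_store[name]
```

In this sketch, fetching a full row from the row layout is one lookup, while the column layout needs one lookup per column; conversely, `scan_column("location")` reads a single contiguous list in the column layout.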

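Big Data -- Hive Bucketed Queries && Compression Methods

Summary: I. Bucketing and sampling queries 1. Creating a bucketed table hive (db_test)> create table stu_buck(id int,name string) > clustered by(id) > into 4 buckets > row format delimited fields terminat Read more

posted @ 2019-10-03 12:59 newtest00 Views (1169) Comments (0) Recommendations (0)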
