BenjaminYang In solitude, where we are least alone

hive学习(二) hive操作

 

1.创建库

create database test;

2.删除库

drop database test;
 

3.建表

完整ddl建表语法规则
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name    -- (Note: TEMPORARY available in Hive 0.14.0 and later)
  [(col_name data_type [COMMENT col_comment], ... [constraint_specification])]
  [COMMENT table_comment]
  [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
  [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
  [SKEWED BY (col_name, col_name, ...)                  -- (Note: Available in Hive 0.10.0 and later)]
     ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)
     [STORED AS DIRECTORIES]
  [
   [ROW FORMAT row_format]
   [STORED AS file_format]
     | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]  -- (Note: Available in Hive 0.6.0 and later)
  ]
  [LOCATION hdfs_path]
  [TBLPROPERTIES (property_name=property_value, ...)]   -- (Note: Available in Hive 0.6.0 and later)
  [AS select_statement];   -- (Note: Available in Hive 0.5.0 and later; not supported for external tables)

4.hive的数据类型

: primitive_type
  | array_type
  | map_type
  | struct_type
:primitive_type
  |TINYINT
  | SMALLINT
  | INT
  | BIGINT
  | BOOLEAN
  | FLOAT
  | DOUBLE
  | STRING

5.建表例子:

如下  文件data
对应字段:   id,姓名,爱好,住址
1,小明1,lol-book-movie,beijing:changping-shanghai:pudong
2,小明2,lol-book-movie,beijing:changping-shanghai:pudong
3,小明3,lol-book-movie,beijing:changping-shanghai:pudong
4,小明4,lol-book-movie,beijing:changping-shanghai:pudong
5,小明5,lol-movie,beijing:changping-shanghai:pudong
6,小明6,lol-book-movie,beijing:changping-shanghai:pudong
7,小明7,lol-book,beijing:changping-shanghai:pudong
8,小明8,lol-book,beijing:changping-shanghai:pudong
9,小明9,lol-book-movie,beijing:changping-shanghai:pudong
id int , 姓名 string,爱好  数组类型,住址 map类型
 
 

5.1创建语句分析

create table psn (
id int,
name string,
hobby array<string>,
address map<string,string>
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY '-'
MAP KEYS TERMINATED BY ':'
LINES TERMINATED BY '\n';
通过hive建表时不但定义了字段的类型,还定义了字段与字段之间的分隔符,数据类型之间的分隔符,每行数据之间的分隔符。
 
 

5.2执行创建

hive> create table psn (
    > id int,
    > name string,
    > hobby array<string>,
    > address map<string,string>
    > )
    > ROW FORMAT DELIMITED
    > FIELDS TERMINATED BY ','
    > COLLECTION ITEMS TERMINATED BY '-'
    > MAP KEYS TERMINATED BY ':'
    > LINES TERMINATED BY '\n';
OK
Time taken: 0.758 seconds
 

5.3查看结构

hive> desc psn;
OK
id                      int                                         
name                    string                                      
hobby                   array<string>                               
address                 map<string,string>                          
Time taken: 0.267 seconds, Fetched: 4 row(s)
 
#详细查看
hive> desc formatted psn;
 

5.4插入数据

 
说插入有点不合适,应该是加载文件(data.txt)中的数据

5.5语法:

LOAD DATA [LOCAL] INPATH 'path' INTO TABLE psn; 
 

5.6执行加载

hive> load data local inpath '/root/data' into table psn;
Loading data to table default.psn
OK
Time taken: 3.591 seconds
 
发现使用这种方法插入数据简直快的飞起啊。
 

5.7执行查询

hive> select * from psn;
OK
1    小明1    ["lol","book","movie"]    {"beijing":"changping","shanghai":"pudong"}
2    小明2    ["lol","book","movie"]    {"beijing":"changping","shanghai":"pudong"}
3    小明3    ["lol","book","movie"]    {"beijing":"changping","shanghai":"pudong"}
4    小明4    ["lol","book","movie"]    {"beijing":"changping","shanghai":"pudong"}
5    小明5    ["lol","movie"]    {"beijing":"changping","shanghai":"pudong"}
6    小明6    ["lol","book","movie"]    {"beijing":"changping","shanghai":"pudong"}
7    小明7    ["lol","book"]    {"beijing":"changping","shanghai":"pudong"}
8    小明8    ["lol","book"]    {"beijing":"changping","shanghai":"pudong"}
9    小明9    ["lol","book","movie"]    {"beijing":"changping","shanghai":"pudong"}
Time taken: 0.235 seconds, Fetched: 9 row(s)
 

5.8注意,

分隔符的格式如果不匹配它不会报错,而是插入null的空数据。
 

6.hive内部表和外部表

6.1hive内部表语法

CREATE  TABLE [IF NOT EXISTS] table_name
删除表时,元数据与表结构都会被删除

6.2hive外部表语法

CREATE EXTERNAL TABLE [IF NOT EXISTS] table_name LOCATION hdfs_path
删除表时,只删除表结构,不删除元数据。

6.3创建外部表

create external table psn (
id int,
name string,
hobby array<string>,
address map<string,string>
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY '-'
MAP KEYS TERMINATED BY ':'
LINES TERMINATED BY '\n'
location '/usr/';
创建前 hdfs下的目录 没有 usr目录

6.4执行创建外部表

hive> create external table psn (
    > id int,
    > name string,
    > hobby array<string>,
    > address map<string,string>
    > )
    > ROW FORMAT DELIMITED
    > FIELDS TERMINATED BY ','
    > COLLECTION ITEMS TERMINATED BY '-'
    > MAP KEYS TERMINATED BY ':'
    > LINES TERMINATED BY '\n'
    > location '/usr/';
OK
Time taken: 0.188 seconds
再次查看网页hdfs下的目录
 
外部表执行删除只会删除表结构,不删除元数据。
 
刷新网页usr目录还在

6.5内部表和外部表的区别:

1.创建时需要制定目录
2.外部表执行删除只会删除表结构,不删除元数据。内部表回删除表结构和删除元数据。
posted @ 2019-01-08 17:19  benjamin杨  阅读(318)  评论(0编辑  收藏  举报