马士兵 hadoop2.7.3 Hive Getting Started

    • Hive getting started
    • Extract Hive into /usr/local and mv the extracted directory to hive. 
      Set the environment variables HADOOP_HOME and HIVE_HOME, and add their bin directories to PATH.
      1. cd /usr/local/hive/conf
      2. cp hive-default.xml.template hive-site.xml
      3. Set hive.metastore.schema.verification to false
      4. Create the directory /usr/local/hive/tmp and replace ${system:java.io.tmpdir} with it
      5. Replace ${system:user.name} with root
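The setup steps above can be sketched as shell commands. This is only a sketch: the HADOOP_HOME path and the sed substitution patterns are assumptions, not from the original notes, and the substitution is demonstrated on a one-line sample file rather than the real hive-site.xml:

```shell
# Sketch of the configuration steps above (assumed paths; adapt to your layout).
export HADOOP_HOME=/usr/local/hadoop
export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HADOOP_HOME/bin:$HIVE_HOME/bin

# Steps 4/5: replace the ${system:...} placeholders in hive-site.xml.
# Shown here on a one-line sample file so the substitution is visible:
printf '<value>${system:java.io.tmpdir}/${system:user.name}</value>\n' > /tmp/hive-site-sample.xml
sed -i 's#${system:java.io.tmpdir}#/usr/local/hive/tmp#g; s#${system:user.name}#root#g' /tmp/hive-site-sample.xml
cat /tmp/hive-site-sample.xml
```

On the real hive-site.xml the same sed commands would rewrite every placeholder occurrence in place.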
    • schematool -initSchema -dbType derby 
      This creates a metastore_db database in the current directory.
      Note!!! Next time you run hive, do it from the same directory: by default Hive looks for the metastore in the current directory.
      If you run into problems, delete metastore_db and re-run the command.
      In real-world environments, MySQL is commonly used as the metastore database.
    • Start hive
    • Observe the directories created under /tmp/hive with hadoop fs -ls /tmp/hive
    • show databases;
      use default;
      create table doc(line string);
      show tables;
      desc doc;
      select * from doc;
      drop table doc;
    • Observe hadoop fs -ls /user
    • Start YARN
    • load data inpath '/wcinput' overwrite into table doc;
      select * from doc;
      select split(line, ' ') from doc;
      select explode(split(line, ' ')) from doc;
      select word, count(1) as count from (select explode(split(line, ' ')) as word from doc) w group by word;
      select word, count(1) as count from (select explode(split(line, ' ')) as word from doc) w group by word order by word;
      create table word_counts as select word, count(1) as count from (select explode(split(line, ' ')) as word from doc) w group by word order by word;
      select * from word_counts;
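The explode + group-by word count above can be mimicked locally with standard shell tools, which makes the query's logic easy to see. This is a sketch only: the sample file /tmp/wcinput.txt is made up here and stands in for the /wcinput data on HDFS:

```shell
# Made-up sample standing in for the /wcinput data on HDFS.
printf 'hello world\nhello hive\n' > /tmp/wcinput.txt

# tr splits each line into words, like explode(split(line, ' '));
# sort | uniq -c groups and counts, like GROUP BY word;
# the final sort -k2 orders by word, like ORDER BY word.
tr ' ' '\n' < /tmp/wcinput.txt | sort | uniq -c | sort -k2 > /tmp/word_counts.txt
cat /tmp/word_counts.txt
```

Each output line holds a count and a word, the same shape as the word_counts table.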

      1. Inside the hive shell, run dfs -ls /user/hive/... to inspect the table's files
    • Experiment with the Sogou search logs
    • Upload the log file to HDFS and start hive
    • create table sougou (qtime string, qid string, qword string, url string) row format delimited fields terminated by ',';
      load data inpath '/sougou.dic' into table sougou;
      select count(*) from sougou;
      create table sougou_results as select keyword, count(1) as count from (select qword as keyword from sougou) t group by keyword order by count desc;
      select * from sougou_results limit 10;
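The top-keywords query above (group by keyword, order by count descending) can likewise be sketched with awk and sort. The three-line sample CSV below is invented for illustration; it only mirrors the sougou table's comma-separated qtime,qid,qword,url layout:

```shell
# Made-up sample rows in the sougou table's format: qtime,qid,qword,url.
printf '00:00:01,u1,hadoop,http://a\n00:00:02,u2,hive,http://b\n00:00:03,u3,hadoop,http://c\n' > /tmp/sougou_sample.csv

# awk counts occurrences of the query word (field 3), like GROUP BY keyword;
# sort -rn orders by count descending, like ORDER BY count DESC.
awk -F',' '{count[$3]++} END {for (k in count) print count[k], k}' /tmp/sougou_sample.csv | sort -rn > /tmp/sougou_results.txt
head /tmp/sougou_results.txt
```

head here plays the role of LIMIT 10 in the Hive query.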

       

posted @ 2017-04-05 15:46  Mr.xiaobai丶