hive_action


pdf469

【A tool that does not access data through MapReduce directly; queries are converted into MapReduce jobs indirectly】

https://en.wikipedia.org/wiki/Apache_Hive

Apache Hive supports analysis of large datasets stored in Hadoop's HDFS and compatible file systems such as Amazon S3 filesystem.

It provides an SQL-like query language called HiveQL[7] with schema on read and transparently converts queries to MapReduce, Apache Tez[8] and Spark jobs.
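To make the "query becomes a MapReduce job" idea concrete, here is a minimal sketch in plain Python of how a query such as `SELECT url, COUNT(*) FROM page_views GROUP BY url` could be evaluated in map, shuffle, and reduce phases. This is only an illustration of the general pattern, not Hive's actual planner or generated code; the table `page_views`, its rows, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows of a table page_views(user_id, url); not real data.
ROWS = [
    ("u1", "/home"),
    ("u2", "/home"),
    ("u1", "/cart"),
    ("u3", "/home"),
]


def map_phase(rows):
    """Emit one (key, 1) pair per row; the key is the GROUP BY column."""
    for _user_id, url in rows:
        yield url, 1


def shuffle_phase(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Aggregate each key's values; summing the 1s implements COUNT(*)."""
    return {key: sum(values) for key, values in groups.items()}


def run_group_by_count(rows):
    """Chain the three phases, mimicking a single MapReduce job."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    print(run_group_by_count(ROWS))
```

With HiveQL the user writes only the one-line query; phases like these are produced by the engine rather than coded by hand.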

 

【Data warehouse tool】

Apache Hive is a data warehouse software project built on top of Apache Hadoop for providing data summarization, query, and analysis.[2] Hive gives an SQL-like interface to query data stored in various databases and file systems that integrate with Hadoop. Traditional SQL queries must be implemented in the MapReduce Java API to execute SQL applications and queries over distributed data. Hive provides the necessary SQL abstraction to integrate SQL-like queries (HiveQL) into the underlying Java without the need to implement queries in the low-level Java API. Since most data warehousing applications work with SQL-based querying languages, Hive aids portability of SQL-based applications to Hadoop.[3]

 

posted @ 2017-05-07 09:37  papering