hive-HS2-Overview

HiveServer2

Reposted from: https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Overview

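HiveServer2 (HS2) is a service that lets clients execute Hive commands. It supports concurrent, multi-threaded access from JDBC, ODBC, and Beeline clients, over either TCP or HTTP.
Its core is a Thrift service, which is organized into four layers: Transport, Protocol, Processor, and Server.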
https://thrift.apache.org/docs/concepts
The following describes how HS2 uses each of the four layers.
Server
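TThreadPoolServer is used in TCP mode.
Jetty is used in HTTP mode.
TThreadPoolServer allocates one worker thread per TCP connection, and the thread stays bound to its connection even while the connection is idle, so many concurrent connections mean many threads. In the future HS2 should move to TThreadedSelectorServer to improve concurrency.
A performance comparison of the Thrift Java servers: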
https://github.com/m1ch1/mapkeeper/wiki/Thrift-Java-Servers-Compared
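The cost of the thread-per-connection model can be seen in a small simulation. This is only a sketch under stated assumptions: it uses a plain java.util.concurrent thread pool rather than Thrift's TThreadPoolServer, and the class and method names are hypothetical. Each simulated connection parks its worker until the connection is closed, so a pool of 2 workers can attach only 2 of 5 idle connections.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of "one worker thread per connection" (hypothetical names, not Thrift code).
public class ThreadPerConnectionSketch {

    /**
     * Opens `connections` simulated idle connections against a pool of `workers`
     * threads and returns how many connections obtained a worker while all of
     * them were still open.
     */
    public static int connectionsServedWhileIdle(int workers, int connections) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        AtomicInteger attached = new AtomicInteger();
        // Wait until every connection that can get a worker has got one.
        CountDownLatch started = new CountDownLatch(Math.min(workers, connections));
        CountDownLatch closeAll = new CountDownLatch(1);

        for (int i = 0; i < connections; i++) {
            pool.execute(() -> {
                attached.incrementAndGet();
                started.countDown();
                try {
                    // Idle connection: the worker is parked here, not returned to the pool.
                    closeAll.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            started.await(5, TimeUnit.SECONDS);
            return attached.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return attached.get();
        } finally {
            closeAll.countDown(); // "close" all connections so the workers can exit
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 5 idle connections, 2 workers: only 2 connections are attached to a worker.
        System.out.println(connectionsServedWhileIdle(2, 5)); // prints 2
    }
}
```

A selector-based server avoids this by letting a small number of threads watch many connections and handing work to a worker only when a request actually arrives.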

Transport
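HTTP mode is required when a proxy sits between the client and the server, for example for load balancing or for security reasons.
The transport mode is set through the Hive configuration property hive.server2.transport.mode: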
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.server2.transport.mode
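On the client side, the transport mode shows up in the JDBC connection URL. The sketch below builds a URL for each mode. The `transportMode` and `httpPath` URL parameters, the ports 10000 and 10001, and the path `cliservice` come from the Hive client documentation rather than from this text, and the host name and the helper class are hypothetical.

```java
// Hypothetical helper that assembles HiveServer2 JDBC URLs for both transport modes.
public class Hs2JdbcUrl {

    /** URL for binary (TCP) mode: jdbc:hive2://host:port/db */
    public static String binaryUrl(String host, int port, String db) {
        return "jdbc:hive2://" + host + ":" + port + "/" + db;
    }

    /** URL for HTTP mode: the same base URL plus transportMode and httpPath. */
    public static String httpUrl(String host, int port, String db, String httpPath) {
        return binaryUrl(host, port, db) + ";transportMode=http;httpPath=" + httpPath;
    }

    public static void main(String[] args) {
        // hs2.example.com is a placeholder host.
        System.out.println(binaryUrl("hs2.example.com", 10000, "default"));
        System.out.println(httpUrl("hs2.example.com", 10001, "default", "cliservice"));
    }
}
```

The server must be running in the matching mode (hive.server2.transport.mode set to binary or http) for the connection to succeed.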

Protocol
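The protocol layer handles serialization and deserialization. HS2 currently uses TBinaryProtocol as its Thrift serialization protocol.
Other protocols, such as TCompactProtocol, may be added in the future, pending further performance evaluation.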
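To give a feel for what such an evaluation would weigh, the sketch below encodes a 32-bit integer in two ways: as a fixed 4-byte big-endian value, which is how a binary protocol writes an i32, and as a ZigZag varint, which is the idea behind compact encodings. It is an illustration of the encoding idea only. It does not use Thrift's classes, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Illustration of fixed-width vs. variable-length integer encoding (not Thrift code).
public class IntEncodingSketch {

    /** Fixed-width encoding: always 4 bytes, big-endian. */
    public static byte[] fixed32(int n) {
        return ByteBuffer.allocate(4).putInt(n).array();
    }

    /** ZigZag + varint encoding: 1 to 5 bytes depending on magnitude. */
    public static byte[] zigzagVarint(int n) {
        // ZigZag maps small negative numbers to small unsigned ones: -1 -> 1, 1 -> 2, ...
        int z = (n << 1) ^ (n >> 31);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Emit 7 bits per byte, low bits first; the high bit marks "more bytes follow".
        while ((z & ~0x7F) != 0) {
            out.write((z & 0x7F) | 0x80);
            z >>>= 7;
        }
        out.write(z);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        int[] samples = {1, -1, 300, Integer.MAX_VALUE};
        for (int n : samples) {
            System.out.println(n + ": fixed=" + fixed32(n).length
                    + " bytes, varint=" + zigzagVarint(n).length + " bytes");
        }
    }
}
```

Small values shrink from 4 bytes to 1 or 2, while very large values grow to 5 bytes, and the variable-length form costs extra CPU work per field. That trade-off is why a switch needs measurement rather than assumption.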

Processor
The processor is the application logic that handles requests. For example, the ThriftCLIService.ExecuteStatement() method compiles and executes a Hive query.

Dependencies of HS2

Metastore
Hadoop cluster
The interaction between HS2 and its dependencies is described here:
https://cwiki.apache.org/confluence/display/Hive/Design#Design-HiveArchitecture

JDBC Client

JDBC is the recommended client interface; Hue uses the Thrift client directly.
The sequence of API calls is as follows:

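  • The client creates a HiveConnection and calls the OpenSession API to obtain a SessionHandle. The session itself is created on the server side.
  • When a HiveStatement is executed, the Thrift client calls ExecuteStatement. The SessionHandle is sent along with the query in this API call.
  • The HS2 server receives the request and asks the driver to process and execute the query. The driver starts a background job that interacts with Hadoop and then returns a response to the client, so the call is asynchronous. The response carries an OperationHandle created by the server.
  • The client uses the OperationHandle in its subsequent calls to HS2 for that query, for example to poll the status of the execution.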
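The steps above can be sketched as a toy in-memory model. This is not Hive code: ToyServer and its methods are hypothetical stand-ins for the OpenSession and ExecuteStatement calls, handles are plain strings, and the "query" just echoes its text. What it does show is the shape of the flow: a session handle first, then an asynchronous execute that immediately returns an operation handle, then polling on that handle.

```java
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the session/operation handle flow (all names are hypothetical).
public class HandleFlowSketch {

    static class ToyServer {
        private final Set<String> sessions = ConcurrentHashMap.newKeySet();
        private final Map<String, CompletableFuture<String>> operations = new ConcurrentHashMap<>();

        /** Step 1: the server creates the session and hands back a handle. */
        String openSession() {
            String sessionHandle = UUID.randomUUID().toString();
            sessions.add(sessionHandle);
            return sessionHandle;
        }

        /** Steps 2-3: start the query in the background and return an operation handle at once. */
        String executeStatement(String sessionHandle, String query) {
            if (!sessions.contains(sessionHandle)) {
                throw new IllegalArgumentException("unknown session handle");
            }
            String operationHandle = UUID.randomUUID().toString();
            operations.put(operationHandle,
                    CompletableFuture.supplyAsync(() -> "result of: " + query));
            return operationHandle;
        }

        /** Step 4: the client polls with the operation handle. */
        boolean isFinished(String operationHandle) {
            return operations.get(operationHandle).isDone();
        }

        String fetchResult(String operationHandle) {
            return operations.get(operationHandle).join();
        }
    }

    /** Runs the whole client-side sequence against a fresh toy server. */
    public static String run(String query) {
        ToyServer server = new ToyServer();
        String session = server.openSession();
        String operation = server.executeStatement(session, query);
        while (!server.isFinished(operation)) {
            Thread.yield(); // poll until the background job completes
        }
        return server.fetchResult(operation);
    }

    public static void main(String[] args) {
        System.out.println(run("SELECT 1")); // prints "result of: SELECT 1"
    }
}
```

With the real JDBC driver, none of these calls are made by hand: they happen inside HiveConnection and HiveStatement when application code uses the standard java.sql interfaces.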

Source Code Description

Server Side
Thrift IDL file for TCLIService
https://github.com/apache/hive/blob/master/service-rpc/if/TCLIService.thrift
ThriftCLIService subclass, followed by other key server-side classes:
org.apache.hive.service.cli.thrift.EmbeddedThriftBinaryCLIService
org.apache.hive.service.cli.session.HiveSessionImpl
org.apache.hive.service.cli.operation.Operation
org.apache.hive.service.auth.HiveAuthFactory

Client Side
org.apache.hive.jdbc.HiveConnection
org.apache.hive.jdbc.HiveStatement
org.apache.hive.jdbc.HiveDriver

Interaction between Client and Server
org.apache.hive.service.cli.SessionHandle
org.apache.hive.service.cli.OperationHandle

Resources

How to set up HS2:
https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2
HS2 clients:
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients
HS2 web ui:
https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-WebUIforHiveServer2
Metrics
https://cwiki.apache.org/confluence/display/Hive/Hive+Metrics
Cloudera blog on HS2:
http://blog.cloudera.com/blog/2013/07/how-hiveserver2-brings-security-and-concurrency-to-apache-hive/

posted @ 2017-01-06 10:45  zhangshihai1232