SparkThriftServer Source Code Analysis

Version

Spark 2.2.0

Starting point

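Spark Thrift Server reuses the HiveServer2 source code and plugs in its own overriding methods.
Following the whole flow therefore means switching back and forth between the Hive and Spark sources.
The flow starts at Beeline, which belongs to the Hive code base. The walkthrough begins here: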

Client: Beeline

Jar: hive-beeline-1.2.1.spark2.jar
Spark JDBC uses Beeline as its client to send requests and interact with the Spark service.
Beeline's entry point:

// Location: src\java\org\apache\hive\beeline\BeeLine.java
  public static void main(String[] args) throws IOException {
    mainWithInputRedirection(args, null);
  }
  public static void mainWithInputRedirection(String[] args, InputStream inputStream)
    throws IOException
  {
    BeeLine beeLine = new BeeLine();
    int status = beeLine.begin(args, inputStream);
    if (!Boolean.getBoolean("beeline.system.exit")) {
      System.exit(status);
    }
  }

beeLine.begin parses the arguments passed in and calls the execute method:

  public int begin(String[] args, InputStream inputStream)
    throws IOException
  {
    try
    {
      getOpts().load();
    }
    catch (Exception e) {}
    try
    {
      int code = initArgs(args);
      int i;
      if (code != 0) {
        return code;
      }
      if (getOpts().getScriptFile() != null) {
        return executeFile(getOpts().getScriptFile());
      }
      try
      {
        info(getApplicationTitle());
      }
      catch (Exception e) {}
      ConsoleReader reader = getConsoleReader(inputStream);
      return execute(reader, false);
    }
    finally
    {
      close();
    }
  }
  

String line = getOpts().isSilent() ? reader.readLine(null, Character.valueOf('\000')) : reader.readLine(getPrompt());
if ((!dispatch(line)) && (exitOnError)) {
  return 2;
}

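execute reads the input stream and calls the dispatch method.
dispatch handles invalid input and help requests, then:
- if the line starts with !, it creates a CommandHandler to process it;
- otherwise it calls the sql method of Commands to process the SQL command; the connection is created by getConnection in org.apache.hive.beeline.DatabaseConnection:
  this.commands.sql(line, getOpts().getEntireLineAsCommand());
Executing SQL: Commands.execute calls the standard JDBC interface, stmnt.execute(sql); for the details, see the execute method of HiveStatement in the Hive-jdbc section below.
Handling the result set: Commands.execute also processes the result set, as follows: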

    if (hasResults) {
      do {
        ResultSet rs = stmnt.getResultSet();
        try {
          int count = this.beeLine.print(rs);
          long end = System.currentTimeMillis();

          this.beeLine.info(this.beeLine.loc("rows-selected", count) + " " + this.beeLine.locElapsedTime(end - start));
        } finally {
          if (logThread != null) {
            logThread.join(10000L);
            showRemainingLogsIfAny(stmnt);
            logThread = null;
          }
          rs.close();
        }
      } while (BeeLine.getMoreResults(stmnt));
    } else {
      int count = stmnt.getUpdateCount();
      long end = System.currentTimeMillis();
      this.beeLine.info(this.beeLine.loc("rows-affected", count) + " " + this.beeLine.locElapsedTime(end - start));
    }
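
The routing step inside dispatch can be illustrated with a small, self-contained sketch. This is not BeeLine's actual code: the class DispatchSketch and its handled list are hypothetical names, and the real method also deals with help requests, comments, and command lookup. It only shows the split between lines that start with ! and lines treated as SQL.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of BeeLine-style line routing; not the real implementation.
class DispatchSketch {
    // Records how each line was routed so the behaviour is observable.
    final List<String> handled = new ArrayList<>();

    // Returns true when the line was handled without error.
    boolean dispatch(String line) {
        if (line == null || line.trim().isEmpty()) {
            return true; // nothing to do for empty input
        }
        String trimmed = line.trim();
        if (trimmed.startsWith("!")) {
            // A client-side command such as !connect or !quit
            handled.add("command:" + trimmed.substring(1));
            return true;
        }
        // Everything else is treated as SQL and would go to the JDBC statement
        handled.add("sql:" + trimmed);
        return true;
    }

    public static void main(String[] args) {
        DispatchSketch d = new DispatchSketch();
        d.dispatch("!connect jdbc:hive2://localhost:10000");
        d.dispatch("select 1");
        System.out.println(d.handled);
    }
}
```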

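Server side

Spark JDBC is built on the Thrift framework to provide an RPC service, and it reuses a large amount of HiveServer code.
hive-jdbc implements the standard JDBC interfaces by wrapping the RPC client requests and the result-set handling.
SparkThrift implements the actual computation.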

Hive-jdbc

TCLIService.Iface client requests

Cancel request

TCancelOperationResp cancelResp = this.client.CancelOperation(cancelReq);

Close request

    TCloseOperationReq closeReq = new TCloseOperationReq(this.stmtHandle);
    TCloseOperationResp closeResp = this.client.CloseOperation(closeReq);

Execute a query

TExecuteStatementResp execResp = this.client.ExecuteStatement(execReq);

Check execution status

statusResp = this.client.GetOperationStatus(statusReq);

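Flow

Jar: hive-jdbc-1.2.1.spark2.jar
Hive JDBC is implemented on Thrift (the RPC framework open-sourced by Facebook). client (a TCLIService.Iface) is the RPC client, which remotely calls the TCLIService service inside HiveServer.
The execute method of HiveStatement works as follows:
- Execute the query: TExecuteStatementResp execResp = this.client.ExecuteStatement(execReq);
- Check operationComplete: while the operation has not finished, keep calling client.GetOperationStatus to get the server-side execution state
- Inspect the returned state: FINISHED_STATE (or CLOSED_STATE) means execution has ended. Then check whether a result set exists, and if so, build it
- Build the result set: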
```
this.resultSet = new HiveQueryResultSet
    .Builder(this)
    .setClient(this.client)
    .setSessionHandle(this.sessHandle)
    .setStmtHandle(this.stmtHandle)
    .setMaxRows(this.maxRows)
    .setFetchSize(this.fetchSize)
    .setScrollable(this.isScrollableResultset)
    .setTransportLock(this.transportLock)
    .build();
```
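
The submit-then-poll pattern in the steps above can be sketched independently of Thrift. The following is a minimal illustration under stated assumptions: StatusPollSketch, its State enum, and the Supplier standing in for client.GetOperationStatus are all hypothetical, and the real client also takes a transport lock around each call.

```java
import java.sql.SQLException;
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

// Sketch of a client that polls an operation's state until it is terminal.
class StatusPollSketch {
    // Stand-in for the Thrift operation states.
    enum State { RUNNING, FINISHED, CLOSED, CANCELED, ERROR }

    // Polls until a terminal state is seen; returns the number of polls made.
    static int waitForCompletion(Supplier<State> statusSource) throws SQLException {
        int polls = 0;
        boolean complete = false;
        while (!complete) {
            State s = statusSource.get();
            polls++;
            switch (s) {
                case FINISHED:
                case CLOSED:
                    complete = true;
                    break;
                case CANCELED:
                    throw new SQLException("Query was cancelled", "01000");
                case ERROR:
                    throw new SQLException("Query failed", "HY000");
                default:
                    break; // still running, poll again
            }
        }
        return polls;
    }

    public static void main(String[] args) throws SQLException {
        Iterator<State> states =
            Arrays.asList(State.RUNNING, State.RUNNING, State.FINISHED).iterator();
        System.out.println(waitForCompletion(states::next)); // prints 3
    }
}
```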
The execute code:

public boolean execute(String sql)
    throws SQLException
  {
	checkConnection("execute");
	closeClientOperation();
	initFlags();
	TExecuteStatementReq execReq = new TExecuteStatementReq(this.sessHandle, sql);
	execReq.setRunAsync(true);
	execReq.setConfOverlay(this.sessConf);
	this.transportLock.lock();
	try
	    {
		TExecuteStatementResp execResp = this.client.ExecuteStatement(execReq);
		Utils.verifySuccessWithInfo(execResp.getStatus());
		this.stmtHandle = execResp.getOperationHandle();
		this.isExecuteStatementFailed = false;
	}
	catch (SQLException eS)
	    {
		this.isExecuteStatementFailed = true;
		throw eS;
	}
	catch (Exception ex)
	    {
		this.isExecuteStatementFailed = true;
		throw new SQLException(ex.toString(), "08S01", ex);
	}
	finally
	    {
		this.transportLock.unlock();
	}
	TGetOperationStatusReq statusReq = new TGetOperationStatusReq(this.stmtHandle);
	boolean operationComplete = false;
	while (!operationComplete) {
		try
		      {
			this.transportLock.lock();
			TGetOperationStatusResp statusResp;
			try
			        {
				statusResp = this.client.GetOperationStatus(statusReq);
			}
			finally
			        {
				this.transportLock.unlock();
			}
			Utils.verifySuccessWithInfo(statusResp.getStatus());
			if (statusResp.isSetOperationState()) {
				switch (statusResp.getOperationState()) {
					case CLOSED_STATE:
					case FINISHED_STATE:
						operationComplete = true;
						break;
					case CANCELED_STATE:
						throw new SQLException("Query was cancelled", "01000");
					case ERROR_STATE:
						throw new SQLException(statusResp.getErrorMessage(), statusResp.getSqlState(), statusResp.getErrorCode());
					case UKNOWN_STATE:
						throw new SQLException("Unknown query", "HY000");
					default:
						// INITIALIZED_STATE, PENDING_STATE, RUNNING_STATE: keep polling
						break;
				}
			}
		}
		catch (SQLException e)
		      {
			this.isLogBeingGenerated = false;
			throw e;
		}
		catch (Exception e)
		      {
			this.isLogBeingGenerated = false;
			throw new SQLException(e.toString(), "08S01", e);
		}
	}
	this.isLogBeingGenerated = false;
	if (!this.stmtHandle.isHasResultSet()) {
		return false;
	}
	this.resultSet = new HiveQueryResultSet.Builder(this).setClient(this.client).setSessionHandle(this.sessHandle).setStmtHandle(this.stmtHandle).setMaxRows(this.maxRows).setFetchSize(this.fetchSize).setScrollable(this.isScrollableResultset).setTransportLock(this.transportLock).build();
	return true;
}

SparkThrift

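The SparkThrift server starts two services: SparkSQLCLIService and ThriftHttpCLIService (or ThriftBinaryCLIService).
ThriftHttpCLIService is the channel for RPC calls.
SparkSQLCLIService is the service that actually handles client requests.

Main entry point: HiveThriftServer2

It calls SparkSQLEnv.init() to create the SparkSession, SparkConf, and so on.

init

init creates a SparkSQLCLIService, sets it as cliService via reflection, and adds it to the parent class's serviceList with addService. It registers the Thrift transport service the same way and then calls initCompositeService.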

  public void init(HiveConf hiveConf)
  {
    SparkSQLCLIService sparkSqlCliService = new SparkSQLCLIService(this, this.sqlContext);
    ReflectionUtils$.MODULE$.setSuperField(this, "cliService", sparkSqlCliService);
    addService(sparkSqlCliService);
    
    ThriftCLIService thriftCliService = isHTTPTransportMode(hiveConf) ?
      new ThriftHttpCLIService(sparkSqlCliService) :
      new ThriftBinaryCLIService(sparkSqlCliService);

    ReflectionUtils$.MODULE$.setSuperField(this, "thriftCLIService", thriftCliService);
    addService(thriftCliService);
    initCompositeService(hiveConf);
  }

initCompositeService is provided by the ReflectedCompositeService trait:

private[thriftserver] trait ReflectedCompositeService { this: AbstractService =>
  def initCompositeService(hiveConf: HiveConf) {
    // Emulating `CompositeService.init(hiveConf)`
    val serviceList = getAncestorField[JList[Service]](this, 2, "serviceList")
    serviceList.asScala.foreach(_.init(hiveConf))

    // Emulating `AbstractService.init(hiveConf)`
    invoke(classOf[AbstractService], this, "ensureCurrentState", classOf[STATE] -> STATE.NOTINITED)
    setAncestorField(this, 3, "hiveConf", hiveConf)
    invoke(classOf[AbstractService], this, "changeState", classOf[STATE] -> STATE.INITED)
    getAncestorField[Log](this, 3, "LOG").info(s"Service: $getName is inited.")
  }
}

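Through reflection, it obtains the serviceList field of the ancestor class and calls init once on every member of that list.
Because ThriftHttpCLIService and SparkSQLCLIService have already been added to the list, their init methods are called here.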
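
The init cascade can be pictured without reflection. The sketch below is a simplified model, not Hive's CompositeService: the nested Service and Composite classes and the log list are hypothetical, and only the ordering (children first, then the composite itself) is meant to match the description.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal model of a composite service that initializes its children first.
class CompositeInitSketch {
    enum State { NOTINITED, INITED }

    static class Service {
        final String name;
        final List<String> log; // shared log recording the init order
        State state = State.NOTINITED;

        Service(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        void init() {
            state = State.INITED;
            log.add(name);
        }
    }

    static class Composite extends Service {
        final List<Service> serviceList = new ArrayList<>();

        Composite(String name, List<String> log) {
            super(name, log);
        }

        void addService(Service s) {
            serviceList.add(s);
        }

        @Override
        void init() {
            for (Service s : serviceList) {
                s.init();     // children first
            }
            super.init();     // then this service changes state
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Composite server = new Composite("HiveThriftServer2", log);
        Composite cli = new Composite("SparkSQLCLIService", log);
        cli.addService(new Service("SparkSQLSessionManager", log));
        server.addService(cli);
        server.addService(new Service("ThriftBinaryCLIService", log));
        server.init();
        System.out.println(log);
    }
}
```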

A HiveThriftServer2Listener is added to the SparkContext. It also extends SparkListener and records the time, state, and other details of every SQL statement.

    listener = new HiveThriftServer2Listener(server, SparkSQLEnv.sqlContext.conf)
    SparkSQLEnv.sparkContext.addSparkListener(listener)
    
HiveThriftServer2Listener implements the following callbacks:
- onJobStart
- onSessionCreated
- onSessionClosed
- onStatementStart
- onStatementParsed
- onStatementError
- onStatementFinish
It records this information mainly in sessionList and executionList.
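
To make the bookkeeping concrete, here is a hedged sketch of how such a listener could track statements. It is not Spark's HiveThriftServer2Listener: StatementListenerSketch, ExecutionInfo, the string state names, and the explicit timestamp parameters are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a statement-tracking listener.
class StatementListenerSketch {
    static class ExecutionInfo {
        final String statement;
        final long startTimestamp;
        long finishTimestamp;
        String state = "STARTED";
        String detail = "";

        ExecutionInfo(String statement, long startTimestamp) {
            this.statement = statement;
            this.startTimestamp = startTimestamp;
        }

        long totalTime() {
            return finishTimestamp - startTimestamp;
        }
    }

    // Keyed by statement id, kept in insertion order.
    final Map<String, ExecutionInfo> executionList = new LinkedHashMap<>();

    synchronized void onStatementStart(String id, String statement, long now) {
        executionList.put(id, new ExecutionInfo(statement, now));
    }

    synchronized void onStatementParsed(String id, String executionPlan) {
        ExecutionInfo info = executionList.get(id);
        info.detail = executionPlan;
        info.state = "COMPILED";
    }

    synchronized void onStatementError(String id, String errorMessage, long now) {
        ExecutionInfo info = executionList.get(id);
        info.finishTimestamp = now;
        info.detail = errorMessage;
        info.state = "FAILED";
    }

    synchronized void onStatementFinish(String id, long now) {
        ExecutionInfo info = executionList.get(id);
        info.finishTimestamp = now;
        info.state = "FINISHED";
    }

    public static void main(String[] args) {
        StatementListenerSketch listener = new StatementListenerSketch();
        listener.onStatementStart("s1", "select 1", 1000L);
        listener.onStatementParsed("s1", "== Parsed Logical Plan ==");
        listener.onStatementFinish("s1", 1250L);
        ExecutionInfo info = listener.executionList.get("s1");
        System.out.println(info.state + " in " + info.totalTime() + " ms");
    }
}
```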

After it starts, the thrift server registers a tab named "JDBC/ODBC Server" in the Spark UI.


      uiTab = if (SparkSQLEnv.sparkContext.getConf.getBoolean("spark.ui.enabled", true)) {
        Some(new ThriftServerTab(SparkSQLEnv.sparkContext))
      } else {
        None
      }

ThriftHttpCLIService/ThriftBinaryCLIService

Wraps SparkSQLCLIService.
Exposes the service over HTTP or TCP.

ThriftHttpCLIService

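Spark does not implement this class; it reuses the Hive source as-is, so look directly at the start method of Hive's ThriftHttpCLIService. Because ThriftHttpCLIService does not implement start itself, follow it into the parent class: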

// Location: hive/hive-1.1.0-cdh5.7.0/service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java
  @Override
  public synchronized void start() {
    super.start();
    if (!isStarted && !isEmbedded) {
      new Thread(this).start();
      isStarted = true;
    }
  }

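This method is simple. It first calls the parent's start method; the parent here is again AbstractService, so this moves the service state from INITED to STARTED. It then starts a thread that wraps the service itself, and that thread calls the run method of ThriftHttpCLIService: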

 @Override
  public void run() {
    try {
      ...
      // HTTP Server
      httpServer = new org.eclipse.jetty.server.Server(threadPool);

      TProcessor processor = new TCLIService.Processor<Iface>(this);
      TProtocolFactory protocolFactory = new TBinaryProtocol.Factory();
      
      // Configure the servlet; the main logic lives in the processor
      TServlet thriftHttpServlet = new ThriftHttpServlet(processor, protocolFactory, authType,
          serviceUGI, httpUGI);
      context.addServlet(new ServletHolder(thriftHttpServlet), httpPath);

      httpServer.join();
    } catch (Throwable t) {
      LOG.fatal(
          "Error starting HiveServer2: could not start "
              + ThriftHttpCLIService.class.getSimpleName(), t);
      System.exit(-1);
    }
  }

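This method starts an HTTP service through Jetty and configures a ThriftHttpServlet to handle client requests.
Note the processor object in this code: it carries the main processing logic of the Jetty service. Following it further: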

    protected Processor(I iface, Map<String,  org.apache.thrift.ProcessFunction<I, ? extends  org.apache.thrift.TBase>> processMap) {
      super(iface, getProcessMap(processMap));
    }

getProcessMap registers the function that handles each kind of request:

    private static <I extends Iface> Map<String,  org.apache.thrift.ProcessFunction<I, ? extends  org.apache.thrift.TBase>> getProcessMap(Map<String,  org.apache.thrift.ProcessFunction<I, ? extends  org.apache.thrift.TBase>> processMap) {
      processMap.put("OpenSession", new OpenSession());
      processMap.put("CloseSession", new CloseSession());
      processMap.put("GetInfo", new GetInfo());
      processMap.put("ExecuteStatement", new ExecuteStatement());
      processMap.put("GetTypeInfo", new GetTypeInfo());
      processMap.put("GetCatalogs", new GetCatalogs());
      processMap.put("GetSchemas", new GetSchemas());
      processMap.put("GetTables", new GetTables());
      processMap.put("GetTableTypes", new GetTableTypes());
      processMap.put("GetColumns", new GetColumns());
      processMap.put("GetFunctions", new GetFunctions());
      processMap.put("GetOperationStatus", new GetOperationStatus());
      processMap.put("CancelOperation", new CancelOperation());
      processMap.put("CloseOperation", new CloseOperation());
      processMap.put("GetResultSetMetadata", new GetResultSetMetadata());
      processMap.put("FetchResults", new FetchResults());
      processMap.put("GetDelegationToken", new GetDelegationToken());
      processMap.put("CancelDelegationToken", new CancelDelegationToken());
      processMap.put("RenewDelegationToken", new RenewDelegationToken());
      return processMap;
    }
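
The idea behind the process map is name-based dispatch: the method name carried in the request selects a handler. The sketch below illustrates this with plain Java functions. ProcessMapSketch and its string-based handlers are hypothetical and do not use the Thrift ProcessFunction API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Name-based dispatch in the spirit of a Thrift process map (illustrative only).
class ProcessMapSketch {
    private final Map<String, Function<String, String>> processMap = new HashMap<>();

    ProcessMapSketch() {
        // Each entry maps an RPC method name to the function that serves it.
        processMap.put("OpenSession", req -> "session-opened");
        processMap.put("ExecuteStatement", req -> "executed:" + req);
        processMap.put("CloseSession", req -> "session-closed");
    }

    String process(String method, String request) {
        Function<String, String> fn = processMap.get(method);
        if (fn == null) {
            throw new IllegalArgumentException("Invalid method name: '" + method + "'");
        }
        return fn.apply(request);
    }

    public static void main(String[] args) {
        ProcessMapSketch processor = new ProcessMapSketch();
        System.out.println(processor.process("ExecuteStatement", "select 1"));
    }
}
```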

Take ExecuteStatement as an example; the interface is implemented in CLIService:

  @Override
  public OperationHandle executeStatement(SessionHandle sessionHandle, String statement,
      Map<String, String> confOverlay)
          throws HiveSQLException {
    OperationHandle opHandle = sessionManager.getSession(sessionHandle)
        .executeStatement(statement, confOverlay);
    LOG.debug(sessionHandle + ": executeStatement()");
    return opHandle;
  }

  @Override
  public OperationHandle executeStatementAsync(SessionHandle sessionHandle, String statement,
      Map<String, String> confOverlay) throws HiveSQLException {
    OperationHandle opHandle = sessionManager.getSession(sessionHandle)
        .executeStatementAsync(statement, confOverlay);
    LOG.debug(sessionHandle + ": executeStatementAsync()");
    return opHandle;
  }

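As this shows, changing the execution layer requires changing sessionManager and the operation behind the OperationHandle.

Summary

To reuse the Hive JDBC service, Spark has to do the following:

Override the operation behind the OperationHandle so that execution is handed to Spark SQL.
Override sessionManager so that the OperationHandle it returns refers to the Spark operation.
Override CLIService as SparkSQLCLIService so that Spark-related configuration reaches the execution classes, and so that the overridden sparkSqlSessionManager is added to serviceList.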
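
The override chain in this summary boils down to replacing a factory method. The following sketch borrows the real class names for readability, but every body is a hypothetical stand-in: the real classes take sessions, configuration overlays, and a SQLContext, none of which appear here.

```java
// Illustrative stand-ins showing how overriding a factory method swaps the execution layer.
class OverrideChainSketch {
    static class ExecuteStatementOperation {
        final String statement;

        ExecuteStatementOperation(String statement) {
            this.statement = statement;
        }

        String run() {
            return "hive:" + statement;
        }
    }

    static class SparkExecuteStatementOperation extends ExecuteStatementOperation {
        SparkExecuteStatementOperation(String statement) {
            super(statement);
        }

        @Override
        String run() {
            return "spark:" + statement; // execution is handed to a different engine
        }
    }

    static class OperationManager {
        ExecuteStatementOperation newExecuteStatementOperation(String statement) {
            return new ExecuteStatementOperation(statement);
        }
    }

    static class SparkSQLOperationManager extends OperationManager {
        @Override
        ExecuteStatementOperation newExecuteStatementOperation(String statement) {
            return new SparkExecuteStatementOperation(statement);
        }
    }

    // The session code is unchanged; only the injected manager differs.
    static class Session {
        private final OperationManager operationManager;

        Session(OperationManager operationManager) {
            this.operationManager = operationManager;
        }

        String executeStatement(String statement) {
            return operationManager.newExecuteStatementOperation(statement).run();
        }
    }

    public static void main(String[] args) {
        System.out.println(new Session(new OperationManager()).executeStatement("select 1"));
        System.out.println(new Session(new SparkSQLOperationManager()).executeStatement("select 1"));
    }
}
```

The point of the design is that the session and CLI layers never need to know which engine runs the statement; swapping the manager is enough.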

SparkSQLCLIService

Extends CLIService.

SparkSQLCLIService

The init code:

  override def init(hiveConf: HiveConf) {
    setSuperField(this, "hiveConf", hiveConf)

    val sparkSqlSessionManager = new SparkSQLSessionManager(hiveServer, sqlContext)
    setSuperField(this, "sessionManager", sparkSqlSessionManager)
    addService(sparkSqlSessionManager)
    var sparkServiceUGI: UserGroupInformation = null

    if (UserGroupInformation.isSecurityEnabled) {
      try {
        HiveAuthFactory.loginFromKeytab(hiveConf)
        sparkServiceUGI = Utils.getUGI()
        setSuperField(this, "serviceUGI", sparkServiceUGI)
      } catch {
        case e @ (_: IOException | _: LoginException) =>
          throw new ServiceException("Unable to login to kerberos with given principal/keytab", e)
      }
    }

    initCompositeService(hiveConf)
  }

This creates a SparkSQLSessionManager instance, sets it on the parent class, and registers it as a service. initCompositeService then calls the init method of that SparkSQLSessionManager instance.

SparkSQLSessionManager

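When the session manager initializes, it registers a SparkSQLOperationManager, which is used to:
- manage the mapping between sessions and HiveContexts, so that the hc for a session can be looked up;
- replace Hive's OperationManager in managing the mapping between handles and operations.
The thrift server can be configured as single-session, where all Beeline clients share one HiveContext, or to start a new session per connection, where each session has its own HiveContext and therefore its own UDFs/UDAFs, temporary tables, and session state. Starting a new session is the default.
When a session is created, the mapping from the session to its HiveContext is added to the OperationManager's sessionToContexts map: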
```
sparkSqlOperationManager.sessionToContexts.put(sessionHandle, ctx)
```
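
The two session modes can be sketched as follows. This is an illustration only: SessionContextSketch, its Context class, and the singleSession flag are hypothetical simplifications, with string handles standing in for SessionHandle.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of mapping session handles to shared or isolated contexts.
class SessionContextSketch {
    // Stand-in for a SQL context holding per-session state such as temp views.
    static class Context {
        final Map<String, String> tempViews = new ConcurrentHashMap<>();

        Context newSession() {
            return new Context();
        }
    }

    final Context shared = new Context();
    final boolean singleSession;
    final Map<String, Context> sessionToContexts = new ConcurrentHashMap<>();

    SessionContextSketch(boolean singleSession) {
        this.singleSession = singleSession;
    }

    void openSession(String sessionHandle) {
        // Single-session mode reuses one context; otherwise each session gets its own.
        Context ctx = singleSession ? shared : shared.newSession();
        sessionToContexts.put(sessionHandle, ctx);
    }

    void closeSession(String sessionHandle) {
        sessionToContexts.remove(sessionHandle);
    }

    public static void main(String[] args) {
        SessionContextSketch isolated = new SessionContextSketch(false);
        isolated.openSession("s1");
        isolated.openSession("s2");
        isolated.sessionToContexts.get("s1").tempViews.put("t", "select 1");
        // s2 does not see the temporary view registered by s1
        System.out.println(isolated.sessionToContexts.get("s2").tempViews.containsKey("t"));
    }
}
```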

init

init creates a SparkSQLOperationManager object and then, through initCompositeService, calls that object's init method. Because SparkSQLOperationManager does not override init, follow it to the parent class OperationManager:

  private lazy val sparkSqlOperationManager = new SparkSQLOperationManager()

  override def init(hiveConf: HiveConf) {
    setSuperField(this, "hiveConf", hiveConf)

    // Create operation log root directory, if operation logging is enabled
    if (hiveConf.getBoolVar(ConfVars.HIVE_SERVER2_LOGGING_OPERATION_ENABLED)) {
      invoke(classOf[SessionManager], this, "initOperationLogRootDir")
    }

    val backgroundPoolSize = hiveConf.getIntVar(ConfVars.HIVE_SERVER2_ASYNC_EXEC_THREADS)
    setSuperField(this, "backgroundOperationPool", Executors.newFixedThreadPool(backgroundPoolSize))
    getAncestorField[Log](this, 3, "LOG").info(
      s"HiveServer2: Async execution pool size $backgroundPoolSize")

    setSuperField(this, "operationManager", sparkSqlOperationManager)
    addService(sparkSqlOperationManager)

    initCompositeService(hiveConf)
  }

OperationManager.init

 public synchronized void init(HiveConf hiveConf)
  {
    if (hiveConf.getBoolVar(HiveConf.ConfVars.HIVE_SERVER2_LOGGING_OPERATION_ENABLED)) {
      initOperationLogCapture(hiveConf.getVar(HiveConf.ConfVars.HIVE_SERVER2_LOGGING_OPERATION_LEVEL));
    } else {
      this.LOG.debug("Operation level logging is turned off");
    }
    super.init(hiveConf);
  }

execute

CLIService finds the Session by its handle and executes the statement:

  public OperationHandle executeStatement(SessionHandle sessionHandle, String statement,
  Map<String, String> confOverlay)
      throws HiveSQLException {
    OperationHandle opHandle = sessionManager.getSession(sessionHandle)
        .executeStatement(statement, confOverlay);
    LOG.debug(sessionHandle + ": executeStatement()");
    return opHandle;
  }

Every statement execution requests a new operation, and each one is added to the OperationManager for management. newExecuteStatementOperation is overridden by SparkSQLOperationManager.newExecuteStatementOperation, so the operation actually created is a SparkExecuteStatementOperation.

  @Override
  public OperationHandle executeStatement(String statement, Map<String, String> confOverlay)
      throws HiveSQLException {
    return executeStatementInternal(statement, confOverlay, false);
  }

  private OperationHandle executeStatementInternal(String statement, Map<String, String> confOverlay,
      boolean runAsync)
          throws HiveSQLException {
    acquire(true);

    OperationManager operationManager = getOperationManager();
    ExecuteStatementOperation operation = operationManager
        .newExecuteStatementOperation(getSession(), statement, confOverlay, runAsync);
    OperationHandle opHandle = operation.getHandle();
    try {
      // Call the operation's run method
      operation.run();
      opHandleSet.add(opHandle);
      return opHandle;
    } catch (HiveSQLException e) {
      // Referring to SQLOperation.java, there is no chance that a HiveSQLException is thrown and
      // the async background operation is submitted to the thread pool successfully at the same
      // time. So, clean up opHandle directly when a HiveSQLException is caught.
      operationManager.closeOperation(opHandle);
      throw e;
    } finally {
      release(true);
    }
  }

  // Implementation of run in Operation
  public void run() throws HiveSQLException {
    beforeRun();
    try {
      runInternal();
    } finally {
      afterRun();
    }
  }
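
The run method of Operation is an instance of the template-method pattern: the base class fixes the order of the hooks and subclasses supply runInternal. A minimal sketch, assuming a hypothetical OperationSketch class with a trace list added for observability:

```java
import java.util.ArrayList;
import java.util.List;

// Template-method sketch: run() fixes the hook order, subclasses fill in runInternal().
abstract class OperationSketch {
    final List<String> trace = new ArrayList<>();

    protected void beforeRun() {
        trace.add("beforeRun");
    }

    protected void afterRun() {
        trace.add("afterRun");
    }

    protected abstract void runInternal();

    public final void run() {
        beforeRun();
        try {
            runInternal();
        } finally {
            afterRun(); // always runs, even if runInternal throws
        }
    }

    public static void main(String[] args) {
        OperationSketch op = new OperationSketch() {
            @Override
            protected void runInternal() {
                trace.add("runInternal");
            }
        };
        op.run();
        System.out.println(op.trace);
    }
}
```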

The whole process is essentially the same as Hive's native flow. In the native flow, HiveServer2 creates a CLIService, which corresponds to SparkSQLCLIService in Spark. CLIService in turn creates a SessionManager, which corresponds to SparkSQLSessionManager. SessionManager then creates an OperationManager, which corresponds to SparkSQLOperationManager.

SparkExecuteStatementOperation

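As described above, an Operation runs in three steps: beforeRun, runInternal, and afterRun. beforeRun and afterRun handle logging, so go straight to runInternal.
runInternal is asynchronous by default, meaning it runs in the background: SparkExecuteStatementOperation creates a Runnable and submits it to backgroundOperationPool, which runs execute on a separate thread.
The core of execute is sqlContext.sql(statement):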

private def execute(): Unit = {
    statementId = UUID.randomUUID().toString
    logInfo(s"Running query '$statement' with $statementId")
    setState(OperationState.RUNNING)

  result = sqlContext.sql(statement)
  logDebug(result.queryExecution.toString())
  result.queryExecution.logical match {
    case SetCommand(Some((SQLConf.THRIFTSERVER_POOL.key, Some(value)))) =>
      sessionToActivePool.put(parentSession.getSessionHandle, value)
      logInfo(s"Setting spark.scheduler.pool=$value for future statements in this session.")
    case _ =>
  }
  HiveThriftServer2.listener.onStatementParsed(statementId, result.queryExecution.toString())
  iter = {
    if (sqlContext.getConf(SQLConf.THRIFTSERVER_INCREMENTAL_COLLECT.key).toBoolean) {
      resultList = None
      result.toLocalIterator.asScala
    } else {
      resultList = Some(result.collect())
      resultList.get.iterator
    }
  }
  dataTypes = result.queryExecution.analyzed.output.map(_.dataType).toArray

    setState(OperationState.FINISHED)
    HiveThriftServer2.listener.onStatementFinish(statementId)
  }
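
The background-execution pattern can be sketched with a plain ExecutorService. This is not SparkExecuteStatementOperation: AsyncOperationSketch, its State enum, and the Function standing in for sqlContext.sql are assumptions, and the sketch blocks on the Future only to keep the demo deterministic.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Sketch of an operation that executes on a background pool and tracks its state.
class AsyncOperationSketch {
    enum State { INITIALIZED, PENDING, RUNNING, FINISHED, ERROR }

    private volatile State state = State.INITIALIZED;
    private volatile String result;
    private final String statement;
    private final Function<String, String> engine; // stand-in for the SQL engine

    AsyncOperationSketch(String statement, Function<String, String> engine) {
        this.statement = statement;
        this.engine = engine;
    }

    State getState() {
        return state;
    }

    String getResult() {
        return result;
    }

    // Submits the work and returns immediately; the caller can poll getState().
    Future<?> runInternal(ExecutorService backgroundOperationPool) {
        state = State.PENDING;
        Runnable task = this::execute;
        return backgroundOperationPool.submit(task);
    }

    private void execute() {
        state = State.RUNNING;
        try {
            result = engine.apply(statement);
            state = State.FINISHED;
        } catch (RuntimeException e) {
            result = e.getMessage();
            state = State.ERROR;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            AsyncOperationSketch op =
                new AsyncOperationSketch("select 1", sql -> "result of " + sql);
            Future<?> handle = op.runInternal(pool);
            handle.get(); // a real client would poll the status instead of blocking
            System.out.println(op.getState() + ": " + op.getResult());
        } finally {
            pool.shutdown();
        }
    }
}
```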

Member variables

// Result DataFrame
private var result: DataFrame = _
// Collected result rows
private var resultList: Option[Array[SparkRow]] = _
// Result iterator
private var iter: Iterator[SparkRow] = _
// Column data types
private var dataTypes: Array[DataType] = _
// Statement ID, e.g. "61146141-2a0a-41ec-bce4-dd691a0fa63c"
private var statementId: String = _


// Result-set schema, used when getResultSetMetadata is called
  private lazy val resultSchema: TableSchema = {
    if (result == null || result.schema.isEmpty) {
      new TableSchema(Arrays.asList(new FieldSchema("Result", "string", "")))
    } else {
      logInfo(s"Result Schema: ${result.schema}")
      SparkExecuteStatementOperation.getTableSchema(result.schema)
    }
  }

execute

Run the query: sqlContext.sql(statement)
Store the result: private var result: DataFrame
Store the result iterator: private var iter: Iterator[SparkRow] = _
Result column types: dataTypes = result.queryExecution.analyzed.output.map(_.dataType).toArray

getNextRowSet

def getNextRowSet(order: FetchOrientation, maxRowsL: Long)
order: whether to fetch from the beginning; maxRowsL: the maximum number of rows to fetch.
Build the RowSet according to the result schema:

val resultRowSet: RowSet = RowSetFactory.create(getResultSetSchema, getProtocolVersion)

Parse the result set

resultRowSet.addRow(row.toArray.asInstanceOf[Array[Object]])

addRow here is the addRow of RowBasedSet,
which calls ColumnValue.toTColumnValue to convert each object according to its data type:

  public static TColumnValue toTColumnValue(Type type, Object value) {
    switch (type) {
    case BOOLEAN_TYPE:
      return booleanValue((Boolean)value);
    case TINYINT_TYPE:
      return byteValue((Byte)value);
    case SMALLINT_TYPE:
      return shortValue((Short)value);
    case INT_TYPE:
      return intValue((Integer)value);
    case BIGINT_TYPE:
      return longValue((Long)value);
    case FLOAT_TYPE:
      return floatValue((Float)value);
    case DOUBLE_TYPE:
      return doubleValue((Double)value);
    case STRING_TYPE:
      return stringValue((String)value);
    case CHAR_TYPE:
      return stringValue((HiveChar)value);
    case VARCHAR_TYPE:
      return stringValue((HiveVarchar)value);
    case DATE_TYPE:
      return dateValue((Date)value);
    case TIMESTAMP_TYPE:
      return timestampValue((Timestamp)value);
    case INTERVAL_YEAR_MONTH_TYPE:
      return stringValue((HiveIntervalYearMonth) value);
    case INTERVAL_DAY_TIME_TYPE:
      return stringValue((HiveIntervalDayTime) value);
    case DECIMAL_TYPE:
      return stringValue(((HiveDecimal)value));
    case BINARY_TYPE:
      return stringValue((String)value);
    case ARRAY_TYPE:
    case MAP_TYPE:
    case STRUCT_TYPE:
    case UNION_TYPE:
    case USER_DEFINED_TYPE:
      return stringValue((String)value);
    default:
      return null;
    }
  }
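
The batching behaviour of getNextRowSet (take at most maxRowsL rows from the iterator, optionally restarting from the first row) can be sketched on its own. RowBatchSketch is a hypothetical class that uses strings as rows and a boolean in place of FetchOrientation; it assumes the full result was collected into a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch of fetching a result in batches from a stateful iterator.
class RowBatchSketch {
    private final List<String> rows;   // stand-in for the collected result
    private Iterator<String> iter;     // position survives across calls

    RowBatchSketch(List<String> rows) {
        this.rows = rows;
        this.iter = rows.iterator();
    }

    // fetchFirst restarts from the first row; otherwise continue where we left off.
    List<String> getNextRowSet(boolean fetchFirst, long maxRows) {
        if (fetchFirst) {
            iter = rows.iterator();
        }
        List<String> batch = new ArrayList<>();
        long taken = 0;
        while (taken < maxRows && iter.hasNext()) {
            batch.add(iter.next());
            taken++;
        }
        return batch;
    }

    public static void main(String[] args) {
        RowBatchSketch op = new RowBatchSketch(Arrays.asList("r1", "r2", "r3"));
        System.out.println(op.getNextRowSet(false, 2)); // first two rows
        System.out.println(op.getNextRowSet(false, 2)); // the remaining row
        System.out.println(op.getNextRowSet(true, 2));  // restarted from the beginning
    }
}
```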

Overall startup call chain

CLIService (parent class: CompositeService)

// Location: hive/hive-1.1.0-cdh5.7.0/service/src/java/org/apache/hive/service/cli/CLIService.java
  @Override
  public synchronized void start() {
    super.start();
  }

CompositeService calls the start method of every service in serviceList, then calls the parent's start:

// Location: hive/hive-1.1.0-cdh5.7.0/service/src/java/org/apache/hive/service/CompositeService.java
  @Override
  public synchronized void start() {
    int i = 0;
    try {
      for (int n = serviceList.size(); i < n; i++) {
        Service service = serviceList.get(i);
        service.start();
      }
      super.start();
    } catch (Throwable e) {
      LOG.error("Error starting services " + getName(), e);
      // Note that the state of the failed service is still INITED and not
      // STARTED. Even though the last service is not started completely, still
      // call stop() on all services including failed service to make sure cleanup
      // happens.
      stop(i);
      throw new ServiceException("Failed to Start " + getName(), e);
    }
  }

SparkSQLCLIService's parent class is CLIService, and its serviceList contains SparkSQLSessionManager.
SparkSQLSessionManager's parent class is SessionManager, whose parent is CompositeService; its serviceList contains SparkSQLOperationManager.
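
The start-with-rollback behaviour of the composite start method can be sketched as below. This is a simplified model, not Hive's class: CompositeStartSketch, its Service interface, and the Recording helper are hypothetical, and failures are modelled as RuntimeException.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: start children in order; on failure stop the ones touched so far, in reverse.
class CompositeStartSketch {
    interface Service {
        void start();
        void stop();
    }

    // Helper that logs calls and can be told to fail on start.
    static class Recording implements Service {
        final String name;
        final List<String> log;
        final boolean failOnStart;

        Recording(String name, List<String> log, boolean failOnStart) {
            this.name = name;
            this.log = log;
            this.failOnStart = failOnStart;
        }

        public void start() {
            log.add("start " + name);
            if (failOnStart) {
                throw new RuntimeException(name + " failed");
            }
        }

        public void stop() {
            log.add("stop " + name);
        }
    }

    final List<Service> serviceList = new ArrayList<>();

    void start() {
        int i = 0;
        try {
            for (int n = serviceList.size(); i < n; i++) {
                serviceList.get(i).start();
            }
        } catch (RuntimeException e) {
            stop(i); // includes the service that failed, so it can clean up
            throw new IllegalStateException("Failed to start", e);
        }
    }

    private void stop(int lastIndex) {
        for (int i = lastIndex; i >= 0; i--) {
            serviceList.get(i).stop();
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        CompositeStartSketch composite = new CompositeStartSketch();
        composite.serviceList.add(new Recording("A", log, false));
        composite.serviceList.add(new Recording("B", log, true));
        composite.serviceList.add(new Recording("C", log, false));
        try {
            composite.start();
        } catch (IllegalStateException e) {
            System.out.println(log);
        }
    }
}
```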

Posted 2018-04-18 10:30 by bigbigtree
