16. Spark Streaming Source Code Walkthrough: The Data Cleanup Mechanism
eventLoop = new EventLoop[JobGeneratorEvent]("JobGenerator") {
  override protected def onReceive(event: JobGeneratorEvent): Unit = processEvent(event)

  override protected def onError(e: Throwable): Unit = {
    jobScheduler.reportError("Error in job generator", e)
  }
}
eventLoop.start()
The core logic lives in the processEvent(event) function:
/** Processes all events */
private def processEvent(event: JobGeneratorEvent) {
  logDebug("Got event " + event)
  event match {
    case GenerateJobs(time) => generateJobs(time)
    case ClearMetadata(time) => clearMetadata(time)
    case DoCheckpoint(time, clearCheckpointDataLater) =>
      doCheckpoint(time, clearCheckpointDataLater)
    case ClearCheckpointData(time) => clearCheckpointData(time)
  }
}
As shown, when JobGenerator receives ClearMetadata(time) or ClearCheckpointData(time), it performs the corresponding cleanup: clearMetadata(time) cleans up RDD data and some metadata, and clearCheckpointData(time) cleans up checkpoint data.
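The dispatch pattern can be tried out in isolation. The following is a minimal sketch, not Spark's EventLoop: the names EventLoopSketch, MiniEventLoop, GeneratorEvent, drain and handled are hypothetical, batch times are plain Long milliseconds instead of Time, and events are drained synchronously by the caller, whereas the code above starts the real loop with eventLoop.start().

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.collection.mutable.ArrayBuffer

object EventLoopSketch {
  // Simplified stand-ins for the JobGenerator events; batch times are plain Longs (ms).
  sealed trait GeneratorEvent
  case class GenerateJobs(time: Long) extends GeneratorEvent
  case class ClearMetadata(time: Long) extends GeneratorEvent
  case class DoCheckpoint(time: Long, clearCheckpointDataLater: Boolean) extends GeneratorEvent
  case class ClearCheckpointData(time: Long) extends GeneratorEvent

  // Hypothetical loop: events are queued by post() and dispatched in order by drain().
  class MiniEventLoop {
    private val queue = new LinkedBlockingQueue[GeneratorEvent]()
    // Records which handler each event was routed to.
    val handled = ArrayBuffer.empty[String]

    def post(event: GeneratorEvent): Unit = queue.put(event)

    def drain(): Unit = {
      var event = queue.poll() // null when the queue is empty
      while (event != null) {
        processEvent(event)
        event = queue.poll()
      }
    }

    // Same shape as JobGenerator.processEvent: one case per event type.
    private def processEvent(event: GeneratorEvent): Unit = event match {
      case GenerateJobs(t) => handled += s"generateJobs($t)"
      case ClearMetadata(t) => handled += s"clearMetadata($t)"
      case DoCheckpoint(t, later) => handled += s"doCheckpoint($t, $later)"
      case ClearCheckpointData(t) => handled += s"clearCheckpointData($t)"
    }
  }
}
```

Posting ClearMetadata(1000L) followed by ClearCheckpointData(1000L) and then calling drain() routes the two events to their handlers in posting order.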
2. The Data Cleanup Process in Detail
2.1 The ClearMetadata Process in Detail
First, look at the logic of the clearMetadata function:
/** Clear DStream metadata for the given `time`. */
private def clearMetadata(time: Time) {
  ssc.graph.clearMetadata(time)

  // If checkpointing is enabled, then checkpoint,
  // else mark batch to be fully processed
  if (shouldCheckpoint) {
    eventLoop.post(DoCheckpoint(time, clearCheckpointDataLater = true))
  } else {
    // If checkpointing is not enabled, then delete metadata information about
    // received blocks (block data not saved in any case). Otherwise, wait for
    // checkpointing of this batch to complete.
    val maxRememberDuration = graph.getMaxInputStreamRememberDuration()
    jobScheduler.receiverTracker.cleanupOldBlocksAndBatches(time - maxRememberDuration)
    jobScheduler.inputInfoTracker.cleanup(time - maxRememberDuration)
    markBatchFullyProcessed(time)
  }
}
It first calls DStreamGraph's clearMetadata method:
def clearMetadata(time: Time) {
  logDebug("Clearing metadata for time " + time)
  this.synchronized {
    outputStreams.foreach(_.clearMetadata(time))
  }
  logDebug("Cleared old metadata for time " + time)
}
This calls the clearMetadata method of every output DStream (for the classification of DStreams, see http://blog.csdn.net/zhouzx2010/article/details/51460790):
private[streaming] def clearMetadata(time: Time) {
  val unpersistData = ssc.conf.getBoolean("spark.streaming.unpersist", true)
  // Select the RDDs that are old enough to be cleaned up
  val oldRDDs = generatedRDDs.filter(_._1 <= (time - rememberDuration))
  logDebug("Clearing references to old RDDs: [" +
    oldRDDs.map(x => s"${x._1} -> ${x._2.id}").mkString(", ") + "]")
  // Drop the references to those RDDs from generatedRDDs
  generatedRDDs --= oldRDDs.keys
  if (unpersistData) {
    logDebug(s"Unpersisting old RDDs: ${oldRDDs.values.map(_.id).mkString(", ")}")
    oldRDDs.values.foreach { rdd =>
      // Remove the RDD from the list of persisted RDDs
      rdd.unpersist(false)
      // Explicitly remove blocks of BlockRDD
      rdd match {
        case b: BlockRDD[_] =>
          logInfo(s"Removing blocks of RDD $b of time $time")
          // Remove the RDD's block data
          b.removeBlocks()
        case _ =>
      }
    }
  }
  logDebug(s"Cleared ${oldRDDs.size} RDDs that were older than " +
    s"${time - rememberDuration}: ${oldRDDs.keys.mkString(", ")}")
  // Clean up the DStreams this DStream depends on
  dependencies.foreach(_.clearMetadata(time))
}
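The key cleanup steps are annotated in the code. It first removes the references to the DStream's old RDDs from generatedRDDs (the metadata), then, if spark.streaming.unpersist is enabled, unpersists those RDDs and removes the block data of any BlockRDD, and finally recurses into the DStreams that this DStream depends on.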
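The retention rule is easy to misread, so here is a minimal sketch of just the filtering step. It assumes plain Long millisecond timestamps and String placeholders in place of Time, Duration and RDD; the class name RddRetentionSketch is hypothetical. Note the inclusive comparison (<=), which matches the filter in the DStream code above.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the bookkeeping in DStream.clearMetadata.
class RddRetentionSketch(rememberDuration: Long) {
  // batch time (ms) -> placeholder for the RDD generated for that batch
  val generatedRDDs = mutable.HashMap[Long, String]()

  // Drops every entry at or before (time - rememberDuration) and
  // returns the removed batch times in ascending order.
  def clearMetadata(time: Long): Seq[Long] = {
    val oldTimes = generatedRDDs.keys.filter(_ <= time - rememberDuration).toSeq.sorted
    generatedRDDs --= oldTimes
    oldTimes
  }
}
```

With rememberDuration set to 2000 ms and batches at 1000, 2000, 3000 and 4000 ms, calling clearMetadata(4000L) removes the batches at 1000 and 2000 ms and keeps the other two.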
Back in JobGenerator's clearMetadata function:
if (shouldCheckpoint) {
  eventLoop.post(DoCheckpoint(time, clearCheckpointDataLater = true))
} else {
  // If checkpointing is not enabled, then delete metadata information about
  // received blocks (block data not saved in any case). Otherwise, wait for
  // checkpointing of this batch to complete.
  val maxRememberDuration = graph.getMaxInputStreamRememberDuration()
  jobScheduler.receiverTracker.cleanupOldBlocksAndBatches(time - maxRememberDuration)
  jobScheduler.inputInfoTracker.cleanup(time - maxRememberDuration)
  markBatchFullyProcessed(time)
}
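When checkpointing is not enabled, this calls ReceiverTracker's cleanupOldBlocksAndBatches method, which ultimately ends up in the cleanupOldBatches method (defined on ReceivedBlockTracker in the Spark source):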
def cleanupOldBatches(cleanupThreshTime: Time, waitForCompletion: Boolean): Unit = synchronized {
  require(cleanupThreshTime.milliseconds < clock.getTimeMillis())
  val timesToCleanup = timeToAllocatedBlocks.keys.filter { _ < cleanupThreshTime }.toSeq
  logInfo(s"Deleting batches: ${timesToCleanup.mkString(" ")}")
  if (writeToLog(BatchCleanupEvent(timesToCleanup))) {
    // Drop the block allocations of the batches being deleted
    timeToAllocatedBlocks --= timesToCleanup
    // Clean up the write-ahead log
    writeAheadLogOption.foreach(_.clean(cleanupThreshTime.milliseconds, waitForCompletion))
  } else {
    logWarning("Failed to acknowledge batch clean up in the Write Ahead Log.")
  }
}
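As shown, the cleanupOldBatches method reached through ReceiverTracker cleans up the receiver-side data, namely the per-batch block allocation records (timeToAllocatedBlocks) and the WAL data.
Finally, the InputInfoTracker information is cleaned up: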
def cleanup(batchThreshTime: Time): Unit = synchronized {
  val timesToCleanup = batchTimeToInputInfos.keys.filter(_ < batchThreshTime)
  logInfo(s"remove old batch metadata: ${timesToCleanup.mkString(" ")}")
  batchTimeToInputInfos --= timesToCleanup
}
This simply removes the input information of old batches from batchTimeToInputInfos.
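Both tracker cleanups receive the same threshold, time - maxRememberDuration, and both compare with a strict less-than. The sketch below illustrates that arithmetic with plain Long milliseconds. The object name ThresholdSketch and its helpers are hypothetical, and treating maxRememberDuration as the largest remember duration among the input streams is an assumption inferred from the method name getMaxInputStreamRememberDuration, not something shown in the code above.

```scala
import scala.collection.mutable

object ThresholdSketch {
  // Assumption: the max remember duration is the largest value among the input streams.
  def cleanupThreshold(batchTime: Long, inputRememberDurations: Seq[Long]): Long =
    batchTime - inputRememberDurations.max

  // Mirrors InputInfoTracker.cleanup: only entries strictly older than the
  // threshold are removed. Returns the removed batch times in ascending order.
  def cleanup(batchTimeToInputInfos: mutable.Map[Long, String], threshold: Long): Seq[Long] = {
    val timesToCleanup = batchTimeToInputInfos.keys.filter(_ < threshold).toSeq.sorted
    batchTimeToInputInfos --= timesToCleanup
    timesToCleanup
  }
}
```

For a batch at 10000 ms and remember durations of 2000 and 4000 ms, the threshold is 6000 ms, so an entry at 5000 ms is removed while entries at 6000 and 7000 ms are kept. This differs from the inclusive comparison used for generatedRDDs in DStream.clearMetadata.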
2.2 The ClearCheckpointData Process in Detail
Now look at the logic of clearCheckpointData:
/** Clear DStream checkpoint data for the given `time`. */
private def clearCheckpointData(time: Time) {
  ssc.graph.clearCheckpointData(time)

  // All the checkpoint information about which batches have been processed, etc have
  // been saved to checkpoints, so its safe to delete block metadata and data WAL files
  val maxRememberDuration = graph.getMaxInputStreamRememberDuration()
  jobScheduler.receiverTracker.cleanupOldBlocksAndBatches(time - maxRememberDuration)
  jobScheduler.inputInfoTracker.cleanup(time - maxRememberDuration)
  markBatchFullyProcessed(time)
}
The ReceiverTracker and InputInfoTracker cleanup that follows is the same as in ClearMetadata, so here we only analyze DStreamGraph's clearCheckpointData method:
def clearCheckpointData(time: Time) {
  logInfo("Clearing checkpoint data for time " + time)
  this.synchronized {
    outputStreams.foreach(_.clearCheckpointData(time))
  }
  logInfo("Cleared checkpoint data for time " + time)
}
Likewise, this calls the clearCheckpointData method of every output DStream in the DStreamGraph:
private[streaming] def clearCheckpointData(time: Time) {
  logDebug("Clearing checkpoint data")
  checkpointData.cleanup(time)
  dependencies.foreach(_.clearCheckpointData(time))
  logDebug("Cleared checkpoint data")
}
The core logic here is in checkpointData.cleanup(time). checkpointData is a DStreamCheckpointData object, and its cleanup method is as follows:
def cleanup(time: Time) {
  // Look up the checkpoint file time that bounds what can be cleaned up
  timeToOldestCheckpointFileTime.remove(time) match {
    case Some(lastCheckpointFileTime) =>
      // Select the files to delete
      val filesToDelete = timeToCheckpointFile.filter(_._1 < lastCheckpointFileTime)
      logDebug("Files to delete:\n" + filesToDelete.mkString(","))
      filesToDelete.foreach {
        case (time, file) =>
          try {
            val path = new Path(file)
            if (fileSystem == null) {
              fileSystem = path.getFileSystem(dstream.ssc.sparkContext.hadoopConfiguration)
            }
            // Delete the file
            fileSystem.delete(path, true)
            timeToCheckpointFile -= time
            logInfo("Deleted checkpoint file '" + file + "' for time " + time)
          } catch {
            case e: Exception =>
              logWarning("Error deleting old checkpoint file '" + file + "' for time " + time, e)
              fileSystem = null
          }
      }
    case None =>
      logDebug("Nothing to delete")
  }
}
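As shown, checkpoint cleanup deletes the checkpoint files that are older than the time recorded in timeToOldestCheckpointFileTime for the given batch. If no entry exists for that batch, nothing is deleted.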
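The two-map bookkeeping in cleanup can be sketched without a file system. The class below is a hypothetical model: it keeps the two field names from the code above, uses Long milliseconds and String paths, and returns the paths that would be deleted instead of calling fileSystem.delete. The sample entries are made up for illustration.

```scala
import scala.collection.mutable

// Hypothetical model of the bookkeeping in DStreamCheckpointData.cleanup.
class CheckpointCleanupSketch {
  // batch time (ms) -> checkpoint file path
  val timeToCheckpointFile = mutable.HashMap[Long, String]()
  // batch time (ms) -> time of the oldest checkpoint file recorded for that batch
  val timeToOldestCheckpointFileTime = mutable.HashMap[Long, Long]()

  // Returns the paths that would be deleted, oldest first; no file is touched.
  def cleanup(time: Long): Seq[String] =
    timeToOldestCheckpointFileTime.remove(time) match {
      case Some(oldest) =>
        // Strictly older than the recorded time, as in the original filter.
        val stale = timeToCheckpointFile.filter(_._1 < oldest).toSeq.sortBy(_._1)
        timeToCheckpointFile --= stale.map(_._1)
        stale.map(_._2)
      case None =>
        Seq.empty // nothing recorded for this batch, so nothing to delete
    }
}
```

Because the entry is removed from timeToOldestCheckpointFileTime on the first call, a second cleanup for the same batch time falls into the None branch and deletes nothing.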
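3. How Data Cleanup Is Triggered
3.1 Triggering the ClearMetadata Process
After JobGenerator generates a job, the job is handed to a JobHandler for execution. In JobHandler's run method, once the job has finished, a JobCompleted message is sent to JobScheduler: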
_eventLoop = eventLoop
if (_eventLoop != null) {
  _eventLoop.post(JobCompleted(job, clock.getTimeMillis()))
}
When JobScheduler receives the JobCompleted message, it calls the handleJobCompletion method. The source is as follows:
private def processEvent(event: JobSchedulerEvent) {
  try {
    event match {
      case JobStarted(job, startTime) => handleJobStart(job, startTime)
      case JobCompleted(job, completedTime) => handleJobCompletion(job, completedTime)
      case ErrorReported(m, e) => handleError(m, e)
    }
  } catch {
    case e: Throwable =>
      reportError("Error in job scheduler", e)
  }
}
JobScheduler's handleJobCompletion method calls JobGenerator's onBatchCompletion method. Here is the source of JobGenerator's onBatchCompletion:
def onBatchCompletion(time: Time) {
  eventLoop.post(ClearMetadata(time))
}
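As shown, JobGenerator's onBatchCompletion method posts a ClearMetadata message to its own event loop, which triggers the ClearMetadata operation.
3.2 Triggering the ClearCheckpointData Process
Checkpoint data is cleaned up after a checkpoint completes. First look at the run method of CheckpointWriteHandler: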
// All done, print success
val finishTime = System.currentTimeMillis()
logInfo("Checkpoint for time " + checkpointTime + " saved to file '" + checkpointFile +
  "', took " + bytes.length + " bytes and " + (finishTime - startTime) + " ms")
// Call back into JobGenerator so that it can clean up checkpoint data
jobGenerator.onCheckpointCompletion(checkpointTime, clearCheckpointDataLater)
return
As shown, once the checkpoint has been written, JobGenerator's onCheckpointCompletion method is called to clean up checkpoint data. Here is the source of JobGenerator's onCheckpointCompletion:
def onCheckpointCompletion(time: Time, clearCheckpointDataLater: Boolean) {
  if (clearCheckpointDataLater) {
    eventLoop.post(ClearCheckpointData(time))
  }
}
As shown, onCheckpointCompletion first checks the clearCheckpointDataLater argument. If it is true, a ClearCheckpointData message is posted to JobGenerator's eventLoop, which triggers a call to clearCheckpointData and cleans up the checkpoint data.
When is this argument true?
Let us return to JobGenerator's clearMetadata method:
private def clearMetadata(time: Time) {
  ssc.graph.clearMetadata(time)
  if (shouldCheckpoint) {
    // Post a DoCheckpoint message, asking for checkpoint data cleanup afterwards
    eventLoop.post(DoCheckpoint(time, clearCheckpointDataLater = true))
  } else {
    val maxRememberDuration = graph.getMaxInputStreamRememberDuration()
    jobScheduler.receiverTracker.cleanupOldBlocksAndBatches(time - maxRememberDuration)
    jobScheduler.inputInfoTracker.cleanup(time - maxRememberDuration)
    markBatchFullyProcessed(time)
  }
}
As shown, when checkpointing is enabled, clearMetadata posts a DoCheckpoint message with clearCheckpointDataLater set to true. When JobGenerator's eventLoop receives this message, it calls the doCheckpoint method:
private def doCheckpoint(time: Time, clearCheckpointDataLater: Boolean) {
  if (shouldCheckpoint && (time - graph.zeroTime).isMultipleOf(ssc.checkpointDuration)) {
    logInfo("Checkpointing graph for time " + time)
    ssc.graph.updateCheckpointData(time)
    checkpointWriter.write(new Checkpoint(ssc, time), clearCheckpointDataLater)
  }
}
The key step here is the call to CheckpointWriter's write method. Note that clearCheckpointDataLater is true at this point. Stepping into that method:
def write(checkpoint: Checkpoint, clearCheckpointDataLater: Boolean) {
  try {
    val bytes = Checkpoint.serialize(checkpoint, conf)
    // Pass the clearCheckpointDataLater flag on to CheckpointWriteHandler
    executor.execute(new CheckpointWriteHandler(
      checkpoint.checkpointTime, bytes, clearCheckpointDataLater))
    logInfo("Submitted checkpoint of time " + checkpoint.checkpointTime + " writer queue")
  } catch {
    case rej: RejectedExecutionException =>
      logError("Could not submit checkpoint task to the thread pool executor", rej)
  }
}
As shown, the clearCheckpointDataLater argument is passed into CheckpointWriteHandler. Once the checkpoint has been written, a ClearCheckpointData message is therefore sent to JobGenerator, which cleans up the checkpoint data.
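The whole trigger chain can be traced end to end with a small model. This is a sketch under simplifying assumptions, not Spark code: CleanupTriggerSketch, MiniJobGenerator and trace are hypothetical names, times are Long milliseconds, the isMultipleOf check is replaced by a modulo on Longs, and the checkpoint write runs inline, whereas the real CheckpointWriter hands it to an executor. The receiver and input-info cleanup in the non-checkpointing branch is omitted.

```scala
import scala.collection.mutable

object CleanupTriggerSketch {
  sealed trait Event
  case class ClearMetadata(time: Long) extends Event
  case class DoCheckpoint(time: Long, clearCheckpointDataLater: Boolean) extends Event
  case class ClearCheckpointData(time: Long) extends Event

  // Hypothetical single-threaded model of JobGenerator's cleanup-related events.
  class MiniJobGenerator(shouldCheckpoint: Boolean, zeroTime: Long, checkpointInterval: Long) {
    private val queue = mutable.Queue[Event]()
    // Records the order in which the cleanup steps run.
    val trace = mutable.ArrayBuffer[String]()

    // Called when all jobs of a batch have completed.
    def onBatchCompletion(time: Long): Unit = queue += ClearMetadata(time)

    // Called by the (here inlined) checkpoint write once it has finished.
    def onCheckpointCompletion(time: Long, clearCheckpointDataLater: Boolean): Unit =
      if (clearCheckpointDataLater) queue += ClearCheckpointData(time)

    // Processes queued events until none are left.
    def run(): Unit = while (queue.nonEmpty) {
      queue.dequeue() match {
        case ClearMetadata(t) =>
          trace += s"clearMetadata($t)"
          if (shouldCheckpoint) queue += DoCheckpoint(t, clearCheckpointDataLater = true)
        case DoCheckpoint(t, later) =>
          // Only batches aligned with the checkpoint interval are checkpointed.
          if (shouldCheckpoint && (t - zeroTime) % checkpointInterval == 0) {
            trace += s"writeCheckpoint($t)"
            onCheckpointCompletion(t, later)
          }
        case ClearCheckpointData(t) =>
          trace += s"clearCheckpointData($t)"
      }
    }
  }
}
```

With checkpointing enabled, zeroTime 0 and a 2000 ms checkpoint interval, completing the batch at 2000 ms produces the trace clearMetadata, writeCheckpoint, clearCheckpointData. With checkpointing disabled, only clearMetadata appears.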