Akka Source Code Analysis - Akka Streams - Materializer (1)

  This blog series walks through the Akka Streams source code step by step. It will probably take quite a few posts, since Akka Streams is fairly complex.

import akka.actor.ActorSystem
import akka.stream.ActorMaterializer

implicit val system = ActorSystem("QuickStart")
implicit val materializer = ActorMaterializer()

  Both objects above must be created before using the Streams API. ActorSystem needs no introduction, so let's look at ActorMaterializer.

/**
 * An ActorMaterializer takes a stream blueprint and turns it into a running stream.
 */
abstract class ActorMaterializer extends Materializer with MaterializerLoggingProvider 

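  ActorMaterializer takes the blueprint of a stream computation and turns it into a running stream. Put simply, it is what "compiles" the stream API that Akka provides. It extends one class and one trait. I'll skip MaterializerLoggingProvider, which only provides logging.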

/**
 * Materializer SPI (Service Provider Interface)
 *
 * Binary compatibility is NOT guaranteed on materializer internals.
 *
 * Custom materializer implementations should be aware that the materializer SPI
 * is not yet final and may change in patch releases of Akka. Please note that this
 * does not impact end-users of Akka streams, only implementors of custom materializers,
 * with whom the Akka team co-ordinates such changes.
 *
 * Once the SPI is final this notice will be removed.
 */
abstract class Materializer {

  /**
   * The `namePrefix` shall be used for deriving the names of processing
   * entities that are created during materialization. This is meant to aid
   * logging and failure reporting both during materialization and while the
   * stream is running.
   */
  def withNamePrefix(name: String): Materializer

  /**
   * This method interprets the given Flow description and creates the running
   * stream. The result can be highly implementation specific, ranging from
   * local actor chains to remote-deployed processing networks.
   */
  def materialize[Mat](runnable: Graph[ClosedShape, Mat]): Mat

  /**
   * This method interprets the given Flow description and creates the running
   * stream using an explicitly provided [[Attributes]] as top level (least specific) attributes that
   * will be defaults for the materialized stream.
   * The result can be highly implementation specific, ranging from local actor chains to remote-deployed
   * processing networks.
   */
  def materialize[Mat](
    runnable:                                              Graph[ClosedShape, Mat],
    @deprecatedName('initialAttributes) defaultAttributes: Attributes): Mat

  /**
   * Running a flow graph will require execution resources, as will computations
   * within Sources, Sinks, etc. This [[scala.concurrent.ExecutionContextExecutor]]
   * can be used by parts of the flow to submit processing jobs for execution,
   * run Future callbacks, etc.
   *
   * Note that this is not necessarily the same execution context the stream operator itself is running on.
   */
  implicit def executionContext: ExecutionContextExecutor

  /**
   * Interface for operators that need timer services for their functionality. Schedules a
   * single task with the given delay.
   *
   * @return A [[akka.actor.Cancellable]] that allows cancelling the timer. Cancelling is best effort, if the event
   *         has been already enqueued it will not have an effect.
   */
  def scheduleOnce(delay: FiniteDuration, task: Runnable): Cancellable

  /**
   * Interface for operators that need timer services for their functionality. Schedules a
   * repeated task with the given interval between invocations.
   *
   * @return A [[akka.actor.Cancellable]] that allows cancelling the timer. Cancelling is best effort, if the event
   *         has been already enqueued it will not have an effect.
   */
  def schedulePeriodically(initialDelay: FiniteDuration, interval: FiniteDuration, task: Runnable): Cancellable

}

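  Materializer is very important, so I pasted the complete code here. Materializer is an SPI (Service Provider Interface), and binary compatibility is not guaranteed for materializer internals, which means they may change between versions. The definition is a bit odd though: since none of the methods has an implementation, wouldn't a trait be a better fit? Why an abstract class?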

  Materializer has six methods in total. Let's go through them one by one.

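  withNamePrefix is used when deriving the names of processing entities. Put simply, when the computing entities in a stream (Source/Sink/Flow/RunnableGraph and so on) are turned into actors, the names of those actors need a prefix, and withNamePrefix is how that prefix is set.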

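  materialize has two overloads (both abstract here). It interprets the given Flow description and creates the running stream. The result depends heavily on the concrete implementation: it may be a local chain of actors or a remote-deployed processing network.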

  executionContext provides the environment for asynchronous execution.

  scheduleOnce and schedulePeriodically provide timer scheduling, since some operators do need timer services.

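  Materializer looks simple, with just the few methods above, but its core is materialize, which is what compiles an Akka stream. If you have used Storm, you will know that Storm has a topology compiler. This plays a similar role.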
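  To make the "blueprint versus running stream" idea concrete, here is a minimal sketch in plain Scala. It is a toy model, not Akka code: Blueprint, Running and ToyMaterializer are names invented for this illustration, and the naming scheme (prefix plus a counter) is an assumption of the sketch. The point is only that the description is reusable and each materialize call produces an independent run.

```scala
// Toy model only: none of these types are Akka's.
object BlueprintSketch {
  // Hypothetical blueprint: some elements and a transformation to apply to them.
  final case class Blueprint(elements: List[Int], transform: Int => Int)

  // What one run produces: a name for the run and its materialized value.
  final case class Running(name: String, materializedValue: Int)

  final class ToyMaterializer(namePrefix: String) {
    private var counter = 0

    // Plays the role of withNamePrefix: a materializer with a different prefix.
    def withNamePrefix(name: String): ToyMaterializer = new ToyMaterializer(name)

    // Interprets the description and "runs" it; every call is a separate run.
    def materialize(blueprint: Blueprint): Running = {
      counter += 1
      val result = blueprint.elements.map(blueprint.transform).sum
      Running(s"$namePrefix-$counter", result)
    }
  }

  def main(args: Array[String]): Unit = {
    val doubledSum = Blueprint(List(1, 2, 3), _ * 2)
    val materializer = new ToyMaterializer("flow")
    println(materializer.materialize(doubledSum)) // Running(flow-1,12)
    println(materializer.materialize(doubledSum)) // Running(flow-2,12)
    println(materializer.withNamePrefix("custom").materialize(doubledSum)) // Running(custom-1,12)
  }
}
```

  The same blueprint value is materialized three times without being modified, which is the property the real Materializer relies on as well.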

  Now back to ActorMaterializer.

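  ActorMaterializer additionally provides three fairly important members: actorOf, system and supervisor. actorOf takes a MaterializationContext and a Props and creates an actor; system returns the ActorSystem associated with the ActorMaterializer; supervisor will be examined later. The MaterializationContext parameter is rather interesting: it can be understood as the context at materialization time (that is, while the stream is being compiled).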

/**
 * Context parameter to the `create` methods of sources and sinks.
 *
 * INTERNAL API
 */
@InternalApi
private[akka] case class MaterializationContext(
  materializer:        Materializer,
  effectiveAttributes: Attributes,
  islandName:          String)

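  MaterializationContext has three fields. materializer needs no explanation: it is the Materializer associated with the current context. effectiveAttributes supplies parameters, wrapped as Attributes. islandName is the interesting one: taken literally it is an "island name". So what is an "island"? We will run into it again later in the analysis, so just keep it in mind for now.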

  Also, one parameter of materialize in the abstract class Materializer deserves a closer look: runnable: Graph[ClosedShape, Mat].

/**
 * Not intended to be directly extended by user classes
 *
 * @see [[akka.stream.stage.GraphStage]]
 */
trait Graph[+S <: Shape, +M]

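  Graph is a trait with two type parameters, S and M. Couldn't they have picked better names? "SM", haha. S is a subtype of Shape, which judging by the name is the shape of the graph. A quick look at the body of Graph shows two particularly important methods: shape and traversalBuilder. shape returns the shape of the graph; we don't know yet what a shape is, but ClosedShape is one form of it. traversalBuilder is declared as follows: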

/**
   * INTERNAL API.
   *
   * Every materializable element must be backed by a stream layout module
   */
  private[stream] def traversalBuilder: TraversalBuilder

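  traversalBuilder returns a TraversalBuilder. The comment says every materializable element must be backed by a stream layout module. Judging by the name, TraversalBuilder builds a traversal of the graph. My guess is that it amounts to a topological ordering.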

  Now let's see what Shape actually is.

/**
 * A Shape describes the inlets and outlets of a [[Graph]]. In keeping with the
 * philosophy that a Graph is a freely reusable blueprint, everything that
 * matters from the outside are the connections that can be made with it,
 * otherwise it is just a black box.
 */
abstract class Shape 

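  A Shape describes the inlets and outlets of a Graph. In keeping with the philosophy that a Graph is a freely reusable blueprint, all that matters from the outside is the connections that can be made with it; otherwise it is just a black box. For now let's assume it simply defines the inputs and outputs of a Graph. The two most important methods of this abstract class are indeed inlets and outlets.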

  /**
   * Scala API: get a list of all input ports
   */
  def inlets: immutable.Seq[Inlet[_]]

  /**
   * Scala API: get a list of all output ports
   */
  def outlets: immutable.Seq[Outlet[_]]
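  As a rough illustration of what "a shape is just a set of ports" means, here is a small sketch in plain Scala. The Toy* types and the In/Out classes are hypothetical stand-ins written for this post; they only loosely mirror Akka's SourceShape, FlowShape and ClosedShape, and the isClosed helper is an assumption of the sketch, not a method of Shape.

```scala
import scala.collection.immutable

// Toy model only: simplified stand-ins, not Akka's Shape hierarchy.
object ShapeSketch {
  // Hypothetical ports: just a name, no element type.
  final class In(val name: String)
  final class Out(val name: String)

  abstract class ToyShape {
    def inlets: immutable.Seq[In]
    def outlets: immutable.Seq[Out]
    // A shape with no open ports has nothing left to connect.
    def isClosed: Boolean = inlets.isEmpty && outlets.isEmpty
  }

  final case class ToySourceShape(out: Out) extends ToyShape {
    def inlets: immutable.Seq[In] = Nil
    def outlets: immutable.Seq[Out] = List(out)
  }

  final case class ToyFlowShape(in: In, out: Out) extends ToyShape {
    def inlets: immutable.Seq[In] = List(in)
    def outlets: immutable.Seq[Out] = List(out)
  }

  case object ToyClosedShape extends ToyShape {
    def inlets: immutable.Seq[In] = Nil
    def outlets: immutable.Seq[Out] = Nil
  }

  def describe(shape: ToyShape): String =
    s"${shape.inlets.size} in, ${shape.outlets.size} out, closed=${shape.isClosed}"

  def main(args: Array[String]): Unit = {
    println(describe(ToyFlowShape(new In("flow.in"), new Out("flow.out")))) // 1 in, 1 out, closed=false
    println(describe(ToyClosedShape))                                      // 0 in, 0 out, closed=true
  }
}
```

  Seen this way, it is plausible that materialize only accepts a Graph[ClosedShape, Mat]: a graph with dangling ports has nothing to run against.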

  So what are Inlet and Outlet?

final class Inlet[T] private (val s: String) extends InPort {
  def carbonCopy(): Inlet[T] = {
    val in = Inlet[T](s)
    in.mappedTo = this
    in
  }
  /**
   * INTERNAL API.
   */
  def as[U]: Inlet[U] = this.asInstanceOf[Inlet[U]]

  override def toString: String = s + "(" + this.hashCode + s")" +
    (if (mappedTo eq this) ""
    else s" mapped to $mappedTo")
}

/**
 * An input port of a StreamLayout.Module. This type logically belongs
 * into the impl package but must live here due to how `sealed` works.
 * It is also used in the Java DSL for “untyped Inlets” as a work-around
 * for otherwise unreasonable existential types.
 */
sealed abstract class InPort { self: Inlet[_] ⇒
  final override def hashCode: Int = super.hashCode
  final override def equals(that: Any): Boolean = this eq that.asInstanceOf[AnyRef]

  /**
   * INTERNAL API
   */
  @volatile private[stream] var id: Int = -1

  /**
   * INTERNAL API
   */
  @volatile private[stream] var mappedTo: InPort = this

  /**
   * INTERNAL API
   */
  private[stream] def inlet: Inlet[_] = this
}

  Inlet doesn't seem to offer many methods. So far the only thing that looks important is the id field; the rest just return or copy fields.
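  One detail in the listing is easy to miss: InPort defines equals as reference equality, and carbonCopy creates a new port that remembers the original through mappedTo. The sketch below reproduces just that behaviour with a hypothetical ToyInlet class in plain Scala; it is a simplified stand-in, not the real Inlet.

```scala
// Toy model only: a simplified stand-in for Inlet/InPort.
object PortIdentitySketch {
  final class ToyInlet(val name: String) {
    // Like InPort.mappedTo, a fresh port initially maps to itself.
    var mappedTo: ToyInlet = this

    // Identity-based equality, as in the InPort listing above.
    override def equals(that: Any): Boolean = this eq that.asInstanceOf[AnyRef]
    override def hashCode: Int = System.identityHashCode(this)

    // A copy is a distinct port that points back at the port it was copied from.
    def carbonCopy(): ToyInlet = {
      val copy = new ToyInlet(name)
      copy.mappedTo = this
      copy
    }
  }

  def main(args: Array[String]): Unit = {
    val a = new ToyInlet("in")
    val b = new ToyInlet("in")
    val c = a.carbonCopy()
    println(a == b)          // false: same name, different ports
    println(c == a)          // false: the copy is a port of its own
    println(c.mappedTo eq a) // true: but it remembers where it came from
  }
}
```

  So two ports with the same name are never confused with each other, which matters when the same blueprint is reused in several places.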

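  At this point Materializer, Shape and Graph are all still rather abstract, so we need to look at a concrete implementation of Materializer, which after all is only an abstract class.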

 def apply(materializerSettings: ActorMaterializerSettings, namePrefix: String)(implicit context: ActorRefFactory): ActorMaterializer = {
    val haveShutDown = new AtomicBoolean(false)
    val system = actorSystemOf(context)

    new PhasedFusingActorMaterializer(
      system,
      materializerSettings,
      system.dispatchers,
      actorOfStreamSupervisor(materializerSettings, context, haveShutDown),
      haveShutDown,
      FlowNames(system).name.copy(namePrefix))
  }

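  Tracing the apply methods of ActorMaterializer, we find that they end up calling the overload above, which creates a PhasedFusingActorMaterializer. Since it is created with new, it must be a concrete class, meaning every method and field has an implementation or definition. The name PhasedFusingActorMaterializer is quite interesting. "Phased" is easy to understand. I first translated "fusing" as "circuit breaking", which made no sense. It is better read as "fusion": several stream stages are fused so that they can run together inside one actor. So the name means roughly "phased, fusing, actor-based materializer".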

@InternalApi private[akka] case class PhasedFusingActorMaterializer(
  system:                ActorSystem,
  override val settings: ActorMaterializerSettings,
  dispatchers:           Dispatchers,
  supervisor:            ActorRef,
  haveShutDown:          AtomicBoolean,
  flowNames:             SeqActorName) extends ExtendedActorMaterializer 

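  Surprisingly, PhasedFusingActorMaterializer is a case class, so why bother with new? It extends ExtendedActorMaterializer. I won't paste the source of ExtendedActorMaterializer; it just overrides a few methods of ActorMaterializer and implements one method, actorOf. Let's look at that actorOf: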

@InternalApi private[akka] override def actorOf(context: MaterializationContext, props: Props): ActorRef = {
    val effectiveProps = props.dispatcher match {
      case Dispatchers.DefaultDispatcherId ⇒
        props.withDispatcher(context.effectiveAttributes.mandatoryAttribute[ActorAttributes.Dispatcher].dispatcher)
      case ActorAttributes.IODispatcher.dispatcher ⇒
        // this one is actually not a dispatcher but a relative config key pointing containing the actual dispatcher name
        props.withDispatcher(settings.blockingIoDispatcher)
      case _ ⇒ props
    }

    actorOf(effectiveProps, context.islandName)
  }

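  All it does is swap the dispatcher in Props where needed: if Props still uses the default dispatcher, it takes the dispatcher from the stream attributes; if it carries the IO dispatcher key, it takes settings.blockingIoDispatcher; otherwise Props is left alone. Then it calls another actorOf to create the actor.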
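  The dispatcher substitution above boils down to a three-way pattern match. Here is a minimal sketch of that logic in plain Scala. ToyProps, the two id constants and the dispatcher names passed in are placeholders chosen for this illustration; they are not the real Akka values.

```scala
// Toy model only: placeholder ids and a stand-in for Props.
object DispatcherSelectionSketch {
  // Hypothetical ids, chosen for this sketch only.
  val DefaultDispatcherId = "default-dispatcher"
  val IODispatcherKey = "io-dispatcher-key"

  final case class ToyProps(dispatcher: String) {
    def withDispatcher(id: String): ToyProps = copy(dispatcher = id)
  }

  // Same structure as the match in actorOf: only the two "symbolic" values
  // are rewritten, an explicitly chosen dispatcher is kept as-is.
  def effectiveProps(props: ToyProps, streamDispatcher: String, blockingIoDispatcher: String): ToyProps =
    props.dispatcher match {
      case DefaultDispatcherId => props.withDispatcher(streamDispatcher)
      case IODispatcherKey     => props.withDispatcher(blockingIoDispatcher)
      case _                   => props
    }

  def main(args: Array[String]): Unit = {
    println(effectiveProps(ToyProps(DefaultDispatcherId), "stream-d", "blocking-d")) // ToyProps(stream-d)
    println(effectiveProps(ToyProps(IODispatcherKey), "stream-d", "blocking-d"))     // ToyProps(blocking-d)
    println(effectiveProps(ToyProps("my-own"), "stream-d", "blocking-d"))            // ToyProps(my-own)
  }
}
```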

@InternalApi private[akka] def actorOf(props: Props, name: String): ActorRef = {
    supervisor match {
      case ref: LocalActorRef ⇒
        ref.underlying.attachChild(props, name, systemService = false)
      case ref: RepointableActorRef ⇒
        if (ref.isStarted)
          ref.underlying.asInstanceOf[ActorCell].attachChild(props, name, systemService = false)
        else {
          implicit val timeout = ref.system.settings.CreationTimeout
          val f = (supervisor ? StreamSupervisor.Materialize(props, name)).mapTo[ActorRef]
          Await.result(f, timeout.duration)
        }
      case unknown ⇒
        throw new IllegalStateException(s"Stream supervisor must be a local actor, was [${unknown.getClass.getName}]")
    }
  }

  I won't dig into this actorOf either; in short, it creates the actor as a child of the supervisor. Now let's look at how materialize is implemented in PhasedFusingActorMaterializer.

 

override def materialize[Mat](
    graph:             Graph[ClosedShape, Mat],
    defaultAttributes: Attributes,
    defaultPhase:      Phase[Any],
    phases:            Map[IslandTag, Phase[Any]]): Mat = {
    if (isShutdown) throw new IllegalStateException("Trying to materialize stream after materializer has been shutdown")
    val islandTracking = new IslandTracking(phases, settings, defaultAttributes, defaultPhase, this, islandNamePrefix = createFlowName() + "-")

    var current: Traversal = graph.traversalBuilder.traversal

    val attributesStack = new java.util.ArrayDeque[Attributes](8)
    attributesStack.addLast(defaultAttributes and graph.traversalBuilder.attributes)

    val traversalStack = new java.util.ArrayDeque[Traversal](16)
    traversalStack.addLast(current)

    val matValueStack = new java.util.ArrayDeque[Any](8)

    if (Debug) {
      println(s"--- Materializing layout:")
      TraversalBuilder.printTraversal(current)
      println(s"--- Start materialization")
    }

    // Due to how Concat works, we need a stack. This probably can be optimized for the most common cases.
    while (!traversalStack.isEmpty) {
      current = traversalStack.removeLast()

      while (current ne EmptyTraversal) {
        var nextStep: Traversal = EmptyTraversal
        current match {
          case MaterializeAtomic(mod, outToSlot) ⇒
            if (Debug) println(s"materializing module: $mod")
            val matAndStage = islandTracking.getCurrentPhase.materializeAtomic(mod, attributesStack.getLast)
            val logic = matAndStage._1
            val matValue = matAndStage._2
            if (Debug) println(s"  materialized value is $matValue")
            matValueStack.addLast(matValue)

            val stageGlobalOffset = islandTracking.getCurrentOffset

            wireInlets(islandTracking, mod, logic)
            wireOutlets(islandTracking, mod, logic, stageGlobalOffset, outToSlot)

            if (Debug) println(s"PUSH: $matValue => $matValueStack")

          case Concat(first, next) ⇒
            if (next ne EmptyTraversal) traversalStack.add(next)
            nextStep = first
          case Pop ⇒
            val popped = matValueStack.removeLast()
            if (Debug) println(s"POP: $popped => $matValueStack")
          case PushNotUsed ⇒
            matValueStack.addLast(NotUsed)
            if (Debug) println(s"PUSH: NotUsed => $matValueStack")
          case transform: Transform ⇒
            val prev = matValueStack.removeLast()
            val result = transform(prev)
            matValueStack.addLast(result)
            if (Debug) println(s"TRFM: $matValueStack")
          case compose: Compose ⇒
            val second = matValueStack.removeLast()
            val first = matValueStack.removeLast()
            val result = compose(first, second)
            matValueStack.addLast(result)
            if (Debug) println(s"COMP: $matValueStack")
          case PushAttributes(attr) ⇒
            attributesStack.addLast(attributesStack.getLast and attr)
            if (Debug) println(s"ATTR PUSH: $attr")
          case PopAttributes ⇒
            attributesStack.removeLast()
            if (Debug) println(s"ATTR POP")
          case EnterIsland(tag) ⇒
            islandTracking.enterIsland(tag, attributesStack.getLast)
          case ExitIsland ⇒
            islandTracking.exitIsland()
          case _ ⇒
        }
        current = nextStep
      }
    }

    def shutdownWhileMaterializingFailure =
      new IllegalStateException("Materializer shutdown while materializing stream")
    try {
      islandTracking.getCurrentPhase.onIslandReady()
      islandTracking.allNestedIslandsReady()

      if (Debug) println("--- Finished materialization")
      matValueStack.peekLast().asInstanceOf[Mat]

    } finally {
      if (isShutdown) throw shutdownWhileMaterializingFailure
    }

  }

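  This method is so important that I pasted it in full. It has two parameters we haven't seen before: defaultPhase: Phase[Any] and phases: Map[IslandTag, Phase[Any]]. That involves two types, IslandTag and Phase. Phase is easy enough to understand, it is a phase. But what is IslandTag? A tag for an "island"? And what is an "island"?!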

@DoNotInherit private[akka] trait Phase[M] {
  def apply(
    settings:            ActorMaterializerSettings,
    effectiveAttributes: Attributes,
    materializer:        PhasedFusingActorMaterializer,
    islandName:          String): PhaseIsland[M]
}

  The Phase trait has only an apply method, which takes these parameters and returns a PhaseIsland. A "phase island"?

@DoNotInherit private[akka] trait PhaseIsland[M] {

  def name: String

  def materializeAtomic(mod: AtomicModule[Shape, Any], attributes: Attributes): (M, Any)

  def assignPort(in: InPort, slot: Int, logic: M): Unit

  def assignPort(out: OutPort, slot: Int, logic: M): Unit

  def createPublisher(out: OutPort, logic: M): Publisher[Any]

  def takePublisher(slot: Int, publisher: Publisher[Any]): Unit

  def onIslandReady(): Unit

}

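  Two methods of PhaseIsland are very important: createPublisher and takePublisher. Why? Because they deal with Publisher. What is Publisher? Have a look at my previous post: it is an interface from Reactive Streams, and it matters because it touches the underlying layer.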

  Then there is IslandTag. What is that?

@DoNotInherit private[akka] trait IslandTag

  It has no members at all. Is it only there for type matching? Who knows. My guess is that it is used to tag islands.
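  A member-less trait is still useful as a key. Since materialize receives phases: Map[IslandTag, Phase[Any]], one plausible reading is that tags are simply objects used to look up a phase. The sketch below shows that pattern in plain Scala. ToyIslandTag, TagA, TagB and ToyPhase are made up for this post, and falling back to the default phase for an unregistered tag is an assumption of the sketch; how the real IslandTracking resolves tags is not covered here.

```scala
// Toy model only: shows a marker trait used as a map key.
object IslandTagSketch {
  // No members, like IslandTag in the listing above.
  trait ToyIslandTag
  case object TagA extends ToyIslandTag // hypothetical tag
  case object TagB extends ToyIslandTag // hypothetical tag, deliberately not registered

  // Hypothetical stand-in for Phase: just a label.
  final case class ToyPhase(label: String)

  val defaultPhase: ToyPhase = ToyPhase("default")
  val phases: Map[ToyIslandTag, ToyPhase] = Map(TagA -> ToyPhase("phase-a"))

  // Assumption of this sketch: unknown tags fall back to the default phase.
  def phaseFor(tag: ToyIslandTag): ToyPhase = phases.getOrElse(tag, defaultPhase)

  def main(args: Array[String]): Unit = {
    println(phaseFor(TagA)) // ToyPhase(phase-a)
    println(phaseFor(TagB)) // ToyPhase(default)
  }
}
```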

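  materialize is long and its logic is complex, but a few points stand out. It creates an IslandTracking, walks the traversal obtained from graph.traversalBuilder in a while loop while performing operations on the IslandTracking, and finally calls two methods on it: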

islandTracking.getCurrentPhase.onIslandReady()
islandTracking.allNestedIslandsReady()

  So IslandTracking and these two calls are very important, because once they have completed the stream can run.
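  The while loop in materialize is essentially a small stack machine: one stack of pending traversals (needed because of Concat) and one stack of materialized values. To see the mechanics without the island and wiring details, here is a stripped-down sketch in plain Scala. The Step hierarchy is a toy invented for this post: Push stands in for MaterializeAtomic and just pushes a given value, and attributes, islands and port wiring are left out entirely.

```scala
import java.util.ArrayDeque

// Toy model only: mirrors the control flow of the materialize loop, nothing more.
object TraversalSketch {
  sealed trait Step
  final case class Push(value: Any) extends Step              // stand-in for MaterializeAtomic
  case object Pop extends Step
  final case class Transform(f: Any => Any) extends Step
  final case class Compose(f: (Any, Any) => Any) extends Step
  final case class Concat(first: Step, next: Step) extends Step
  case object Empty extends Step

  def run(root: Step): Any = {
    val traversalStack = new ArrayDeque[Step]()
    val matValueStack = new ArrayDeque[Any]()
    traversalStack.addLast(root)

    while (!traversalStack.isEmpty) {
      var current: Step = traversalStack.removeLast()
      while (current ne Empty) {
        var nextStep: Step = Empty
        current match {
          case Push(value) =>
            matValueStack.addLast(value)
          case Pop =>
            matValueStack.removeLast()
            ()
          case Transform(f) =>
            matValueStack.addLast(f(matValueStack.removeLast()))
          case Compose(f) =>
            // The most recently pushed value is the second argument.
            val second = matValueStack.removeLast()
            val first = matValueStack.removeLast()
            matValueStack.addLast(f(first, second))
          case Concat(first, next) =>
            // Remember the tail for later and continue with the head.
            if (next ne Empty) traversalStack.addLast(next)
            nextStep = first
          case Empty =>
        }
        current = nextStep
      }
    }
    // Whatever is left on top is the materialized value of the whole graph.
    matValueStack.peekLast()
  }

  def main(args: Array[String]): Unit = {
    val keepBoth = Concat(Push(1), Concat(Push("done"), Compose((a, b) => (a, b))))
    println(run(keepBoth)) // (1,done)
    val mapped = Concat(Push(20), Transform(v => v.asInstanceOf[Int] + 1))
    println(run(mapped))   // 21
  }
}
```

  The first example corresponds to combining two materialized values into a pair, the second to mapping over a single one.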

@InternalApi private[akka] class IslandTracking(
  val phases:       Map[IslandTag, Phase[Any]],
  val settings:     ActorMaterializerSettings,
  attributes:       Attributes,
  defaultPhase:     Phase[Any],
  val materializer: PhasedFusingActorMaterializer,
  islandNamePrefix: String) 

  IslandTracking ("island tracking") doesn't have a single line of comment. How am I supposed to analyze this?!

  I'm out of time, so that's it for today. Clearly the Akka Streams code is complex and reading the source is hard. I hope I can keep going. Haha.

  

posted @ 2018-08-23 18:21  gabry.wu