fxjwind Calcite Analysis - Volcano Model

Reference: https://matt33.com/2019/03/17/apache-calcite-planner/

Using the Volcano planner involves the following steps:

// 1. Initialize
VolcanoPlanner planner = new VolcanoPlanner();
// 2. Register trait definitions
planner.addRelTraitDef(ConventionTraitDef.INSTANCE);
planner.addRelTraitDef(RelDistributionTraitDef.INSTANCE);
// 3. Add rules: logical to logical
planner.addRule(FilterJoinRule.FilterIntoJoinRule.FILTER_ON_JOIN);
planner.addRule(ReduceExpressionsRule.PROJECT_INSTANCE);

// 4. Add ConverterRules: logical to physical
planner.addRule(EnumerableRules.ENUMERABLE_MERGE_JOIN_RULE);
planner.addRule(EnumerableRules.ENUMERABLE_SORT_RULE);
// 5. setRoot registers the RelNode tree
planner.setRoot(relNode);
// 6. Find the best plan
relNode = planner.findBestExp();

Steps 1 and 2: initialization

addRelTraitDef simply adds the traitDef to this list:

  /**
   * Holds the currently registered RelTraitDefs.
   */
  private final List<RelTraitDef> traitDefs = new ArrayList<>();

3. Adding a rule

  public boolean addRule(RelOptRule rule) {
    // add to ruleSet
    final boolean added = ruleSet.add(rule);
    mapRuleDescription(rule);

    // Each of this rule's operands is an 'entry point' for a rule call.
    // Register each operand against all concrete sub-classes that could match
    // it.
    for (RelOptRuleOperand operand : rule.getOperands()) {
      for (Class<? extends RelNode> subClass
          : subClasses(operand.getMatchedClass())) {
        classOperands.put(subClass, operand);
      }
    }

    // If this is a converter rule, check that it operates on one of the
    // kinds of trait we are interested in, and if so, register the rule
    // with the trait.
    if (rule instanceof ConverterRule) {
      ConverterRule converterRule = (ConverterRule) rule;
      final RelTrait ruleTrait = converterRule.getInTrait();
      final RelTraitDef ruleTraitDef = ruleTrait.getTraitDef();
      if (traitDefs.contains(ruleTraitDef)) {
        ruleTraitDef.registerConverterRule(this, converterRule);
      }
    }
    return true;
  }

a. Updating classOperands

classOperands records which RelNode classes can match which rules.

It is a multimap: one RelNode class can map to operands from several rules.

  /**
   * Operands that apply to a given class of {@link RelNode}.
   *
   * <p>Any operand can be an 'entry point' to a rule call, when a RelNode is
   * registered which matches the operand. This map allows us to narrow down
   * operands based on the class of the RelNode.</p>
   */
  private final Multimap<Class<? extends RelNode>, RelOptRuleOperand>
      classOperands = LinkedListMultimap.create();

First, take the rule's operands:

/**
 * Operand that determines whether a {@link RelOptRule}
 * can be applied to a particular expression.
 *
 * <p>For example, the rule to pull a filter up from the left side of a join
 * takes operands: <code>Join(Filter, Any)</code>.</p>
 *
 * <p>Note that <code>children</code> means different things if it is empty or
 * it is <code>null</code>: <code>Join(Filter <b>()</b>, Any)</code> means
 * that, to match the rule, <code>Filter</code> must have no operands.</p>
 */
public class RelOptRuleOperand {
  //~ Instance fields --------------------------------------------------------

  private RelOptRuleOperand parent;
  private RelOptRule rule;
  private final Predicate<RelNode> predicate;

  // REVIEW jvs 29-Aug-2004: some of these are Volcano-specific and should be
  // factored out
  public int[] solveOrder;
  public int ordinalInParent;
  public int ordinalInRule;
  public final RelTrait trait;
  private final Class<? extends RelNode> clazz;
  private final ImmutableList<RelOptRuleOperand> children;

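operands holds every operand of the rule in flattened form.

An operand describes which kind of expression a rule applies to; it looks like a wrapper around a RelNode class.

For example, for Join(Filter, Any), operands should contain Join, Filter, and Any.

operand.getMatchedClass returns the RelNode class that the operand matches, for example Filter.

The subClasses function is as follows: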

  /** Returns sub-classes of relational expression. */
  public Iterable<Class<? extends RelNode>> subClasses(
      final Class<? extends RelNode> clazz) {
    return Util.filter(classes, c -> {
      // RelSubset must be exact type, not subclass
      if (c == RelSubset.class) { // RelSubset requires an exact class match
        return c == clazz;
      }
      return clazz.isAssignableFrom(c); // c is clazz itself or one of its subclasses
    });
  }

classes records the class of every RelNode in the plan.

AbstractRelOptPlanner

private final Set<Class<? extends RelNode>> classes = new HashSet<>();

At initialization, the base classes are added first:

    // Add abstract RelNode classes. No RelNodes will ever be registered with
    // these types, but some operands may use them.
    classes.add(RelNode.class);
    classes.add(RelSubset.class);

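Further classes are added by registerClass, which registerImpl calls when it registers a new RelNode. subClasses finds which of the planner's currently known classes are relevant to the given operand.

In theory, if addRule is called before setRoot, classes holds only the base classes at this point, so few mappings are created here; registerClass (section 5.4) calls onNewClass for each newly seen class, which is where mappings for later classes can be added.

So the logic here is to find which rule operands can match the classes of the RelNodes in the plan.

Those pairs are added to classOperands. With this mapping, later traversal knows which rules might match a given RelNode, which narrows the search space.

Note that the map stores operands, not rules; the rule can be obtained from the operand.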
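The indexing idea can be sketched in plain Java without Calcite. This is only a toy model under stated assumptions: ClassOperandIndex, Node, Join, HashJoin, Filter, and Operand are hypothetical stand-ins, and the registerClass method models the assumption that a newly seen class is matched against the operands that already exist.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of classOperands. Every name here is hypothetical, not Calcite API. */
final class ClassOperandIndex {
  // Stand-ins for RelNode classes.
  static class Node {}
  static class Join extends Node {}
  static class HashJoin extends Join {}
  static class Filter extends Node {}

  /** Stand-in for RelOptRuleOperand: knows its rule and the class it matches. */
  static final class Operand {
    final String rule;
    final Class<? extends Node> matchedClass;

    Operand(String rule, Class<? extends Node> matchedClass) {
      this.rule = rule;
      this.matchedClass = matchedClass;
    }
  }

  private final Set<Class<? extends Node>> classes = new LinkedHashSet<>();
  private final List<Operand> operands = new ArrayList<>();
  private final Map<Class<? extends Node>, List<Operand>> classOperands =
      new LinkedHashMap<>();

  /** Like addRule: index the operand under every known class it can match. */
  void addOperand(Operand operand) {
    operands.add(operand);
    for (Class<? extends Node> c : classes) {
      link(c, operand);
    }
  }

  /** Like registerClass: match a newly seen class against existing operands. */
  void registerClass(Class<? extends Node> c) {
    if (classes.add(c)) {
      for (Operand operand : operands) {
        link(c, operand);
      }
    }
  }

  private void link(Class<? extends Node> c, Operand operand) {
    if (operand.matchedClass.isAssignableFrom(c)) {
      classOperands.computeIfAbsent(c, k -> new ArrayList<>()).add(operand);
    }
  }

  /** Operands that are possible entry points for a node of class c. */
  List<Operand> candidates(Class<? extends Node> c) {
    return classOperands.getOrDefault(c, new ArrayList<>());
  }

  public static void main(String[] args) {
    ClassOperandIndex index = new ClassOperandIndex();
    index.addOperand(new Operand("FilterIntoJoin", Join.class));
    index.addOperand(new Operand("FilterMerge", Filter.class));
    index.registerClass(HashJoin.class);
    for (Operand operand : index.candidates(HashJoin.class)) {
      System.out.println(operand.rule);
    }
  }
}
```

In this sketch only the Join operand is indexed under HashJoin, so a HashJoin node never has to be tested against the Filter rule.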

b. Registering a ConverterRule with its RelTraitDef

    // If this is a converter rule, check that it operates on one of the
    // kinds of trait we are interested in, and if so, register the rule
    // with the trait.
    if (rule instanceof ConverterRule) {
      ConverterRule converterRule = (ConverterRule) rule;

      final RelTrait ruleTrait = converterRule.getInTrait();
      final RelTraitDef ruleTraitDef = ruleTrait.getTraitDef();
      if (traitDefs.contains(ruleTraitDef)) {
        ruleTraitDef.registerConverterRule(this, converterRule);
      }
    }

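First, what is a ConverterRule?

My understanding: it converts from one trait to another while keeping the semantics unchanged, so inTrait and outTrait must share the same trait def.

For example, for sort order, a conversion can only go from one ordering to another; the choice of ordering does not affect semantics. The same holds for distribution.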

/**
 * Abstract base class for a rule which converts from one calling convention to
 * another without changing semantics.
 */
public abstract class ConverterRule
    extends RelRule<ConverterRule.Config> {
  //~ Instance fields --------------------------------------------------------

  private final RelTrait inTrait;
  private final RelTrait outTrait;
  protected final Convention out;

  //~ Constructors -----------------------------------------------------------

  /** Creates a <code>ConverterRule</code>. */
  protected ConverterRule(Config config) {
    super(config);
    this.inTrait = Objects.requireNonNull(config.inTrait());
    this.outTrait = Objects.requireNonNull(config.outTrait());

    // Source and target traits must have same type
    assert inTrait.getTraitDef() == outTrait.getTraitDef();

    // Most sub-classes are concerned with converting one convention to
    // another, and for them, the "out" field is a convenient short-cut.
    this.out = outTrait instanceof Convention ? (Convention) outTrait
        : null;
  }

Put plainly: if the def of the ConverterRule's inTrait is one of the traitDefs registered with the planner, the rule is registered with that trait def.

5. SetRoot

  public void setRoot(RelNode rel) {
    // We've registered all the rules, and therefore RelNode classes,
    // we're interested in, and have not yet started calling metadata providers.
    // So now is a good time to tell the metadata layer what to expect.
    registerMetadataRels();

    this.root = registerImpl(rel, null);
    if (this.originalRoot == null) {
      this.originalRoot = rel;
    }

    rootConvention = this.root.getConvention(); // the root's Convention trait
    ensureRootConverters();
  }

The core call is registerImpl:

  /**
   * Registers a new expression <code>exp</code> and queues up rule matches.
   * If <code>set</code> is not null, makes the expression part of that
   * equivalence set. If an identical expression is already registered, we
   * don't need to register this one and nor should we queue up rule matches.
   *
   * @param rel relational expression to register. Must be either a
   *         {@link RelSubset}, or an unregistered {@link RelNode}
   * @param set set that rel belongs to, or <code>null</code>
   * @return the equivalence-set
   */
  private RelSubset registerImpl(
      RelNode rel,
      RelSet set) {
    // set is null on the first call; recursive calls pass in the set that rel should belong to. If rel is an already-registered subset, the two sets must be merged.
    if (rel instanceof RelSubset) {
      return registerSubset(set, (RelSubset) rel);
    }

    // Ensure that its sub-expressions are registered.
    // 1. Recursively register the inputs
    rel = rel.onRegister(this);

    // 2. Record which rule call produced this RelNode
    if (ruleCallStack.isEmpty()) {
      provenanceMap.put(rel, Provenance.EMPTY);
    } else {
      final VolcanoRuleCall ruleCall = ruleCallStack.peek();
      provenanceMap.put(
          rel,
          new RuleProvenance(
              ruleCall.rule,
              ImmutableList.copyOf(ruleCall.rels),
              ruleCall.id));
    }

    // 3. Register the RelNode's class and traits
    registerClass(rel);

    registerCount++;

    // 4. Add the RelNode to the RelSet
    final int subsetBeforeCount = set.subsets.size();
    RelSubset subset = addRelToSet(rel, set);

    final RelNode xx = mapDigestToRel.put(key, rel);

    // 5. Update importance
    if (rel == this.root) {
      ruleQueue.subsetImportances.put(
          subset,
          1.0); // the root's importance is fixed at 1
    }

    // Record rel as a parent of each input's RelSet
    for (RelNode input : rel.getInputs()) {
      RelSubset childSubset = (RelSubset) input;
      childSubset.set.parents.add(rel);

      // The RelSubset structure changed, so recompute importance
      ruleQueue.recompute(childSubset);
    }
      ruleQueue.recompute(childSubset);
    }

    // 6. Fire rules
    fireRules(rel, true);

    // It's a new subset.
    if (set.subsets.size() > subsetBeforeCount) {
      fireRules(subset, true);
    }

    return subset;
  }

5.1 rel = rel.onRegister(this)

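The purpose of onRegister is to call registerImpl recursively on every node of the RelNode tree. Because the call is recursive, every RelNode is visited.

It takes the RelNode's inputs, working bottom-up; for a join, the inputs are the left and right children.

Then it calls ensureRegistered on each input.

onRegister returns the node itself (copied if necessary) with its inputs replaced by registered RelSubsets; registerImpl then returns the RelSubset for the root, whose subtree is fully registered.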

  public RelNode onRegister(RelOptPlanner planner) {
    List<RelNode> oldInputs = getInputs();
    List<RelNode> inputs = new ArrayList<>(oldInputs.size());
    for (final RelNode input : oldInputs) {
      RelNode e = planner.ensureRegistered(input, null);
      inputs.add(e);
    }
    RelNode r = this;
    if (!Util.equalShallow(oldInputs, inputs)) {
      r = copy(getTraitSet(), inputs);
    }
    r.recomputeDigest();
    return r;
  }

ensureRegistered

getSubset looks up the RelNode's subset in IdentityHashMap<RelNode, RelSubset> mapRel2Subset. Because this is an IdentityHashMap, keys are compared by object reference (==) rather than by equals().

  public RelSubset ensureRegistered(RelNode rel, RelNode equivRel) {
    RelSubset result;
    final RelSubset subset = getSubset(rel);
    if (subset != null) { // the node already has a subset, so it has been registered
      if (equivRel != null) {
        final RelSubset equivSubset = getSubset(equivRel); // look up equivRel's subset
        if (subset.set != equivSubset.set) {
          merge(equivSubset.set, subset.set); // merge the two equivalent RelSets
        }
      }
      result = canonize(subset); // follow the RelSet's equivalentSet chain to the leader and return its subset
    } else { // no subset yet, so the node is unregistered; register it
      result = register(rel, equivRel);
    }

    return result;
  }

  public RelSubset register(
      RelNode rel,
      RelNode equivRel) {
    final RelSet set;
    if (equivRel == null) {
      set = null;
    } else { // an equivalent rel was given
      equivRel = ensureRegistered(equivRel, null); // make sure equivRel itself is registered
      set = getSet(equivRel);
    }
    return registerImpl(rel, set); // register just calls registerImpl, passing the equivalent set
  }

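5.2 A series of checks

a. Provenance records where a RelNode came from. Its variants are UnknownProvenance (source unknown), DirectProvenance (copied directly from another node), and RuleProvenance (produced by firing a rule).

It records which rule call produced this RelNode.

ruleCallStack holds the rule calls currently being fired. If it is not empty, the newly created RelNode was produced by the rule call on top of the stack.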

    if (ruleCallStack.isEmpty()) {
      provenanceMap.put(rel, Provenance.EMPTY); // EMPTY, i.e. UnknownProvenance
    } else {
      final VolcanoRuleCall ruleCall = ruleCallStack.peek();
      provenanceMap.put(  // associate the RelNode with the rule call
          rel,
          new RuleProvenance(
              ruleCall.rule,
              ImmutableList.copyOf(ruleCall.rels),
              ruleCall.id));
    }

b. Checking whether an equivalent RelNode already exists

    // If it is equivalent to an existing expression, return the set that
    // the equivalent expression belongs to.
    RelDigest digest = rel.getRelDigest();
    RelNode equivExp = mapDigestToRel.get(digest); // is there a RelNode with the same digest?
    if (equivExp == null) {
      // do nothing
    } else if (equivExp == rel) { // same object: rel is already registered, so return its subset
      return getSubset(rel);
    } else {
      checkPruned(equivExp, rel); // different object: if equivExp has been pruned, rel must be pruned too
      RelSet equivSet = getSet(equivExp);
      if (equivSet != null) {
        return registerSubset(set, getSubset(equivExp)); // merge the two equivalent sets
      }
    }

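c. Converter handling

What is a converter?

It is a kind of RelNode that does not change semantics; it changes only a physical property, that is, a trait. For simplicity, a converter changes only one trait at a time.

The Javadoc says the planner puts all RelNodes that are logically equivalent but have different physical traits into one RelSet.

That is why RelSets are merged so often above and below: anything logically equivalent can live in the same RelSet.

My understanding is that a ConverterRule (above) usually produces a Converter that performs the actual trait conversion.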

/**
 * A relational expression implements the interface <code>Converter</code> to
 * indicate that it converts a physical attribute, or
 * {@link org.apache.calcite.plan.RelTrait trait}, of a relational expression
 * from one value to another.
 *
 * <p>Sometimes this conversion is expensive; for example, to convert a
 * non-distinct to a distinct object stream, we have to clone every object in
 * the input.</p>
 *
 * <p>A converter does not change the logical expression being evaluated; after
 * conversion, the number of rows and the values of those rows will still be the
 * same. By declaring itself to be a converter, a relational expression is
 * telling the planner about this equivalence, and the planner groups
 * expressions which are logically equivalent but have different physical traits
 * into groups called <code>RelSet</code>s.
 *
 * <p>In principle one could devise converters which change multiple traits
 * simultaneously (say change the sort-order and the physical location of a
 * relational expression). In which case, the method {@link #getInputTraits()}
 * would return a {@link org.apache.calcite.plan.RelTraitSet}. But for
 * simplicity, this class only allows one trait to be converted at a
 * time; all other traits are assumed to be preserved.</p>
 */
public interface Converter extends RelNode

    // Converters are in the same set as their children.
    if (rel instanceof Converter) {
      final RelNode input = ((Converter) rel).getInput();
      final RelSet childSet = getSet(input);
      if ((set != null)
          && (set != childSet)
          && (set.equivalentSet == null)) {
        merge(set, childSet); // a converter does not change the data, so the input's set is merged with the current one; they are logically equivalent

        // During the mergers, the child set may have changed, and since
        // we're not registered yet, we won't have been informed. So
        // check whether we are now equivalent to an existing
        // expression.
        if (fixUpInputs(rel)) {
          digest = rel.getRelDigest();
          RelNode equivRel = mapDigestToRel.get(digest);
          if ((equivRel != rel) && (equivRel != null)) {

            // make sure this bad rel didn't get into the
            // set in any way (fixupInputs will do this but it
            // doesn't know if it should so it does it anyway)
            set.obliterateRelNode(rel); //

            // There is already an equivalent expression. Use that
            // one, and forget about this one.
            return getSubset(equivRel); //
          }
        }
      } else {
        set = childSet; // e.g. when set is null, just use the input's set, since they are logically equivalent
      }
    }

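5.3 Creating the RelSet

If set is still null at this point, this equivalence set is appearing for the first time.

A new RelSet is created and added to allSets. allSets is a List that is used only for debugging.

It is not clear to me why equivalentSet exists. If equivalent RelSets can simply be merged, what is the point of keeping a chain of sets? The code comment below hints at one reason: several sets may be merging at the same time, and the chain lets a holder of a merged-away set find the live one.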

    // Place the expression in the appropriate equivalence set.
    if (set == null) {
      set = new RelSet( //
          nextSetId++,
          Util.minus(
              RelOptUtil.getVariablesSet(rel),
              rel.getVariablesSet()),
          RelOptUtil.getVariablesUsed(rel));
      this.allSets.add(set); //
    }

    // Chain to find 'live' equivalent set, just in case several sets are
    // merging at the same time.
    while (set.equivalentSet != null) { // follow the equivalentSet chain to the leader
      set = set.equivalentSet;
    }
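The forwarding chain can be modeled in a few lines. The sketch below is a simplified illustration, not Calcite's merge logic: ToySet, merge, and canonical are hypothetical names, and merging is reduced to leaving a pointer from the absorbed set to the surviving one.

```java
/** Toy model of RelSet.equivalentSet forwarding. All names are hypothetical. */
final class SetChainDemo {
  static final class ToySet {
    final int id;
    ToySet equivalentSet; // non-null once this set has been merged into another

    ToySet(int id) {
      this.id = id;
    }
  }

  /** Merges 'from' into 'into' by leaving a forwarding pointer behind. */
  static void merge(ToySet into, ToySet from) {
    ToySet survivor = canonical(into);
    ToySet absorbed = canonical(from);
    if (survivor != absorbed) {
      absorbed.equivalentSet = survivor;
    }
  }

  /** Follows forwarding pointers until the live set is reached. */
  static ToySet canonical(ToySet set) {
    while (set.equivalentSet != null) {
      set = set.equivalentSet;
    }
    return set;
  }

  public static void main(String[] args) {
    ToySet s1 = new ToySet(1);
    ToySet s2 = new ToySet(2);
    ToySet s3 = new ToySet(3);
    merge(s2, s3); // s3 now forwards to s2
    merge(s1, s2); // s2 now forwards to s1
    // A caller still holding the stale s3 reference reaches the live set.
    System.out.println(canonical(s3).id);
  }
}
```

Code that captured a set reference before a merge can still resolve it afterwards, which is one plausible reason for keeping the chain.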

5.4 registerClass

Registers the RelNode's class and traits in the corresponding structures, recording which kinds of RelNode and traits the planner contains.

private final Set<Class<? extends RelNode>> classes = new HashSet<>();
//private final Set<RelTrait> traits = new HashSet<>();

private final Set<Convention> conventions = new HashSet<>();

Older versions registered RelTrait here; newer versions register only Convention, which is just one kind of trait.

  public void registerClass(RelNode node) {
    final Class<? extends RelNode> clazz = node.getClass();
    if (classes.add(clazz)) {
      onNewClass(node);
    }
    if (conventions.add(node.getConvention())) {
      node.getConvention().register(this);
    }
  }

5.5 addRelToSet

RelSubset subset = addRelToSet(rel, set);

It first calls add, which puts the RelNode into an existing subset or creates a new one.

  private RelSubset addRelToSet(RelNode rel, RelSet set) {
    RelSubset subset = set.add(rel);
    mapRel2Subset.put(rel, subset);

    // While a tree of RelNodes is being registered, sometimes nodes' costs
    // improve and the subset doesn't hear about it. You can end up with
    // a subset with a single rel of cost 99 which thinks its best cost is
    // 100. We think this happens because the back-links to parents are
    // not established. So, give the subset another chance to figure out
    // its cost.
    final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
    subset.propagateCostImprovements(this, mq, rel, new HashSet<>());

    return subset;
  }

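Its main job is to record the RelSubset that the RelNode belongs to.

Note that this is an IdentityHashMap, so keys are compared by RelNode reference rather than by equals()/hashCode(). Distinct RelNode objects each get their own entry.

A RelNode object always belongs to exactly one RelSet, but different RelSets may contain RelNode objects that look alike; for example, each may contain a join.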

  /**
   * Map each registered expression ({@link RelNode}) to its equivalence set
   * ({@link RelSubset}).
   *
   * <p>We use an {@link IdentityHashMap} to simplify the process of merging
   * {@link RelSet} objects. Most {@link RelNode} objects are identified by
   * their digest, which involves the set that their child relational
   * expressions belong to. If those children belong to the same set, we have
   * to be careful, otherwise it gets incestuous.</p>
   */
  private final IdentityHashMap<RelNode, RelSubset> mapRel2Subset =
      new IdentityHashMap<>();
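The difference between the two map types is easy to see in isolation. The following sketch uses a hypothetical ToyRel class whose equals and hashCode are based on a digest string; that choice is an assumption made for the demonstration and says nothing about how RelNode implements equality.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

/** Contrasts HashMap and IdentityHashMap. ToyRel is a hypothetical stand-in. */
final class IdentityMapDemo {
  /** Toy node whose equals/hashCode depend only on its digest string. */
  static final class ToyRel {
    final String digest;

    ToyRel(String digest) {
      this.digest = digest;
    }

    @Override public boolean equals(Object o) {
      return o instanceof ToyRel && ((ToyRel) o).digest.equals(digest);
    }

    @Override public int hashCode() {
      return digest.hashCode();
    }
  }

  /** Puts two distinct but equal nodes into the map and reports its size. */
  static int sizeAfterTwoPuts(Map<ToyRel, String> map) {
    map.put(new ToyRel("Join(a, b)"), "subset#1");
    map.put(new ToyRel("Join(a, b)"), "subset#2");
    return map.size();
  }

  public static void main(String[] args) {
    // HashMap treats the two nodes as one key because they are equal.
    System.out.println(sizeAfterTwoPuts(new HashMap<>()));
    // IdentityHashMap keeps both because they are different objects.
    System.out.println(sizeAfterTwoPuts(new IdentityHashMap<>()));
  }
}
```

With a HashMap the second put overwrites the first entry; with an IdentityHashMap each object keeps its own mapping.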

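propagateCostImprovements: because the RelSet has changed, there may be a new best cost. The change is passed on to other nodes so that they can update their costs and check whether a new best cost results.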

  void propagateCostImprovements(VolcanoPlanner planner, RelMetadataQuery mq,
      RelNode rel, Set<RelSubset> activeSet) {
    Queue<Pair<RelSubset, RelNode>> propagationQueue = new ArrayDeque<>();
    for (RelSubset subset : set.subsets) {
      if (rel.getTraitSet().satisfies(subset.traitSet)) { // enqueue every subset of this set whose traits rel satisfies
        propagationQueue.offer(Pair.of(subset, rel));
      }
    }

    while (!propagationQueue.isEmpty()) {
      Pair<RelSubset, RelNode> p = propagationQueue.poll();
      p.left.propagateCostImprovements0(planner, mq, p.right, activeSet, propagationQueue);
    }
  }

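As long as the queue is not empty, propagateCostImprovements0 keeps being called. activeSet is passed in to avoid cycles. propagationQueue is passed in because improvements must propagate upward: parent subsets whose traits are satisfied are added to the queue as well.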

   void propagateCostImprovements0(VolcanoPlanner planner, RelMetadataQuery mq,
      RelNode rel, Set<RelSubset> activeSet,
      Queue<Pair<RelSubset, RelNode>> propagationQueue) {

    if (!activeSet.add(this)) { // already in activeSet: a repeat means a cycle
      LOGGER.trace("cyclic: {}", this);
      return;
    }
    try {
      RelOptCost cost = planner.getCost(rel, mq); // compute the RelNode's cost via metadata

      // Update subset best cost when we find a cheaper rel or the current
      // best's cost is changed
      if (cost.isLt(bestCost)) { // found a new best cost
        bestCost = cost;
        best = rel;
        upperBound = bestCost;
        // since best was changed, cached metadata for this subset should be removed
        mq.clearCache(this);

        // Propagate cost change to parents
        for (RelNode parent : getParents()) { // propagate upward
          // removes parent cached metadata since its input was changed
          mq.clearCache(parent);
          final RelSubset parentSubset = planner.getSubset(parent);

          // parent subset will clear its cache in propagateCostImprovements0 method itself
          for (RelSubset subset : parentSubset.set.subsets) {
            if (parent.getTraitSet().satisfies(subset.traitSet)) { // enqueue parent subsets whose traits the parent satisfies
              propagationQueue.offer(Pair.of(subset, parent));
            }
          }
        }
      }
    } finally {
      activeSet.remove(this);
    }
  }
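The queue-driven upward propagation can be illustrated with a much smaller model. In the sketch below, Group and Node are hypothetical stand-ins for RelSubset and RelNode, cost is a plain double, and the trait checks, metadata cache, and activeSet cycle guard of the real method are deliberately left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

/** Toy model of upward cost propagation. All names are hypothetical. */
final class CostPropagationDemo {
  /** Stand-in for a RelSubset: remembers the cheapest cost seen so far. */
  static final class Group {
    double bestCost = Double.POSITIVE_INFINITY;
    final List<Node> parents = new ArrayList<>(); // nodes that read from this group
  }

  /** Stand-in for a RelNode: its own cost plus the best cost of each input group. */
  static final class Node {
    final Group owner;
    final double selfCost;
    final List<Group> inputs;

    Node(Group owner, double selfCost, Group... inputs) {
      this.owner = owner;
      this.selfCost = selfCost;
      this.inputs = Arrays.asList(inputs);
      for (Group input : inputs) {
        input.parents.add(this);
      }
    }

    double cost() {
      double total = selfCost;
      for (Group input : inputs) {
        total += input.bestCost;
      }
      return total;
    }
  }

  /** Re-costs 'start'; each improvement re-queues the parents of the improved group. */
  static void propagate(Node start) {
    Queue<Node> queue = new ArrayDeque<>();
    queue.offer(start);
    while (!queue.isEmpty()) {
      Node node = queue.poll();
      double cost = node.cost();
      if (cost < node.owner.bestCost) {
        node.owner.bestCost = cost;
        for (Node parent : node.owner.parents) {
          queue.offer(parent);
        }
      }
    }
  }

  public static void main(String[] args) {
    Group scanGroup = new Group();
    Group filterGroup = new Group();
    Node fullScan = new Node(scanGroup, 10.0);
    Node filter = new Node(filterGroup, 1.0, scanGroup);
    propagate(fullScan);
    System.out.println(filterGroup.bestCost);
    Node indexScan = new Node(scanGroup, 4.0); // a cheaper alternative appears later
    propagate(indexScan);
    System.out.println(filterGroup.bestCost);
  }
}
```

When the cheaper scan is registered, the filter's group is re-costed without anyone touching the filter node directly, which is the effect the back-links to parents are meant to provide.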

Register the digest; if it is already registered, return immediately.

    final RelNode xx = mapDigestToRel.putIfAbsent(digest, rel);

    // This relational expression may have been registered while we
    // recursively registered its children. If this is the case, we're done.
    if (xx != null) {
      return subset;
    }
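The digest check is essentially a first-writer-wins registry. A minimal sketch, assuming a hypothetical ToyRel with a string digest and a plain HashMap in place of mapDigestToRel:

```java
import java.util.HashMap;
import java.util.Map;

/** Toy digest registry. ToyRel and register are hypothetical names. */
final class DigestRegistryDemo {
  static final class ToyRel {
    final String digest;

    ToyRel(String digest) {
      this.digest = digest;
    }
  }

  private final Map<String, ToyRel> mapDigestToRel = new HashMap<>();

  /** Returns the canonical node for rel's digest, registering rel if it is the first. */
  ToyRel register(ToyRel rel) {
    ToyRel existing = mapDigestToRel.putIfAbsent(rel.digest, rel);
    return existing == null ? rel : existing;
  }

  public static void main(String[] args) {
    DigestRegistryDemo registry = new DigestRegistryDemo();
    ToyRel first = new ToyRel("Filter(x > 1, Scan(t))");
    ToyRel duplicate = new ToyRel("Filter(x > 1, Scan(t))");
    System.out.println(registry.register(first) == first);
    System.out.println(registry.register(duplicate) == first);
  }
}
```

putIfAbsent returns null only for the first node with a given digest, so a non-null result signals that an equivalent expression was registered earlier.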

===============OLD========================================

5.5 importance

importance expresses the priority of a RelSubset: the higher the priority, the earlier it is optimized.

RuleQueue stores the importance of each subset in this structure:

  /**
   * The importance of each subset.
   */
  final Map<RelSubset, Double> subsetImportances = new HashMap<>();

importance is computed as follows:

Computes the importance of a node. Importance is defined as follows:

the root RelSubset has an importance of 1

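It is actually simple.

Suppose the root has cost 3, its two children have costs 2 and 5, and the root's importance is 1.

The total cost is 3 + 2 + 5 = 10, and each child receives the root's importance scaled by its share of that total, so the two children have importance 0.2 and 0.5.

So the closer a node is to the top, the higher its importance, and the more a node costs, the higher its importance.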
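The arithmetic of that example can be written down directly. The formula in this sketch is my reading of the worked example (child importance = parent importance × child cost / total cost of parent and children) and should be treated as an assumption; ImportanceDemo and childImportances are hypothetical names.

```java
/** Worked importance example. The formula is an assumption inferred from the text. */
final class ImportanceDemo {
  /**
   * Pro-rates the parent's importance over its children:
   * childImportance = parentImportance * childCost / (parentSelfCost + sum of child costs).
   */
  static double[] childImportances(
      double parentImportance, double parentSelfCost, double[] childCosts) {
    double total = parentSelfCost;
    for (double cost : childCosts) {
      total += cost;
    }
    double[] result = new double[childCosts.length];
    for (int i = 0; i < childCosts.length; i++) {
      result[i] = parentImportance * childCosts[i] / total;
    }
    return result;
  }

  public static void main(String[] args) {
    // Root: importance 1, own cost 3; children cost 2 and 5.
    double[] importances = childImportances(1.0, 3.0, new double[] {2.0, 5.0});
    for (double importance : importances) {
      System.out.println(importance);
    }
  }
}
```

The children's shares add up to 0.7; the remaining 0.3 corresponds to the root's own cost of 3 out of 10.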

==============================================================

5.6 fireRules

fireRules here generally uses DeferringRuleCall, so rules are not executed immediately, because that would be inefficient; they run only when they are actually needed.

  void fireRules(RelNode rel) {
    for (RelOptRuleOperand operand : classOperands.get(rel.getClass())) {
      if (operand.matches(rel)) {
        final VolcanoRuleCall ruleCall;
        ruleCall = new DeferringRuleCall(this, operand);
        ruleCall.match(rel);
      }
    }
  }

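classOperands stores, for each RelNode class, all the rule operands that it matches.

A hit in classOperands only means that this operand matches the RelNode; whether the RelNode's subtree matches the whole rule needs further checking.

As the code shows, match works recursively. The matching logic is long, so it is not covered here.

Each successful match increments solve; when solve == operands.size(), the whole rule has been matched.

Then onMatch is called.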

  /**
   * Applies this rule, with a given relational expression in the first slot.
   */
  void match(RelNode rel) {
    assert getOperand0().matches(rel) : "precondition";
    final int solve = 0;
    int operandOrdinal = getOperand0().solveOrder[solve];
    this.rels[operandOrdinal] = rel;
    matchRecurse(solve + 1);
  }

  /**
   * Recursively matches operands above a given solve order.
   *
   * @param solve Solve order of operand (&gt; 0 and &le; the operand count)
   */
  private void matchRecurse(int solve) {
    assert solve > 0;
    assert solve <= rule.operands.size();
    final List<RelOptRuleOperand> operands = getRule().operands;
    if (solve == operands.size()) {
      // We have matched all operands. Now ask the rule whether it
      // matches; this gives the rule chance to apply side-conditions.
      // If the side-conditions are satisfied, we have a match.
      if (getRule().matches(this)) {
        onMatch();
      }
    } else {......}}

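For DeferringRuleCall,

onMatch just wraps the match in a VolcanoRuleMatch and puts it into the RuleQueue.

It does not actually run the rule's onMatch; that is what deferring means.

In fact, RuleQueue, RuleMatch, and importance all exist to support deferring. If rules were fired directly, the mechanism would be much simpler.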

    /**
     * Rather than invoking the rule (as the base method does), creates a
     * {@link VolcanoRuleMatch} which can be invoked later.
     */
    protected void onMatch() {
      final VolcanoRuleMatch match =
          new VolcanoRuleMatch(
              volcanoPlanner,
              getOperand0(),
              rels,
              nodeInputs);
      volcanoPlanner.ruleDriver.getRuleQueue().addMatch(match);
    }
  }
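Deferring boils down to queuing a piece of work instead of running it. A minimal sketch of that pattern, using Runnable and a Deque in place of VolcanoRuleMatch and RuleQueue (DeferredMatchDemo and its methods are hypothetical names):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model of deferred rule firing. All names are hypothetical. */
final class DeferredMatchDemo {
  final Deque<Runnable> ruleQueue = new ArrayDeque<>();
  final List<String> log = new ArrayList<>();

  /** Immediate style: the rule body runs as soon as the match is found. */
  void onMatchImmediate(String rule) {
    log.add("fired " + rule);
  }

  /** Deferred style: only record the match; the body runs when the driver decides. */
  void onMatchDeferred(String rule) {
    log.add("queued " + rule);
    ruleQueue.addLast(() -> log.add("fired " + rule));
  }

  /** Driver loop: drain the queue in arrival order. */
  void drive() {
    while (!ruleQueue.isEmpty()) {
      ruleQueue.pollFirst().run();
    }
  }

  public static void main(String[] args) {
    DeferredMatchDemo demo = new DeferredMatchDemo();
    demo.onMatchDeferred("FilterIntoJoin");
    demo.onMatchDeferred("JoinCommute");
    demo.drive();
    System.out.println(demo.log);
  }
}
```

Because firing is separated from matching, the driver is free to reorder, filter, or drop queued matches before running them, which a direct call cannot do.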

6. findBestExp

The new version of findBestExp:

  /**
   * Finds the most efficient expression to implement the query given via
   * {@link org.apache.calcite.plan.RelOptPlanner#setRoot(org.apache.calcite.rel.RelNode)}.
   *
   * @return the most efficient RelNode tree found for implementing the given
   * query
   */
  public RelNode findBestExp() {
    ensureRootConverters();
    registerMaterializations();

    ruleDriver.drive(); //

    RelNode cheapest = root.buildCheapestPlan(this); //

    return cheapest;
  }

6.1 RuleDriver

RuleDriver is an abstraction that makes it easy to implement different search algorithms.

Its core is the RuleQueue and drive.

/**
 * A rule driver applies rules with designed algorithms.
 */
interface RuleDriver {

  /**
   * Gets the rule queue.
   */
  RuleQueue getRuleQueue();

  /**
   * Apply rules.
   */
  void drive();

  /**
   * Callback when new RelNodes are added into RelSet.
   * @param rel the new RelNode
   * @param subset subset to add
   */
  void onProduce(RelNode rel, RelSubset subset);

  /**
   * Callback when RelSets are merged.
   * @param set the merged result set
   */
  void onSetMerged(RelSet set);

  /**
   * Clear this RuleDriver.
   */
  void clear();
}

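Calcite ships two implementations:

IterativeRuleDriver, which simply applies matches over and over;

TopDownRuleDriver, which traverses the RelNodes top-down and fires the corresponding matches in the queue.

Here is the drive implementation from TopDownRuleDriver.

It pushes tasks onto tasks and then calls task.perform on each.

tasks is a stack, so a task pushed earlier runs later. Keep this in mind!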

private Stack<Task> tasks = new Stack<>();

By default an OptimizeGroup task is pushed first.

  @Override public void drive() {
    TaskDescriptor description = new TaskDescriptor();

    // Starting from the root's OptimizeGroup task.
    tasks.push(new OptimizeGroup(planner.root, planner.infCost));

    // ensure materialized view roots get explored.
    // Note that implementation rules or enforcement rules are not applied
    // unless the mv is matched
    exploreMaterializationRoots();

    try {
      // Iterates until the root is fully optimized
      while (!tasks.isEmpty()) {
        Task task = tasks.pop();
        description.log(task);
        task.perform(); //
      }
    } catch (VolcanoTimeoutException ex) {
      LOGGER.warn("Volcano planning times out, cancels the subsequent optimization.");
    }
  }

exploreMaterializationRoots pushes OptimizeMExpr tasks.

planner.explorationRoots holds extra roots for exploration.

  private void exploreMaterializationRoots() {
    for (RelSubset extraRoot : planner.explorationRoots) {
      RelSet rootSet = VolcanoPlanner.equivRoot(extraRoot.set);
      if (rootSet == planner.root.set) {
        continue;
      }
      for (RelNode rel : extraRoot.set.rels) {
        if (planner.isLogical(rel)) {
          tasks.push(new OptimizeMExpr(rel, extraRoot, true));
        }
      }
    }
  }

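These OptimizeMExpr tasks run before OptimizeGroup, since they are pushed later. The implementation of that task is covered further down.

First, the OptimizeGroup task. Task itself is an interface; the key method is perform.

OptimizeGroup.perform mainly creates three kinds of task: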

  /**
   * Optimize a RelSubset.
   * It schedule optimization tasks for RelNodes in the RelSet.
   */
  private class OptimizeGroup implements Task {
    private final RelSubset group;
    private RelOptCost upperBound;

    @Override public void perform() {
      RelOptCost winner = group.getWinnerCost(); // checks taskState; returns the best cost if the group is completed
      if (winner != null) {
        return;
      }

      if (group.taskState != null && upperBound.isLe(group.upperBound)) {
        // either this group failed to optimize before or it is a ring (already being optimized?)
        return;
      }

      group.startOptimize(upperBound); // set the state to optimizing

      // cannot decide an actual lower bound before MExpr are fully explored
      // so delay the lower bound checking

      // a gate keeper to update context
      tasks.push(new GroupOptimized(group)); // GroupOptimized marks the RelSubset as optimized; pushed first, so it runs last

      // optimize mExprs in group
      List<RelNode> physicals = new ArrayList<>();
      for (RelNode rel : group.set.rels) {
        if (planner.isLogical(rel)) {
          tasks.push(new OptimizeMExpr(rel, group, false)); // logical node: push an OptimizeMExpr task
        } else if (rel.isEnforcer()) {
          // Enforcers have lower priority than other physical nodes
          physicals.add(0, rel);
        } else {
          physicals.add(rel);
        }
      }
      // always apply O_INPUTS first so as to get an valid upper bound
      for (RelNode rel : physicals) {
        Task task = getOptimizeInputTask(rel, group); // physical node: create an OptimizeInput task
        if (task != null) {
          tasks.add(task);
        }
      }
    }
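Because tasks is a stack, the order in which OptimizeGroup pushes its tasks is the reverse of the order in which they run. The sketch below replays that with strings instead of real Task objects; TaskStackDemo and executionOrder are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Stack;

/** Toy model of the LIFO task stack. Task names mirror the text; the class is hypothetical. */
final class TaskStackDemo {
  /** Pushes tasks in the order OptimizeGroup does and returns the order they run in. */
  static List<String> executionOrder() {
    Stack<String> tasks = new Stack<>();
    tasks.push("GroupOptimized"); // gate keeper, pushed first
    tasks.push("OptimizeMExpr");  // logical nodes
    tasks.push("OptimizeInput");  // physical nodes, pushed last
    List<String> order = new ArrayList<>();
    while (!tasks.isEmpty()) {
      order.add(tasks.pop());
    }
    return order;
  }

  public static void main(String[] args) {
    // Prints [OptimizeInput, OptimizeMExpr, GroupOptimized]
    System.out.println(executionOrder());
  }
}
```

The gate-keeper task runs only after everything pushed above it has been popped, which is why it can safely mark the group as optimized.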

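Now look at each of these three kinds of task.

The tasks from getOptimizeInputTask run first; that method decides how to optimize a physical node.

It in turn returns one of three sub-tasks: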

  // Decide how to optimize a physical node.
  private Task getOptimizeInputTask(RelNode rel, RelSubset group) {
    boolean unProcess = false;
    for (RelNode input : rel.getInputs()) { // check whether any input has not been optimized yet
      RelOptCost winner = ((RelSubset) input).getWinnerCost(); // determined from each input's state
      if (winner == null) {
        unProcess = true;
        break;
      }
    }
    // If the inputs are all processed, only DeriveTrait is required.
    if (!unProcess) {
      return new DeriveTrait(rel, group); // all inputs are optimized: create a DeriveTrait task
    }
    // If part of the inputs are not optimized, schedule for the node an OptimizeInput task,
    // which tried to optimize the inputs first and derive traits for further execution.
    if (rel.getInputs().size() == 1) {
      return new OptimizeInput1(rel, group); // the node has exactly one input
    }
    return new OptimizeInputs(rel, group); // the node has multiple inputs
  }

OptimizeInputs is not expanded here. It mainly pushes the following two tasks, which recurse back into OptimizeGroup:

tasks.push(new CheckInput(null, mExpr, input, 0, upperForInput));
tasks.push(new OptimizeGroup(input, upperForInput));

Next to run is OptimizeMExpr, which is probably the main task:

  /**
   * Optimize a logical node, including exploring its input and applying rules for it.
   */
  private class OptimizeMExpr implements Task {
    private final RelNode mExpr;
    private final RelSubset group;

    // when true, only apply transformation rules for mExpr
    private final boolean explore;

    OptimizeMExpr(RelNode mExpr,
        RelSubset group, boolean explore) {
      this.mExpr = mExpr;
      this.group = group;
      this.explore = explore;
    }

    @Override public void perform() {
      if (explore && group.isExplored()) {
        return;
      }
      // 1. explore input
      // 2. apply other rules
      tasks.push(new ApplyRules(mExpr, group, explore));
      for (int i = mExpr.getInputs().size() - 1; i >= 0; --i) {
        tasks.push(new ExploreInput(mExpr, i));
      }
    }

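It mainly pushes two kinds of task.

The first is ExploreInput, one per input.

Its main logic: push OptimizeMExpr for the logical nodes of the input's group, then mark the group as explored.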

  /**
   * Explore an input for a RelNode.
   */
  private class ExploreInput implements Task {
    private final RelSubset group;
    private final RelNode parent;
    private final int inputOrdinal;

    @Override public void perform() {
      if (!group.explore()) { // already explored: nothing to do
        return;
      }
      tasks.push(new EnsureGroupExplored(group, parent, inputOrdinal)); // marks the group explored
      for (RelNode rel : group.set.rels) {
        if (planner.isLogical(rel)) {
          tasks.push(new OptimizeMExpr(rel, group, true));
        }
      }
    }

Then ApplyRules:

 /**
   * Extract rule matches from rule queue and add them to task stack.
   */
  private class ApplyRules implements Task {
    private final RelNode mExpr;
    private final RelSubset group;
    private final boolean exploring;

    @Override public void perform() {
      Pair<RelNode, Predicate<VolcanoRuleMatch>> category =
          exploring ? Pair.of(mExpr, planner::isTransformationRule)
              : Pair.of(mExpr, m -> true); // Predicate is a one-argument boolean function; when exploring, only transformation-rule matches pass, otherwise every match passes
      VolcanoRuleMatch match = ruleQueue.popMatch(category); //
      while (match != null) {
        tasks.push(new ApplyRule(match, group, exploring)); //
        match = ruleQueue.popMatch(category); //
      }
    }

The logic is to keep calling popMatch on ruleQueue and pushing an ApplyRule task for each match until nothing more can be popped.

The popMatch implementation:

  public VolcanoRuleMatch popMatch(Pair<RelNode, Predicate<VolcanoRuleMatch>> category) {
    List<VolcanoRuleMatch> queue = matches.get(category.left); // matches is a Map<RelNode, List<VolcanoRuleMatch>>; fetch all matches for this RelNode
    if (queue == null) {
      return null;
    }
    Iterator<VolcanoRuleMatch> iterator = queue.iterator();
    while (iterator.hasNext()) {
      VolcanoRuleMatch next = iterator.next();
      if (category.right != null && !category.right.test(next)) { // skip matches that fail the predicate
        continue;
      }
      iterator.remove(); // take the match out of the queue
      if (!skipMatch(next)) { // return it unless the match has been pruned
        return next;
      }
    }
    return null;
  }
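
The filtered pop above can be reproduced on a plain list. The following is only a minimal sketch, not Calcite code: the class `MatchQueueSketch`, the method `popFirst`, and the `pruned` predicate (standing in for `skipMatch`) are hypothetical names made up for illustration. It shows one detail that is easy to miss: an item that fails the predicate stays in the queue, while an item that passes the predicate but is pruned is removed and dropped.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Toy model of a filtered pop; all names here are hypothetical, not Calcite API. */
final class MatchQueueSketch {
  private MatchQueueSketch() {}

  /**
   * Removes and returns the first element that passes {@code filter} and is
   * not pruned. Elements failing the filter stay in the queue; elements that
   * pass the filter but are pruned are removed and discarded.
   */
  static <T> T popFirst(List<T> queue, Predicate<T> filter, Predicate<T> pruned) {
    Iterator<T> it = queue.iterator();
    while (it.hasNext()) {
      T next = it.next();
      if (filter != null && !filter.test(next)) {
        continue; // left in place for a later, less restrictive pop
      }
      it.remove();
      if (!pruned.test(next)) {
        return next;
      }
    }
    return null; // nothing left that passes the filter
  }
}
```

With a queue of `impl:A`, `xform:B`, `xform:C`, a filter that accepts only `xform` items, and `xform:B` treated as pruned, the call returns `xform:C` and leaves only `impl:A` in the queue.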

The ApplyRule task:

  /**
   * Apply a rule match.
   */
  private class ApplyRule implements GeneratorTask {
    private final VolcanoRuleMatch match;
    private final RelSubset group;
    private final boolean exploring;

    @Override public void perform() {
      applyGenerator(this, match::onMatch);
    }

  private void applyGenerator(GeneratorTask task, Procedure proc) {
    GeneratorTask applying = this.applying;  // save the current applying task
    this.applying = task; // make this task the one being applied
    try {
      proc.exec(); // run proc
    } finally {
      this.applying = applying; // restore the previous applying task
    }
  }

===============================findBestExp in the old version============================================

  public RelNode findBestExp() {

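    int cumulativeTicks = 0; // total step count; one tick is one optimization step, i.e. firing one RuleMatch
    // In practice only one pass of this for loop does real work: only the OPTIMIZE phase has RuleMatches queued, the others are empty.
    // In RuleQueue.addMatch, a phase whose phaseRuleSet != ALL_RULES filters out the rules that are not in its set.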
    for (VolcanoPlannerPhase phase : VolcanoPlannerPhase.values()) {
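      setInitialImportance(); // initialize importance

      RelOptCost targetCost = costFactory.makeHugeCost(); // target cost, initially huge
      int tick = 0; // equal to cumulativeTicks when only one pass of the for loop does work
      int firstFiniteTick = -1; // number of ticks it took to find the first implementable plan
      int giveUpTick = Integer.MAX_VALUE; // tick count at which to give up optimizing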

      while (true) {
        ++tick;  // start one optimization step
        ++cumulativeTicks;
        if (root.bestCost.isLe(targetCost)) { // bestCost <= targetCost: an implementable plan has been found
          if (firstFiniteTick < 0) { // this is the first time a plan was found
            firstFiniteTick = cumulativeTicks; // record firstFiniteTick

            clearImportanceBoost(); // clear the importance boost; RelSubset has a boolean field, boosted, that records whether it was boosted
          }
          if (ambitious) {
            // try to find an even better plan
            targetCost = root.bestCost.multiplyBy(0.9); // lower targetCost to 90% of the current best cost

            // if impatient, giveUpTick must be set
            // giveUpTick starts at MAX_VALUE and only gets a real value once a plan has been found
            if (impatient) {
              if (firstFiniteTick < 10) { // the first plan was found in fewer than 10 ticks
                // give up if no better plan is found within the next 25 ticks
                giveUpTick = cumulativeTicks + 25;
              } else {
                // the plan is more complex, so allow more ticks
                giveUpTick =
                    cumulativeTicks
                        + Math.max(firstFiniteTick / 10, 25);
              }
            }
          } else {
            break; // not ambitious: any usable plan will do, so stop
          }
        } else if (cumulativeTicks > giveUpTick) { // give up optimizing
          // We haven't made progress recently. Take the current best.
          break;
        } else if (root.bestCost.isInfinite() && ((tick % 10) == 0)) {
          // every 10th tick, if there is still no implementable plan
          // (bestCost starts out infinite)
          injectImportanceBoost(); // raise the importance of some RelSubsets so the cost comes down faster
        }

        VolcanoRuleMatch match = ruleQueue.popMatch(phase); // pop the match with the highest importance from the RuleQueue
        if (match == null) {
          break;
        }
        match.onMatch(); // fire the match

        // The root may have been merged with another
        // subset. Find the new root subset.
        root = canonize(root);
      }

      ruleQueue.phaseCompleted(phase);
    }

    RelNode cheapest = root.buildCheapestPlan(this);
    return cheapest;
  }
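
To make the give-up arithmetic concrete, here is a small stand-alone sketch of just that computation. The class `GiveUpTickSketch` and its method are hypothetical names for illustration; only the two formulas are taken from the listing above.

```java
/** Stand-alone copy of the giveUpTick arithmetic; the class and method names are hypothetical. */
final class GiveUpTickSketch {
  private GiveUpTickSketch() {}

  /**
   * Returns the cumulative tick count after which the planner would stop,
   * given the current tick count and the tick at which the first finite
   * plan was found.
   */
  static int giveUpTick(int cumulativeTicks, int firstFiniteTick) {
    if (firstFiniteTick < 10) {
      // a plan was found quickly: allow 25 more ticks
      return cumulativeTicks + 25;
    }
    // otherwise allow 10% of the ticks spent so far, but never fewer than 25
    return cumulativeTicks + Math.max(firstFiniteTick / 10, 25);
  }
}
```

For example, `giveUpTick(8, 8)` is 33, `giveUpTick(100, 100)` is 125, and `giveUpTick(400, 400)` is 440. Because of the `Math.max`, both branches grant the same 25 extra ticks until `firstFiniteTick / 10` exceeds 25, that is, until the first plan took at least 260 ticks to find.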

injectImportanceBoost

This boosts the importance of the RelSubsets that contain only Convention.NONE rels, so that these RelSubsets get optimized first.

  /**
   * Finds RelSubsets in the plan that contain only rels of
   * {@link Convention#NONE} and boosts their importance by 25%.
   */
  private void injectImportanceBoost() {
    final Set<RelSubset> requireBoost = new HashSet<>();

  SUBSET_LOOP:
    for (RelSubset subset : ruleQueue.subsetImportances.keySet()) {
      for (RelNode rel : subset.getRels()) {
        if (rel.getConvention() != Convention.NONE) {
          continue SUBSET_LOOP;
        }
      }

      requireBoost.add(subset);
    }

    ruleQueue.boostImportance(requireBoost, 1.25);
  }

Rels in Convention.NONE always have infinite cost, so optimizing them first brings the cost down more effectively.

public interface Convention extends RelTrait {
  /**
   * Convention that for a relational expression that does not support any
   * convention. It is not implementable, and has to be transformed to
   * something else in order to be implemented.
   *
   * <p>Relational expressions generally start off in this form.</p>
   *
   * <p>Such expressions always have infinite cost.</p>
   */
  Convention NONE = new Impl("NONE", RelNode.class);

popMatch

Finds the match with the highest importance, removes it from the list, and returns it.

 /**
   * Removes the rule match with the highest importance, and returns it.
   *
   * <p>Returns {@code null} if there are no more matches.</p>
   *
   * <p>Note that the VolcanoPlanner may still decide to reject rule matches
   * which have become invalid, say if one of their operands belongs to an
   * obsolete set or has importance=0.
   *
   * @throws java.lang.AssertionError if this method is called with a phase
   *                              previously marked as completed via
   *                              {@link #phaseCompleted(VolcanoPlannerPhase)}.
   */
  VolcanoRuleMatch popMatch(VolcanoPlannerPhase phase) {
    PhaseMatchList phaseMatchList = matchListMap.get(phase);

    final List<VolcanoRuleMatch> matchList = phaseMatchList.list;
    VolcanoRuleMatch match;
    for (;;) {
      if (matchList.isEmpty()) {
        return null;
      }
      if (LOGGER.isTraceEnabled()) {
        //...
      } else {
        match = null;
        int bestPos = -1;
        int i = -1;
        // find the match with the highest importance
        for (VolcanoRuleMatch match2 : matchList) {
          ++i;
          if (match == null
              || MATCH_COMPARATOR.compare(match2, match) < 0) {
            bestPos = i;
            match = match2;
          }
        }
        match = matchList.remove(bestPos);
      }

      if (skipMatch(match)) {
        LOGGER.debug("Skip match: {}", match);
      } else {
        break;
      }
    }

    // A rule match's digest is composed of the operand RelNodes' digests,
    // which may have changed if sets have merged since the rule match was
    // enqueued.
    match.recomputeDigest();
    phaseMatchList.matchMap.remove(
        planner.getSubset(match.rels[0]), match);

    return match;
  }

onMatch

   /**
   * Called when all operands have matched.
   */
  protected void onMatch() {
    volcanoPlanner.ruleCallStack.push(this);
    try {
      getRule().onMatch(this);
    } finally {
      volcanoPlanner.ruleCallStack.pop();
    }
  }

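ruleCallStack records the RuleCalls that are currently executing.

The call ends up in the concrete rule's onMatch method, which performs the actual transformation.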

========================================================================================

6.2 buildCheapestPlan

  /**
   * Recursively builds a tree consisting of the cheapest plan at each node.
   */
  RelNode buildCheapestPlan(VolcanoPlanner planner) {
    CheapestPlanReplacer replacer = new CheapestPlanReplacer(planner);
    final RelNode cheapest = replacer.visit(this, -1, null);

    return cheapest;
  }

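As you can see, the logic is fairly simple: walk the tree of RelSubsets from top to bottom, pick the best RelNode of each subset, and assemble these into a new tree.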

  /**
   * Visitor which walks over a tree of {@link RelSet}s, replacing each node
   * with the cheapest implementation of the expression.
   */
  static class CheapestPlanReplacer {
    VolcanoPlanner planner;

    CheapestPlanReplacer(VolcanoPlanner planner) {
      super();
      this.planner = planner;
    }

    public RelNode visit(
        RelNode p,
        int ordinal,
        RelNode parent) {
      if (p instanceof RelSubset) {
        RelSubset subset = (RelSubset) p;
        RelNode cheapest = subset.best; // take the best rel of the subset
        p = cheapest; // replace the subset with it
      }

      List<RelNode> oldInputs = p.getInputs();
      List<RelNode> inputs = new ArrayList<>();
      for (int i = 0; i < oldInputs.size(); i++) {
        RelNode oldInput = oldInputs.get(i);
        RelNode input = visit(oldInput, i, p); // recurse into the input
        inputs.add(input); // collect the new input
      }
      if (!inputs.equals(oldInputs)) {
        final RelNode pOld = p;
        p = p.copy(p.getTraitSet(), inputs); // build a new p over the new inputs
        planner.provenanceMap.put(
            p, new VolcanoPlanner.DirectProvenance(pOld));
      }
      return p;
    }
  }
}
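
The replacement walk can be sketched on a toy tree. This is a simplified illustration under stated assumptions, not Calcite code: `CheapestPlanSketch`, `Node`, `op`, `subset`, and `render` are all hypothetical names, a node with a non-null `best` field plays the role of a RelSubset, and provenance tracking is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy version of the cheapest-plan walk; every name here is hypothetical. */
final class CheapestPlanSketch {
  private CheapestPlanSketch() {}

  /** A plan node. A node whose {@code best} is non-null stands in for a subset. */
  static final class Node {
    final String name;
    final List<Node> inputs;
    final Node best;

    private Node(String name, List<Node> inputs, Node best) {
      this.name = name;
      this.inputs = inputs;
      this.best = best;
    }

    /** Creates an operator node over the given inputs. */
    static Node op(String name, Node... inputs) {
      return new Node(name, Arrays.asList(inputs), null);
    }

    /** Creates a subset placeholder whose cheapest member is {@code best}. */
    static Node subset(Node best) {
      return new Node("subset", new ArrayList<>(), best);
    }
  }

  /** Replaces every subset placeholder with its best member, top down. */
  static Node visit(Node p) {
    if (p.best != null) {
      p = p.best; // swap the placeholder for its cheapest member
    }
    List<Node> newInputs = new ArrayList<>();
    boolean changed = false;
    for (Node oldInput : p.inputs) {
      Node input = visit(oldInput); // recurse into each input
      changed |= input != oldInput;
      newInputs.add(input);
    }
    // copy the node only when at least one input was replaced
    return changed ? new Node(p.name, newInputs, null) : p;
  }

  /** Renders a tree as name(input, input, ...). */
  static String render(Node n) {
    StringBuilder sb = new StringBuilder(n.name);
    if (!n.inputs.isEmpty()) {
      sb.append('(');
      for (int i = 0; i < n.inputs.size(); i++) {
        if (i > 0) {
          sb.append(", ");
        }
        sb.append(render(n.inputs.get(i)));
      }
      sb.append(')');
    }
    return sb.toString();
  }
}
```

Starting from a root placeholder whose best member is an `EnumerableFilter` over another placeholder whose best member is an `EnumerableTableScan`, `render(visit(root))` gives `EnumerableFilter(EnumerableTableScan)`: no placeholder survives in the final tree.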

How does bestCost change?

Each RelSubset records its bestCost and its best plan (best):

  /**
   * cost of best known plan (it may have improved since)
   */
  RelOptCost bestCost;

  /**
   * The set this subset belongs to.
   */
  final RelSet set;

  /**
   * best known plan
   */
  RelNode best;

Initialization

First, when a RelSubset is initialized, it runs computeBestCost:

  private void computeBestCost(RelOptPlanner planner) {
    bestCost = planner.getCostFactory().makeInfiniteCost(); // bestCost starts out infinite (Double.POSITIVE_INFINITY)
    final RelMetadataQuery mq = getCluster().getMetadataQuery();
    for (RelNode rel : getRels()) {
      final RelOptCost cost = planner.getCost(rel, mq);
      if (cost.isLt(bestCost)) {
        bestCost = cost;
        best = rel;
      }
    }
  }

getCost

  public RelOptCost getCost(RelNode rel, RelMetadataQuery mq) {
    if (rel instanceof RelSubset) {
      // for a RelSubset, return the stored result directly: as in dynamic programming,
      // earlier results are reused rather than recomputed
      return ((RelSubset) rel).bestCost;
    }
    if (noneConventionHasInfiniteCost // Convention.NONE rels have infinite cost, so return it
        && rel.getTraitSet().getTrait(ConventionTraitDef.INSTANCE) == Convention.NONE) {
      return costFactory.makeInfiniteCost();
    }
    RelOptCost cost = mq.getNonCumulativeCost(rel); //算cost
    if (!zeroCost.isLt(cost)) {
      // cost must be positive, so nudge it
      cost = costFactory.makeTinyCost(); // the computed cost is not strictly positive, so use TinyCost (1.0) instead
    }
    for (RelNode input : rel.getInputs()) {
      cost = cost.plus(getCost(input, mq)); // recursively add the cost of the whole subtree up to the root
    }
    return cost;
  }
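
The recursion in getCost can be illustrated with plain doubles. This is a deliberately simplified sketch: `CumulativeCostSketch` and `Rel` are hypothetical names, a single `double` replaces RelOptCost, a boolean flag replaces the Convention.NONE check, and the shortcut for RelSubset (returning the stored bestCost) is omitted.

```java
/** Toy cumulative-cost recursion; the names and the double-valued cost are assumptions. */
final class CumulativeCostSketch {
  private CumulativeCostSketch() {}

  /** A relational node with a self cost, a "still logical" flag, and inputs. */
  static final class Rel {
    final double selfCost;
    final boolean logical; // stands in for Convention.NONE
    final Rel[] inputs;

    Rel(double selfCost, boolean logical, Rel... inputs) {
      this.selfCost = selfCost;
      this.logical = logical;
      this.inputs = inputs;
    }
  }

  /** Returns the node's own cost plus the cumulative cost of all its inputs. */
  static double cumulativeCost(Rel rel) {
    if (rel.logical) {
      // a node without an implementation can never be executed
      return Double.POSITIVE_INFINITY;
    }
    double total = rel.selfCost;
    if (!(total > 0)) {
      total = 1.0; // nudge non-positive costs up to a tiny positive cost
    }
    for (Rel input : rel.inputs) {
      total += cumulativeCost(input);
    }
    return total;
  }
}
```

A physical filter with self cost 0 over a physical scan with self cost 100 costs 101 in total, because the filter's cost is nudged to 1. If the scan is still logical, the whole tree costs infinity, which is why a subset's bestCost stays infinite until every node below it has an implementation.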

getNonCumulativeCost

    /**
     * Estimates the cost of executing a relational expression, not counting the
     * cost of its inputs. (However, the non-cumulative cost is still usually
     * dependent on the row counts of the inputs.) The default implementation
     * for this query asks the rel itself via {@link RelNode#computeSelfCost},
     * but metadata providers can override this with their own cost models.
     *
     * @return estimated cost, or null if no reliable estimate can be
     * determined
     */
    RelOptCost getNonCumulativeCost();

    /** Handler API. */
    interface Handler extends MetadataHandler<NonCumulativeCost> {
      RelOptCost getNonCumulativeCost(RelNode r, RelMetadataQuery mq);
    }

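By default, getNonCumulativeCost ends up calling RelNode#computeSelfCost.

This is an interface method that each RelNode implements differently. Here is the fairly simple implementation in Filter: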

  @Override public RelOptCost computeSelfCost(RelOptPlanner planner,
      RelMetadataQuery mq) {
    double dRows = mq.getRowCount(this);
    double dCpu = mq.getRowCount(getInput());
    double dIo = 0;
    return planner.getCostFactory().makeCost(dRows, dCpu, dIo);
  }

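This implementation expresses the cost purely in terms of row counts: the filter's own row count is the row component, the input's row count is the CPU component, and the IO component is 0.

makeCost then simply wraps these values in a VolcanoCost object.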

Node changes

Whenever a new RelNode is produced anywhere, register, ensureRegistered, or registerImpl is called to register it.

   public RelSubset register(
      RelNode rel,
      RelNode equivRel) {
    final RelSet set;
    if (equivRel == null) {
      set = null;
    } else {
      set = getSet(equivRel);
    }
    final RelSubset subset = registerImpl(rel, set);

    return subset;
  }

  public RelSubset ensureRegistered(RelNode rel, RelNode equivRel) {
    final RelSubset subset = getSubset(rel);
    if (subset != null) {
      if (equivRel != null) {
        final RelSubset equivSubset = getSubset(equivRel);
        if (subset.set != equivSubset.set) {
          merge(equivSubset.set, subset.set);
        }
      }
      return subset;
    } else {
      return register(rel, equivRel);
    }
  }

registerImpl calls addRelToSet; the implementation of registerImpl was shown earlier.

  private RelSubset addRelToSet(RelNode rel, RelSet set) {
    RelSubset subset = set.add(rel);
    mapRel2Subset.put(rel, subset);

    // While a tree of RelNodes is being registered, sometimes nodes' costs
    // improve and the subset doesn't hear about it. You can end up with
    // a subset with a single rel of cost 99 which thinks its best cost is
    // 100. We think this happens because the back-links to parents are
    // not established. So, give the subset another change to figure out
    // its cost.
    final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
    subset.propagateCostImprovements(this, mq, rel, new HashSet<>());

    return subset;
  }

propagateCostImprovements

  /**
   * Checks whether a relexp has made its subset cheaper, and if it so,
   * recursively checks whether that subset's parents have gotten cheaper.
   *
   * @param planner   Planner
   * @param mq        Metadata query
   * @param rel       Relational expression whose cost has improved
   * @param activeSet Set of active subsets, for cycle detection
   */
  void propagateCostImprovements(VolcanoPlanner planner, RelMetadataQuery mq,
      RelNode rel, Set<RelSubset> activeSet) {
    for (RelSubset subset : set.subsets) {
      if (rel.getTraitSet().satisfies(subset.traitSet)) {
        subset.propagateCostImprovements0(planner, mq, rel, activeSet);
      }
    }
  }

  void propagateCostImprovements0(VolcanoPlanner planner, RelMetadataQuery mq,
      RelNode rel, Set<RelSubset> activeSet) {
    ++timestamp;

    if (!activeSet.add(this)) { // a cycle was detected
      // This subset is already in the chain being propagated to. This
      // means that the graph is cyclic, and therefore the cost of this
      // relational expression - not this subset - must be infinite.
      LOGGER.trace("cyclic: {}", this);
      return;
    }
    try {
      final RelOptCost cost = planner.getCost(rel, mq); // get the cost
      if (cost.isLt(bestCost)) {
        bestCost = cost;
        best = rel;

        // Lower cost means lower importance. Other nodes will change
        // too, but we'll get to them later.
        planner.ruleQueue.recompute(this); // the cost changed, so the importance must be recomputed
        // recursively run propagateCostImprovements on the parents
        for (RelNode parent : getParents()) {
          final RelSubset parentSubset = planner.getSubset(parent);
          parentSubset.propagateCostImprovements(planner, mq, parent,
              activeSet);
        }
        planner.checkForSatisfiedConverters(set, rel);
      }
    } finally {
      activeSet.remove(this);
    }
  }
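
The upward propagation and its cycle guard can be modelled in a few lines. The sketch below is an assumption-laden simplification rather than Calcite's implementation: `CostPropagationSketch`, `Group`, and `Op` are hypothetical names, costs are plain doubles, and trait checks, importance recomputation, and converter handling are all left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;

/** Toy model of upward cost propagation with cycle detection; all names are hypothetical. */
final class CostPropagationSketch {
  private CostPropagationSketch() {}

  /** Stands in for a subset: remembers the best cost and the operators reading from it. */
  static final class Group {
    final String name;
    double bestCost = Double.POSITIVE_INFINITY;
    final List<Op> parents = new ArrayList<>();

    Group(String name) {
      this.name = name;
    }
  }

  /** An operator that belongs to {@code owner} and reads from {@code inputs}. */
  static final class Op {
    final Group owner;
    final double selfCost;
    final List<Group> inputs;

    Op(Group owner, double selfCost, Group... inputs) {
      this.owner = owner;
      this.selfCost = selfCost;
      this.inputs = Arrays.asList(inputs);
      for (Group input : inputs) {
        input.parents.add(this); // back-link from each input to its parent operator
      }
    }

    /** Self cost plus the current best cost of every input group. */
    double cost() {
      double total = selfCost;
      for (Group input : inputs) {
        total += input.bestCost;
      }
      return total;
    }
  }

  /** Records a cheaper cost for {@code group} and pushes the improvement to its parents. */
  static void propagate(Group group, double candidateCost, Set<Group> active) {
    if (!active.add(group)) {
      return; // already on the current propagation chain: the graph is cyclic
    }
    try {
      if (candidateCost < group.bestCost) {
        group.bestCost = candidateCost;
        for (Op parent : group.parents) {
          propagate(parent.owner, parent.cost(), active);
        }
      }
    } finally {
      active.remove(group);
    }
  }
}
```

With a leaf group, a middle group holding an operator of self cost 1 over the leaf, and a root group holding an operator of self cost 2 over the middle group, propagating a cost of 100 into the leaf yields best costs of 100, 101, and 103. A later improvement to 50 yields 50, 51, and 53, while a worse candidate such as 80 changes nothing.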

Posted on 2019-08-09 10:47 by fxjwind
