Converting Recursive Traversal to Iterator

In this article, I introduce a general pattern, which I call Lazy Iterator, for converting a recursive traversal into an iterator. Let's start with a familiar example: inorder traversal of a binary tree.

[Figure: an example binary tree]

It is straightforward to write a recursive function that returns the nodes as a List:

// Java
List<Node> traverseInOrder(Node node) {
  if (node == null) {
    return emptyList();
  }
  return concat(traverseInOrder(node.left), List.of(node), traverseInOrder(node.right));
}

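<T> List<T> concat(List<T>... lists) {
  // List concatenation: copy every input list, in order, into one result list.
  List<T> result = new ArrayList<>();
  for (List<T> list : lists) {
    result.addAll(list);
  }
  return result;
}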

Of course, we could simply return the iterator of the resulting list, but that has two drawbacks: 1) it is not lazily evaluated, so even if we only care about the first few items, we must traverse the whole tree; 2) it takes O(N) extra space for the result list, which is not really necessary. We would like an iterator that is lazily evaluated and uses memory only as needed. Its signature is given below:

// Java
Iterator<Node> traverseInOrder(Node node) {
  // TODO
}

One idea most people readily come up with (but do not necessarily implement correctly) is to use a stack to keep track of the traversal state. That works, but it is relatively complex and not general. There is a neat, simple and general alternative. The following code demonstrates the pattern:

// Java
Iterator<Node> traverseInOrder(Node node) {
  if (node == null) {
    return emptyIterator();
  }
  Supplier<Iterator<Node>> leftIterator = () -> traverseInOrder(node.left);
  Supplier<Iterator<Node>> currentIterator = () -> singleNodeIterator(node);
  Supplier<Iterator<Node>> rightIterator = () -> traverseInOrder(node.right);
  return concat(leftIterator, currentIterator, rightIterator);
}

<T> Iterator<T> concat(Supplier<Iterator<T>>... iteratorSuppliers) {
  // TODO
}

If you are not familiar with Java, Supplier<T> is a function that takes no arguments and returns a value of type T when its get() method is called, so you can think of it as a lazily evaluated T.
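To make the laziness concrete, here is a minimal sketch showing that the body of a Supplier runs only when get() is called. The class name SupplierLazinessDemo and the log list are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class SupplierLazinessDemo {
  // Returns a log recording the order in which things happen.
  static List<String> run() {
    List<String> log = new ArrayList<>();
    // Creating the supplier does not run the lambda body.
    Supplier<Integer> lazyValue = () -> {
      log.add("computed");
      return 42;
    };
    log.add("created");
    // The body runs only now, when get() is called.
    int value = lazyValue.get();
    log.add("got " + value);
    return log;
  }

  public static void main(String[] args) {
    System.out.println(run()); // prints [created, computed, got 42]
  }
}
```

Note that a Supplier does not cache its result: every call to get() runs the body again. The pattern in this article calls each supplier at most once, so that does not matter here.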

Note the structural correspondence between this iterator version of traverseInOrder and the original recursive traversal function. Instead of traversing the left and right branches before returning, it creates a lambda expression for each, which returns an iterator when evaluated. The iterators for the subproblems are therefore created lazily.
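For contrast, here is a minimal sketch of the stack-based approach mentioned earlier. The Node class with an int value field and the names StackInorderDemo, StackInorderIterator and values are assumptions made for illustration. Notice that the traversal state has to be managed by hand, and the shape of the recursive function is no longer visible in the code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class StackInorderDemo {
  // Hypothetical minimal node type; the article does not define Node.
  static class Node {
    final int value;
    final Node left;
    final Node right;

    Node(int value, Node left, Node right) {
      this.value = value;
      this.left = left;
      this.right = right;
    }
  }

  // Inorder iterator that tracks the traversal state explicitly on a stack.
  static class StackInorderIterator implements Iterator<Node> {
    // Nodes that have been reached but not returned yet.
    private final Deque<Node> stack = new ArrayDeque<>();

    StackInorderIterator(Node root) {
      pushLeftSpine(root);
    }

    // Pushes node and all of its left descendants.
    private void pushLeftSpine(Node node) {
      while (node != null) {
        stack.push(node);
        node = node.left;
      }
    }

    @Override
    public boolean hasNext() {
      return !stack.isEmpty();
    }

    @Override
    public Node next() {
      if (stack.isEmpty()) {
        throw new NoSuchElementException();
      }
      Node node = stack.pop();
      // The right subtree comes after the node itself.
      pushLeftSpine(node.right);
      return node;
    }
  }

  // Demo helper: collects the node values in iteration order.
  static List<Integer> values(Node root) {
    List<Integer> result = new ArrayList<>();
    Iterator<Node> it = new StackInorderIterator(root);
    while (it.hasNext()) {
      result.add(it.next().value);
    }
    return result;
  }
}
```

This version is lazy too, but the logic is specific to inorder traversal of a binary tree; a different recursion would need a different hand-written state machine.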

Finally, and most importantly, the pattern relies on a general function, concat, which creates an iterator backed by multiple iterator suppliers. For two suppliers, the type of this function is:

Supplier<Iterator<T>> -> Supplier<Iterator<T>> -> Iterator<T>

The key is that this function must stay lazy as well: it evaluates a supplier only when necessary. Here is how I implement it:

// Java
<T> Iterator<T> concat(Supplier<Iterator<T>>... iteratorSuppliers) {
  return new LazyIterator<>(iteratorSuppliers);
}

class LazyIterator<T> implements Iterator<T> {
  private final Supplier<Iterator<T>>[] iteratorSuppliers;
  private int i = 0;
  private Iterator<T> currentIterator = null;

  public LazyIterator(Supplier<Iterator<T>>[] iteratorSuppliers) {
    this.iteratorSuppliers = iteratorSuppliers;
  }

  // When returning true, hasNext() always sets currentIterator to the correct iterator
  // which has more elements.
  @Override  
  public boolean hasNext() {
    while (i < iteratorSuppliers.length) {
      if (currentIterator == null) {
        // Supplier exposes get(), not apply(); the sub-iterator is created only here.
        currentIterator = iteratorSuppliers[i].get();
      }
      if (currentIterator.hasNext()) {
        return true;
      }
      currentIterator = null;
      i++;
    }
    return false;
  }

  @Override
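  public T next() {
    // Calling hasNext() first positions currentIterator, so next() is safe
    // even if the caller did not call hasNext() beforehand.
    if (!hasNext()) {
      throw new java.util.NoSuchElementException();
    }
    return currentIterator.next();
  }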
}
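To check that the pieces fit together and that evaluation really is lazy, here is a self-contained sketch that assembles the code above. The Node class, the visited counter, the sampleTree helper and the class name LazyInorderDemo are assumptions added for the demonstration. I also assume that emptyIterator() means Collections.emptyIterator() and that singleNodeIterator(node) can be written as List.of(node).iterator().

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

class LazyInorderDemo {
  // Hypothetical node type; the article does not define Node.
  static class Node {
    final int value;
    final Node left;
    final Node right;

    Node(int value, Node left, Node right) {
      this.value = value;
      this.left = left;
      this.right = right;
    }
  }

  // Demo-only counter: how many non-null nodes have been expanded so far.
  static int visited = 0;

  static Iterator<Node> traverseInOrder(Node node) {
    if (node == null) {
      return Collections.emptyIterator();
    }
    visited++;
    Supplier<Iterator<Node>> leftIterator = () -> traverseInOrder(node.left);
    Supplier<Iterator<Node>> currentIterator = () -> List.of(node).iterator();
    Supplier<Iterator<Node>> rightIterator = () -> traverseInOrder(node.right);
    return concat(leftIterator, currentIterator, rightIterator);
  }

  @SafeVarargs
  static <T> Iterator<T> concat(Supplier<Iterator<T>>... iteratorSuppliers) {
    return new LazyIterator<>(iteratorSuppliers);
  }

  static class LazyIterator<T> implements Iterator<T> {
    private final Supplier<Iterator<T>>[] iteratorSuppliers;
    private int i = 0;
    private Iterator<T> currentIterator = null;

    LazyIterator(Supplier<Iterator<T>>[] iteratorSuppliers) {
      this.iteratorSuppliers = iteratorSuppliers;
    }

    @Override
    public boolean hasNext() {
      while (i < iteratorSuppliers.length) {
        if (currentIterator == null) {
          currentIterator = iteratorSuppliers[i].get();
        }
        if (currentIterator.hasNext()) {
          return true;
        }
        currentIterator = null;
        i++;
      }
      return false;
    }

    @Override
    public T next() {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      return currentIterator.next();
    }
  }

  // Builds a 7-node tree: root 4, children 2 and 6, leaves 1, 3, 5 and 7.
  static Node sampleTree() {
    Node two = new Node(2, new Node(1, null, null), new Node(3, null, null));
    Node six = new Node(6, new Node(5, null, null), new Node(7, null, null));
    return new Node(4, two, six);
  }

  public static void main(String[] args) {
    visited = 0;
    Iterator<Node> it = traverseInOrder(sampleTree());
    System.out.println(it.next().value); // prints 1
    System.out.println(visited);         // prints 3: only nodes 4, 2 and 1 were expanded
  }
}
```

One trade-off worth knowing: each hasNext() or next() call is forwarded through one LazyIterator per level, so a single step costs time roughly proportional to the depth of the node being returned.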

The LazyIterator class deals only with Iterator<T>, so it is independent of any specific recursive function. It therefore serves as a general way of converting a recursive traversal into an iterator.
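As a sketch of that generality, the same concat can be applied to a recursion that has nothing to do with trees. The Towers of Hanoi below is an example chosen here only for illustration; the class name LazyHanoiDemo, the peg names and the "A->B" move format are all assumptions. The concat helper is the same logic as LazyIterator, written as an anonymous class to keep the block self-contained.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

class LazyHanoiDemo {
  // Lazily yields the moves that transfer n disks from peg `from` to peg `to`.
  // Recursive structure: move n-1 disks aside, move the largest, move n-1 back on top.
  static Iterator<String> hanoi(int n, String from, String to, String via) {
    if (n == 0) {
      return Collections.emptyIterator();
    }
    Supplier<Iterator<String>> before = () -> hanoi(n - 1, from, via, to);
    Supplier<Iterator<String>> move =
        () -> Collections.singleton(from + "->" + to).iterator();
    Supplier<Iterator<String>> after = () -> hanoi(n - 1, via, to, from);
    return concat(before, move, after);
  }

  // Same logic as LazyIterator in the article, as an anonymous class.
  @SafeVarargs
  static <T> Iterator<T> concat(Supplier<Iterator<T>>... suppliers) {
    return new Iterator<T>() {
      private int i = 0;
      private Iterator<T> current = null;

      @Override
      public boolean hasNext() {
        while (i < suppliers.length) {
          if (current == null) {
            current = suppliers[i].get();
          }
          if (current.hasNext()) {
            return true;
          }
          current = null;
          i++;
        }
        return false;
      }

      @Override
      public T next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        return current.next();
      }
    };
  }

  public static void main(String[] args) {
    // Prints A->B, A->C, B->C on separate lines.
    hanoi(2, "A", "C", "B").forEachRemaining(System.out::println);
  }
}
```

Only the body of hanoi is specific to the problem; concat is reused unchanged, and asking for the first move of a large instance does not enumerate the remaining moves.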

If you are familiar with a language that has generators, such as Python, the idiomatic way of implementing an iterator for a recursive traversal looks like this:

# Python
def traverse_inorder(node):
  if node.left:
    # Delegate to the sub-generator for the left subtree (Python 3.3+).
    yield from traverse_inorder(node.left)
  yield node
  if node.right:
    # Delegate to the sub-generator for the right subtree.
    yield from traverse_inorder(node.right)

yield saves the current state of the generator function's stack frame and resumes the computation when the generator is asked for its next value. You can think of the lazy iterator pattern introduced in this article as a way of capturing a computation that can be resumed later, and hence of simulating generators in languages that lack them.

 

posted on 2017-11-13 12:31 Todd Wei