Find LCA in Binary Tree using RMQ
This article describes an approach to finding the LCA of two nodes in a tree by reducing the problem to an RMQ problem.
The Lowest Common Ancestor (LCA) of two nodes u and v in a rooted tree T is defined as the node located farthest from the root that has both u and v as descendants (a node counts as a descendant of itself).
For example, in the diagram below, the LCA of node 4 and node 9 is node 2.
There are many approaches to solving the LCA problem, and they differ in their time and space complexities. Several of them do not involve a reduction to RMQ.
Range Minimum Query (RMQ) is used on arrays to find the position of an element with the minimum value between two specified indices. There are several approaches to solving RMQ; this article uses the segment tree based approach. With a segment tree, preprocessing time is O(n) and the time for a range minimum query is O(log n). The extra space required to store the segment tree is O(n).
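To make the RMQ definition concrete, here is a minimal sketch that answers a query by scanning the range. The name rmqNaive is hypothetical and the scan takes linear time per query, so it only illustrates what an RMQ returns (a position, not a value); it is not the segment tree approach.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the position of the minimum value in
// arr[qs..qe] (both ends inclusive) by scanning the range.
// On ties the leftmost position is returned.
std::size_t rmqNaive(const std::vector<int>& arr, std::size_t qs, std::size_t qe)
{
    assert(qs <= qe && qe < arr.size());   // the range must be valid

    std::size_t best = qs;
    for (std::size_t i = qs + 1; i <= qe; ++i)
        if (arr[i] < arr[best])
            best = i;
    return best;
}
```

For the array {2, 5, 1, 4, 9, 3}, a query over indices 3 to 5 would return 5, the position of the value 3.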
Reduction of LCA to RMQ: The idea is to traverse the tree starting from the root by an Euler tour (a traversal without lifting the pencil), which is a DFS-type traversal with preorder traversal characteristics. A node is recorded when it is first reached and again each time the traversal returns to it from a child.
Observation: The LCA of nodes 4 and 9 is node 2, which happens to be the node closest to the root among all those encountered between the visits of 4 and 9 during a DFS of T. This observation is the key to the reduction. To rephrase: the LCA of u and v is the node at the smallest level, and the only node at that level, among all the nodes that occur between any occurrence of u and any occurrence of v in the Euler tour of T.
We require three arrays for implementation:
- Nodes visited in order of Euler tour of T
- Level of each node visited in Euler tour of T
- Index of the first occurrence of a node in Euler tour of T (since any occurrence would be good, let’s track the first one)
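As an illustration of the three arrays, the sketch below fills them for the nine-node example tree. The Tour struct, the function names and the child-array input format (0 meaning "no child") are assumptions made for this example and do not come from the article.

```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

// Hypothetical container for the three arrays.
struct Tour
{
    std::vector<int> euler;   // nodes in order of the Euler tour
    std::vector<int> level;   // level[i] is the depth of euler[i]
    std::vector<int> first;   // first[k] is the first index of key k in euler
};

// Assumed input format: leftChild[k] / rightChild[k] hold the key of node k's
// child, or 0 if that child is absent. Keys are 1..V.
void visit(int node, int depth, const std::vector<int>& leftChild,
           const std::vector<int>& rightChild, Tour& t)
{
    if (t.first[node] == -1)
        t.first[node] = static_cast<int>(t.euler.size());
    t.euler.push_back(node);
    t.level.push_back(depth);

    for (int child : {leftChild[node], rightChild[node]})
    {
        if (child == 0) continue;
        visit(child, depth + 1, leftChild, rightChild, t);
        t.euler.push_back(node);   // record the parent again on return
        t.level.push_back(depth);
    }
}

Tour buildTour(int root, const std::vector<int>& leftChild,
               const std::vector<int>& rightChild)
{
    assert(leftChild.size() == rightChild.size());
    Tour t;
    t.first.assign(leftChild.size(), -1);
    visit(root, 0, leftChild, rightChild, t);
    return t;
}

// The example tree: 1 -> (2, 3), 2 -> (4, 5), 3 -> (6, 7), 5 -> (8, 9).
Tour sampleTour()
{
    //                              key: 0  1  2  3  4  5  6  7  8  9
    const std::vector<int> leftChild  = {0, 2, 4, 6, 0, 8, 0, 0, 0, 0};
    const std::vector<int> rightChild = {0, 3, 5, 7, 0, 9, 0, 0, 0, 0};
    return buildTour(1, leftChild, rightChild);
}
```

For this tree the tour is 1 2 4 2 5 8 5 9 5 2 1 3 6 3 7 3 1, the levels are 0 1 2 1 2 3 2 3 2 1 0 1 2 1 2 1 0, and the first occurrences of nodes 4 and 9 are indices 2 and 7.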
Algorithm:
- Do an Euler tour on the tree, and fill the euler, level and first occurrence arrays.
- Using the first occurrence array, get the indices corresponding to the two nodes. These are the ends of the range in the level array that is fed to the RMQ algorithm to find the minimum value.
- Once the algorithm returns the index of the minimum level in the range, use it to determine the LCA from the Euler tour array.
Below is an implementation of the above algorithm.
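The last two steps can be traced on the example tree with the small sketch below. It hard-codes the three arrays for the nine-node tree (1 -> 2, 3; 2 -> 4, 5; 3 -> 6, 7; 5 -> 8, 9) and uses a plain linear scan in place of a real RMQ structure, so it shows the reduction only. The names kEuler, kLevel, kFirst and lcaByScan are hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Euler tour arrays for the example tree (index 0 of kFirst is unused).
const std::vector<int> kEuler = {1, 2, 4, 2, 5, 8, 5, 9, 5, 2, 1, 3, 6, 3, 7, 3, 1};
const std::vector<int> kLevel = {0, 1, 2, 1, 2, 3, 2, 3, 2, 1, 0, 1, 2, 1, 2, 1, 0};
const std::vector<int> kFirst = {-1, 0, 1, 11, 2, 4, 12, 14, 5, 7};

// LCA of keys u and v in the example tree.
int lcaByScan(int u, int v)
{
    assert(u >= 1 && u <= 9 && v >= 1 && v <= 9);

    // Step 2: the first occurrences give the ends of the range.
    int qs = kFirst[u], qe = kFirst[v];
    if (qs > qe) std::swap(qs, qe);

    // Stand-in for the RMQ: find the index of the minimum level in [qs, qe].
    int best = qs;
    for (int i = qs + 1; i <= qe; ++i)
        if (kLevel[i] < kLevel[best])
            best = i;

    // Step 3: map the index back to a node through the tour array.
    return kEuler[best];
}
```

For nodes 4 and 9 the range is [2, 7], the minimum level 1 is at index 3, and kEuler[3] is node 2.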
    /* C++ program to find LCA of u and v by reducing the problem to RMQ */
    #include <cstdio>
    #include <cstring>
    #include <utility>

    #define V 9               // number of nodes in input tree

    int euler[2*V - 1];       // Euler tour sequence
    int level[2*V - 1];       // Level of nodes in tour sequence
    int firstOccurrence[V+1]; // First occurrences of nodes in tour
    int ind;                  // Next free index in euler and level arrays

    // A Binary Tree node
    struct Node
    {
        int key;
        Node *left, *right;
    };

    // Utility function that creates a new binary tree node with given key
    Node *newNode(int k)
    {
        Node *temp = new Node;
        temp->key = k;
        temp->left = temp->right = NULL;
        return temp;
    }

    // Floor of log base 2 of x
    int Log2(int x)
    {
        int ans = 0;
        while (x >>= 1) ans++;
        return ans;
    }

    /* A recursive function to get the index (into level[]) of the minimum
       value in a given range of array indexes. Returns -1 if the segment of
       the current node lies outside the query range.

       st      --> Pointer to segment tree
       index   --> Index of current node in the segment tree. Initially
                   0 is passed as root is always at index 0
       ss & se --> Starting and ending indexes of the segment represented
                   by current node, i.e., st[index]
       qs & qe --> Starting and ending indexes of query range */
    int RMQUtil(int index, int ss, int se, int qs, int qe, int *st)
    {
        // If segment of this node is a part of given range, then return
        // the index of the min of the segment
        if (qs <= ss && qe >= se)
            return st[index];

        // If segment of this node is outside the given range
        if (se < qs || ss > qe)
            return -1;

        // If a part of this segment overlaps with the given range
        int mid = (ss + se)/2;

        int q1 = RMQUtil(2*index+1, ss, mid, qs, qe, st);
        int q2 = RMQUtil(2*index+2, mid+1, se, qs, qe, st);

        if (q1 == -1) return q2;
        if (q2 == -1) return q1;

        return (level[q1] < level[q2]) ? q1 : q2;
    }

    // Return the index of the minimum element in the range from index qs
    // (query start) to qe (query end). It mainly uses RMQUtil()
    int RMQ(int *st, int n, int qs, int qe)
    {
        // Check for erroneous input values
        if (qs < 0 || qe > n-1 || qs > qe)
        {
            printf("Invalid Input");
            return -1;
        }

        return RMQUtil(0, 0, n-1, qs, qe, st);
    }

    // A recursive function that constructs Segment Tree for arr[ss..se].
    // si is index of current node in segment tree st
    void constructSTUtil(int si, int ss, int se, int arr[], int *st)
    {
        // If there is one element in array, store its index in current
        // node of segment tree and return
        if (ss == se)
            st[si] = ss;
        else
        {
            // If there is more than one element, then recur for left and
            // right subtrees and store the index of the smaller of the two
            // minimums in this node
            int mid = (ss + se)/2;
            constructSTUtil(si*2+1, ss, mid, arr, st);
            constructSTUtil(si*2+2, mid+1, se, arr, st);

            if (arr[st[2*si+1]] < arr[st[2*si+2]])
                st[si] = st[2*si+1];
            else
                st[si] = st[2*si+2];
        }
    }

    /* Function to construct segment tree from given array. This function
       allocates memory for segment tree and calls constructSTUtil() to
       fill the allocated memory. The caller must release it with delete[] */
    int *constructST(int arr[], int n)
    {
        // Upper bound on the height of the segment tree
        int x = Log2(n)+1;

        // Maximum size of segment tree
        int max_size = 2*(1<<x) - 1;  // 2*pow(2,x) - 1

        int *st = new int[max_size];

        // Fill the allocated memory st
        constructSTUtil(0, 0, n-1, arr, st);

        // Return the constructed segment tree
        return st;
    }

    // Recursive version of the Euler tour of T
    void eulerTour(Node *root, int l)
    {
        /* if the passed node exists */
        if (root)
        {
            euler[ind] = root->key; // insert in euler array
            level[ind] = l;         // insert l in level array
            ind++;                  // increment index

            /* if unvisited, mark first occurrence */
            if (firstOccurrence[root->key] == -1)
                firstOccurrence[root->key] = ind-1;

            /* tour left subtree if it exists, and record the parent
               again in euler and level arrays on return */
            if (root->left)
            {
                eulerTour(root->left, l+1);
                euler[ind] = root->key;
                level[ind] = l;
                ind++;
            }

            /* tour right subtree if it exists, and record the parent
               again in euler and level arrays on return */
            if (root->right)
            {
                eulerTour(root->right, l+1);
                euler[ind] = root->key;
                level[ind] = l;
                ind++;
            }
        }
    }

    // Returns LCA of nodes u and v (assuming they are present in the tree)
    int findLCA(Node *root, int u, int v)
    {
        /* Mark all nodes unvisited. Note that the size of firstOccurrence
           is V+1 as node values, which vary from 1 to V, are used as
           indexes */
        memset(firstOccurrence, -1, sizeof(int)*(V+1));

        /* To start filling euler and level arrays from index 0 */
        ind = 0;

        /* Start Euler tour with root node on level 0 */
        eulerTour(root, 0);

        /* construct segment tree on level array */
        int *st = constructST(level, 2*V-1);

        /* If v appears before u in the Euler tour, swap them. For RMQ to
           work, the query start must not be greater than the query end */
        if (firstOccurrence[u] > firstOccurrence[v])
            std::swap(u, v);

        // Starting and ending indexes of query range
        int qs = firstOccurrence[u];
        int qe = firstOccurrence[v];

        // query for index of LCA in tour
        int index = RMQ(st, 2*V-1, qs, qe);

        // release the segment tree allocated by constructST()
        delete[] st;

        /* return LCA node */
        return euler[index];
    }

    // Driver program to test above functions
    int main()
    {
        // Let us create the Binary Tree as shown in the diagram.
        Node *root = newNode(1);
        root->left = newNode(2);
        root->right = newNode(3);
        root->left->left = newNode(4);
        root->left->right = newNode(5);
        root->right->left = newNode(6);
        root->right->right = newNode(7);
        root->left->right->left = newNode(8);
        root->left->right->right = newNode(9);

        int u = 4, v = 9;
        printf("The LCA of node %d and node %d is node %d.\n",
               u, v, findLCA(root, u, v));
        return 0;
    }
Output:
The LCA of node 4 and node 9 is node 2.
Note:
- We assume that the nodes queried are present in the tree.
- We also assume that if there are V nodes in the tree, then the keys (or data) of these nodes are in the range from 1 to V.
Time complexity:
- Euler tour: The number of nodes is V. For a tree, E = V-1, so the Euler tour (DFS) takes O(V+E) = O(2*V), which is O(V).
- Segment Tree construction : O(n) where n = V + E = 2*V – 1.
- Range Minimum query: O(log(n))
Overall, this method takes O(n) time for preprocessing and O(log n) time per query. Therefore, it is useful when we have a single tree on which we want to perform a large number of LCA queries, provided the Euler tour and the segment tree are built once and reused across queries. Note that the LCA is useful for finding the shortest path between two nodes of a binary tree.
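To benefit from the O(log n) query time when many queries hit the same tree, the preprocessing has to happen once rather than on every call. The sketch below is one possible arrangement, not the article's implementation: the LcaIndex class, its child-array input format (0 meaning "no child") and the bottom-up iterative segment tree are all assumptions chosen to keep the example short.

```cpp
#include <cassert>
#include <initializer_list>
#include <utility>
#include <vector>

// Hypothetical wrapper: preprocesses the tree once in the constructor,
// then answers each LCA query in O(log n).
// Assumed input: leftChild[k] / rightChild[k] hold the key of node k's
// child, or 0 if that child is absent. Keys are 1..V.
class LcaIndex
{
public:
    LcaIndex(int root, const std::vector<int>& leftChild,
             const std::vector<int>& rightChild)
        : first_(leftChild.size(), -1)
    {
        assert(leftChild.size() == rightChild.size());
        tour(root, 0, leftChild, rightChild);

        // Bottom-up segment tree over level_: leaves sit at seg_[n_..2*n_-1]
        // and every node stores the index of a minimum level it covers.
        n_ = static_cast<int>(level_.size());
        seg_.assign(2 * n_, 0);
        for (int i = 0; i < n_; ++i)
            seg_[n_ + i] = i;
        for (int i = n_ - 1; i >= 1; --i)
            seg_[i] = argminLevel(seg_[2 * i], seg_[2 * i + 1]);
    }

    // LCA of keys u and v; both are assumed to be present in the tree.
    int query(int u, int v) const
    {
        int l = first_[u], r = first_[v];
        assert(l != -1 && r != -1);
        if (l > r) std::swap(l, r);

        int best = l;
        // Iterative range query over the half-open range [l, r + 1).
        for (l += n_, r += n_ + 1; l < r; l >>= 1, r >>= 1)
        {
            if (l & 1) best = argminLevel(best, seg_[l++]);
            if (r & 1) best = argminLevel(best, seg_[--r]);
        }
        return euler_[best];
    }

private:
    // Of the two tour indices, the one with the smaller level; a wins ties.
    int argminLevel(int a, int b) const { return level_[b] < level_[a] ? b : a; }

    void tour(int node, int depth, const std::vector<int>& leftChild,
              const std::vector<int>& rightChild)
    {
        first_[node] = static_cast<int>(euler_.size());
        euler_.push_back(node);
        level_.push_back(depth);
        for (int child : {leftChild[node], rightChild[node]})
        {
            if (child == 0) continue;
            tour(child, depth + 1, leftChild, rightChild);
            euler_.push_back(node);   // record the parent again on return
            level_.push_back(depth);
        }
    }

    std::vector<int> euler_, level_, first_, seg_;
    int n_ = 0;
};
```

With the example tree described by leftChild = {0, 2, 4, 6, 0, 8, 0, 0, 0, 0} and rightChild = {0, 3, 5, 7, 0, 9, 0, 0, 0, 0}, constructing LcaIndex idx(1, leftChild, rightChild) once allows any number of calls such as idx.query(4, 9), which yields 2.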
Auxiliary Space:
- Euler tour array: O(n) where n = 2*V – 1
- Node Levels array: O(n)
- First Occurrences array: O(V)
- Segment Tree: O(n)
Overall: O(n)
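Another observation is that adjacent elements in the level array differ by exactly 1, since each step of the Euler tour moves either from a node to its child or from a child back to its parent. The RMQ instance produced by the reduction is therefore a restricted (±1) RMQ, a special case that can be solved with O(n) preprocessing and O(1) query time.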
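The property is easy to check on any level array. The following is a small sketch with a hypothetical function name, isPlusMinusOne, that verifies it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical check: true if every pair of adjacent entries in the level
// array differs by exactly 1 (arrays with fewer than two entries pass).
bool isPlusMinusOne(const std::vector<int>& level)
{
    for (std::size_t i = 1; i < level.size(); ++i)
        if (std::abs(level[i] - level[i - 1]) != 1)
            return false;
    return true;
}
```

The level array of the example tree, 0 1 2 1 2 3 2 3 2 1 0 1 2 1 2 1 0, passes this check, while an arbitrary array such as 0 2 1 does not.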