Home

[This page is auto-generated; please prefer to read the article’s PDF Form.]



Successor in Binary Search Trees

Nitin Verma

October 22, 2020

A Binary Search Tree (BST) is structurally same as a Binary Tree, but its nodes’ keys follow certain property. When we define the successor of a node in BST, we can only depend upon its Binary Tree structure, not its keys. In a BST T, the successor of a node x can be defined to be the node which is processed just after x during Inorder Tree Walk of T (Chapter 12 in [1]). If x is the last processed node, then its successor is null. In this article, we will understand the method to locate successor of x starting with x.



Figure 1: Inorder Tree Walk

PICT


Figure 1 shows a Binary Tree with its nodes labeled according to their order of processing during Inorder Tree Walk. Observe where the first (1) and last processed (10) nodes are located in the tree. And how a node and its successor are located relative to each other, specially node 8’s successor 9, and node 4’s successor 5.

The recursive method for Inorder Tree Walk closely resembles the very definition of Inorder Tree Walk, and can be written in C language as:

void inorder(node *x)  
{  
  if(x->left)     inorder(x->left);   /* step (1) */  
 
  process(x);                         /* step (2) */  
 
  if(x->right)    inorder(x->right);  /* step (3) */  
}

node represents a structure for tree nodes, its left/right members store the left/right child’s pointers. The initial call has T’s root as x.

We will try to relate the locations of a node and its successor just by understanding the above method’s structure. First, we can conclude the following about the first and last processed nodes during inorder(x), where x is any node in T:

  1. Say f is the first node processed. Note that, at any point, all calls on the callstack except the topmost (latest) one, must be at their recursive step: step (1) or (3). Since inorder(f) must be the first call which reaches step (2), its step (1) must not have recursed. Also, no invocation of this method has yet reached step (3) which can happen only after step (2). So, all calls starting at inorder(x) till the caller of inorder(f), have recursed only via step (1), by accessing the left pointer. That is, path from x to f consists of 0 or more left pointers, and f left must be null.
  2. Say l is the last node processed. Since inorder(l) must be the last one which reaches step (2), no other call on the callstack must have recursed via step (1) as that is always followed by step (2) of processing the node. It means, they must have recursed via step (3), by accessing the right pointer. Further, step (3) in inorder(l) must not result in the recursive call. That is, path from x to l consists of 0 or more right pointers, and l right must be null.

We now come to the problem of finding successor s of x. Recall that s is the node processed just after x during inorder walk of T. During this tree walk, consider the invocation of inorder(x), where x is processed in step (2). Now, if x has the right-child xR, then immediately after processing x, step (3) will do inorder(xR). Since any invocation of inorder() processes at least one node, so will that of inorder(xR). The very first node processed during this will be the successor s of x. By applying conclusion (1) to inorder(xR), we can say that the path from xR to s consists of 0 or more left pointers, and s left must be null. That is, path from x to s consists of a right pointer, then 0 or more left pointers, such that s left is null. In Figure 1, node 8’s successor 9 is such an example.

But if x has no right-child, the call to inorder(x) will return just after processing x, without processing any other node. Consider all other calls of inorder() on the callstack, during inorder(x). All these calls are for nodes ancestors to x; from root node till parent of x. Call the closest ancestor of x (its parent) as a1, a1’s parent as a2, a2’s parent as a3 etc. So, inorder(x) was called by inorder(a1). When inorder(x) returns, inorder(a1) must have completed either step (1) or (3). If it is step (1), then next processed node is a1, making it the successor s. If it is step (3), inorder(a1) would return to its caller, inorder(a2). Again, inorder(a2) must have completed either step (1) or (3), and similar argument applies. Thus, the successor s of x is its closest ancestor such that inorder(s) is at step (1) during inorder(x).

All ancestors closer than s are at step (3) and will simply return once x is processed. And then, inorder(s) will process node s via its step (2). Say, left-child of s is sL. x is the last node processed during step (1) of inorder(s), i.e. inorder(sL). Hence, if x has no right-child, the path from s to x consists of a left pointer (accessed in step (1)), followed by 0 or more right pointers (accessed in step (3)). In Figure 1, node 4’s successor 5 is such an example.

It is also possible that no ancestor of x is at step (1) during inorder(x). Then, the path from root to x consists of 0 or more right pointers. By applying conclusion (2) to inorder(root), x is the last processed node of T with no successor.

To summarize:

  1. If x has the right-child xR, then s is the first node processed during inorder(xR).
  2. Otherwise, s is its closest ancestor whose left-child sL appears in path from s to x. x is the last node processed during inorder(sL).

[Next]