COMP0005-Notes(6): Binary Search Trees (BST)


Structure

Base case 1: Null (the empty tree) is a tree.
Base case 2: A single node without any children is a tree.
Step case: A node with one or two children, each of which is itself a tree, is a tree.
Constraint: Every node in the tree must be:

  1. Strictly larger than every key in its left sub-tree, and
  2. Strictly smaller than every key in its right sub-tree.
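
The ordering constraint has to hold against whole sub-trees, not only against the immediate children. A minimal Python sketch of a checker makes this concrete. The `Node` class and the `is_bst` helper are illustrative names assumed here, not part of the original notes.

```python
class Node:
    """A BST node; key/left/right mirror the names used in the pseudocode."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def is_bst(node, lo=None, hi=None):
    """Return True if every key in this sub-tree lies strictly between lo and hi."""
    if node is None:                      # base case: the empty tree is valid
        return True
    if lo is not None and node.key <= lo:
        return False
    if hi is not None and node.key >= hi:
        return False
    # Step case: both sub-trees must be valid within tightened bounds.
    return is_bst(node.left, lo, node.key) and is_bst(node.right, node.key, hi)


good = Node(5, Node(3, Node(1), Node(4)), Node(8))
# 6 is larger than its parent 3, but it sits in the left sub-tree of 5.
bad = Node(5, Node(3, Node(1), Node(6)), Node(8))
```

Here `bad` passes a check that only compares each node with its direct children, yet it is not a valid BST, which is why the bounds are carried down the recursion.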

Associated Methods

Get(Search)

Pseudocode

The method follows the recursive definition of the BST directly.

get(k, node){
	if (node == null){
		return null;
	}
	if (node.key == k){
		return node;
	}
	if (node.key > k){
		return get(k, node.left);
	}
	else{
		return get(k, node.right);
	}
}
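
For readers who want to run it, the pseudocode above translates almost line for line into Python. This is only a sketch: the `Node` class is an assumed helper, and `None` plays the role of null.

```python
class Node:
    """Hypothetical node type with the fields used by the pseudocode."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def get(k, node):
    """Return the node holding key k, or None if k is not in the sub-tree."""
    if node is None:          # fell off the tree: k is absent
        return None
    if node.key == k:         # found it
        return node
    if node.key > k:          # k can only be in the left sub-tree
        return get(k, node.left)
    return get(k, node.right) # otherwise k can only be on the right


root = Node(5, Node(3, Node(1), Node(4)), Node(8))
```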

Put(Insert)

Main Idea

We recursively (inductively) traverse down the tree until we fall off it at a null link, and attach the new node there. The new node therefore always ends up as a leaf.

Pseudocode

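put(newNode, node){
	if (node == null){
		return newNode; // base case: empty spot found, newNode becomes the sub-tree here
	}
	if (node.key > newNode.key){
		node.left = put(newNode, node.left);
		return node;
	}
	if (node.key < newNode.key){
		node.right = put(newNode, node.right);
		return node;
	}
	else{
		return node; // key already present: leave the tree unchanged so keys stay strictly ordered
	}
}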

Implementation Explained

When it comes to operations that manipulate the tree nodes, rather than simply retrieving them as get() does, a common strategy is to follow the recursive nature of the tree by defining these methods recursively as well. Not only that: at each level of the recursion, we return the entire sub-tree and connect it to the intended parent. For example, observe the following two lines:
node.left = put(newNode, node.left);
return node;
The returning and assigning clearly make sense once we have found the place to change the tree (in this case, to insert newNode): we simply return newNode and it is assigned to the parent's pointer. But in these two lines we have not yet reached the insertion point, and we still return node exactly as it was.

The reason for this seemingly redundant operation gets revealed if you think of the return statement as returning a sub-tree instead of a node, just like how you would construct a tree recursively:

  • Think of the base case first: When we have found the correct place to insert the newNode, we return newNode as if we returned a one-element tree;

  • Then the step case comes: We take the parent and the inserted node together as a sub-tree to return to the next parent, which then forms another larger sub-tree;

  • This continues back up to the root node, where the operation was initiated.

For a straightforward operation like Insert, the benefit of the recursive return definition is not very obvious, because a simple procedural loop leads you directly to your destination, i.e. the null link at the bottom of the tree. The real payoff shows up in deleteMin() and, more importantly, in delete().
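
For comparison, the procedural loop mentioned above might look like the following Python sketch. The function name `put_iterative` and the `Node` class are assumptions for illustration, and duplicates are ignored as a simplification.

```python
class Node:
    """Hypothetical node type with the fields used by the pseudocode."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def put_iterative(root, key):
    """Insert key by walking down with a loop; return the root of the tree."""
    new_node = Node(key)
    if root is None:
        return new_node
    node = root
    while True:
        if key < node.key:
            if node.left is None:      # destination reached: attach on the left
                node.left = new_node
                return root
            node = node.left
        elif key > node.key:
            if node.right is None:     # destination reached: attach on the right
                node.right = new_node
                return root
            node = node.right
        else:
            return root                # duplicate key: ignored in this sketch


root = None
for k in [5, 3, 8, 4]:
    root = put_iterative(root, k)
```

The loop has to look one link ahead so that it still holds the parent when the empty spot is found. The recursive version avoids this bookkeeping because the parent performs the assignment itself.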

Floor

Like searching, floor() and ceiling() only retrieve data rather than modify the tree, so they exploit only the ordering of the elements rather than the recursive return-and-assign pattern, and we won’t go into the pseudocode in detail.

floor() is expected to return the largest element smaller than or equal to the input key. If the key itself is found on the way down, it is the answer; otherwise, as we traverse down the tree, we keep track of the last element from which a right turn occurs.

Ceiling

ceiling() is expected to return the smallest element larger than or equal to the input key. If the key itself is found on the way down, it is the answer; otherwise, as we traverse down the tree, we keep track of the last element from which a left turn occurs.
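
The "remember the last turn" idea can be sketched in Python as follows. The `Node` class is an assumed helper, and the sketch uses the common convention that an exact match counts as its own floor and ceiling. For a strict variant, the equality branch would instead turn left in `floor` and right in `ceiling`.

```python
class Node:
    """Hypothetical node type with the fields used by the pseudocode."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def floor(k, node):
    """Largest key <= k in the sub-tree, or None if there is none."""
    best = None
    while node is not None:
        if node.key == k:
            return node.key
        if node.key < k:
            best = node.key      # right turn: node.key is the best candidate so far
            node = node.right
        else:
            node = node.left     # left turn: node.key is too large
    return best


def ceiling(k, node):
    """Smallest key >= k in the sub-tree, or None if there is none."""
    best = None
    while node is not None:
        if node.key == k:
            return node.key
        if node.key > k:
            best = node.key      # left turn: node.key is the best candidate so far
            node = node.left
        else:
            node = node.right    # right turn: node.key is too small
    return best


root = Node(5, Node(3, Node(1), Node(4)), Node(8, Node(7), None))
```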

Delete Minimum

Main Idea

Starting from the root node, we repeatedly move left until we reach the leftmost node, delete it, and promote its right child as its replacement (null if it doesn’t have one).

Pseudocode

delMin(node){
	if (node.left == null){
		return node.right; // node is the minimum: its right child (or null) takes its place
	}
	else{
		node.left = delMin(node.left);
		return node;
	}
}

Implementation Explained

Notice how similar the implementation of delMin() is to that of put()!
Again, try to think of this in an inductive way:

  • Base case: The node with no left child has been found; return its right child (possibly null) as the sub-tree that takes its place;
  • Step case: The parent, together with the returned sub-tree and its own untouched right sub-tree, forms a sub-tree that is handed back to its own parent.

Notice also that we used the word repeatedly in the Main Idea section. This implies that, since we have a fixed destination, this method could also be implemented in a procedural, looping way. We still chose the recursive definition to prepare you for the final method:

Delete

Main Idea

Deleting a node is a little more complicated than the earlier methods. It’s simple if the node has no children: just return null. It’s also not very difficult if the node has one child: just return the child as the replacement. Things become trickier when the node has two children: you then need to find a suitable replacement for it.

Fortunately, the solution is not that complicated. One way to do this exploits the ordering of the elements: if you look at the right-hand sub-tree, it is evident that all the elements in that sub-tree are strictly larger than the elements in the left-hand sub-tree. Hence, we only need to take the minimum of the right-hand sub-tree as the replacement, because it is strictly smaller than all the other elements in the right-hand sub-tree (the point of being the minimum), and strictly larger than all the elements in the left-hand sub-tree (it comes from the right-hand sub-tree), which is all we want from an ideal parent node.

Note: Of course this would work just as well if you chose the maximum of the left-hand sub-tree. For simplicity we’ll use the approach described here.

Hence, the most complicated case of this algorithm is divided into three steps:

  1. Find the minimum in the right-hand sub-tree. For simplicity we call it the candidate node.
  2. Delete the minimum element from the right-hand sub-tree.
  3. Replace the node to be deleted with the candidate node by attaching all of its links (both sub-trees) to the candidate.

Next we’ll see how this algorithm is best performed in a recursive way.

Pseudocode

del(k, node){
	// Step case - search for the node and rebuild the sub-tree on the way back up
	if (node == null){ // k is not present in this sub-tree
		return null; // nothing to change so return the same null
	}
	if (node.key > k){
		node.left = del(k, node.left); // obtain the post-del left subtree of node
		return node; // pack node and its subtrees together as a new subtree for the parent
	}
	if (node.key < k){
		node.right = del(k, node.right);
		return node;
	}
	// Base case - node is found
	else{
		// Base case 1 - no children
		if ((node.left == null) && (node.right == null)){
			return null; // node is directly deleted from the tree
		}
		// Base case 2a - no left child
		if (node.left == null){
			return node.right; // delegate the right child as replacement
		}
		// Base case 2b - no right child
		if (node.right == null){
			return node.left;
		}
		// Base case 3 - two children
		else{
			temp = node; // keep a reference to the node being deleted, and so to both of its sub-trees
			node = min(temp.right); // reuse the name node for the candidate: the minimum of the right-hand subtree
			node.right = delMin(temp.right); // use temp.right here: node now refers to the candidate, so node.right would be the candidate's own child
			node.left = temp.left; // attach the untouched left-hand subtree to the left-hand side of the candidate node
			return node; // return the post-del sub-tree to its parent
		}
	}
}

Implementation Explained

By now you should already be familiar with the recursive return and assignment. A slightly less intuitive statement is probably the one that assigns the result of delMin() to node.right. To understand this line, first note that calling delMin() on the right-hand sub-tree of the node being deleted returns a whole sub-tree: the result of deleting the minimum element from it. The assignment therefore links the original right-hand sub-tree, minus its minimum element, to the candidate node, establishing one of its two links.

Problems

An uncontrolled binary search tree usually ends up with a ragged layout. With keys inserted in random order, the methods above take O(log N) time on average (roughly c log N comparisons for a small constant c). In the worst case, however, inserting elements in ascending or descending order degrades the unbalanced tree to a linear structure, and every method with it to linear time. Deletion makes matters worse: even when alternating between the “min from the right” and “max from the left” approaches, a long sequence of insertions and deletions only gives O(sqrt(N)) average performance per operation, which is still unaffordable when N gets large. Hence, we need a mechanism to control the shape of a binary search tree.
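
The degradation caused by sorted input is easy to observe with a small experiment. The following Python sketch compares the height of a tree built from ascending keys with one built from the same keys in shuffled order. The helper names, the key count of 200 and the fixed random seed are all arbitrary choices made for illustration.

```python
import random


class Node:
    """Hypothetical node type with the fields used by the pseudocode."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def put(root, key):
    """Iterative insert (duplicates ignored); returns the root of the tree."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        elif key > node.key:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right
        else:
            return root


def height(root):
    """Number of nodes on the longest root-to-leaf path (0 for an empty tree)."""
    best = 0
    stack = [(root, 1)] if root is not None else []
    while stack:
        node, depth = stack.pop()
        best = max(best, depth)
        for child in (node.left, node.right):
            if child is not None:
                stack.append((child, depth + 1))
    return best


def build(keys):
    """Insert the keys one by one into an initially empty tree."""
    root = None
    for k in keys:
        root = put(root, k)
    return root


n = 200
sorted_height = height(build(range(n)))      # ascending inserts: one long right spine

keys = list(range(n))
random.Random(0).shuffle(keys)               # fixed seed so the run is repeatable
shuffled_height = height(build(keys))
```

With ascending input every node has only a right child, so `sorted_height` equals `n`, while the shuffled tree is far shallower, although it is not guaranteed to be perfectly balanced.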
