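Definition of a Binary Search Tree
A binary search tree (BST) is organized as a binary tree. Besides its key, each node stores links to its left child, right child, and parent. A BST satisfies the following property: for any node x, the largest key in x's left subtree is <= x.key, and the smallest key in x's right subtree is >= x.key. This relationship is illustrated as follows:
Following the definition illustrated above, an example of a binary search tree is: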
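As one concrete illustration, the property can be checked mechanically. The sketch below is not from the original post: the `Node` class and the `is_bst` helper are hypothetical names, and the two sample trees are made up for the example. It walks the tree while narrowing the range of keys each subtree is allowed to contain.

```python
class Node:
    """Hypothetical minimal node: a key plus left/right/parent links."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.parent = None
        for child in (left, right):
            if child is not None:
                child.parent = self


def is_bst(node, low=None, high=None):
    """Return True if every key in this subtree lies in [low, high]
    and both subtrees satisfy the same property recursively."""
    if node is None:
        return True
    if low is not None and node.key < low:
        return False
    if high is not None and node.key > high:
        return False
    # Keys on the left may not exceed node.key; keys on the right may not be below it
    return (is_bst(node.left, low, node.key)
            and is_bst(node.right, node.key, high))


# A tree that satisfies the property (duplicate key 5 is allowed by <= / >=)
valid = Node(6, Node(5, Node(2), Node(5)), Node(7, None, Node(8)))
# 9 sits in the left subtree of 6, which violates the property
invalid = Node(6, Node(5, Node(2), Node(9)), Node(7))

print(is_bst(valid), is_bst(invalid))
```

Passing the bounds down matters: comparing each node only with its direct children would accept the second tree, even though 9 is larger than the root 6.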
Binary Search Tree Operations
- Search
- Insertion
- Deletion
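Search
Searching a binary search tree proceeds recursively and follows directly from the definition: smaller keys live in the left subtree and larger keys in the right subtree. At each node, the search key is compared with the node's key to decide which subtree to descend into. A complete search is illustrated below:
It can be written as pseudocode: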
TREE-SEARCH(x, k)
    if x == NULL or k == x.key
        return x
    if k < x.key
        return TREE-SEARCH(x.left, k)
    else
        return TREE-SEARCH(x.right, k)
Translated into Python:
    def _get(self, key, node):
        # Reached the end of a branch: the key is not in the tree
        if node is None:
            return None
        if key < node.key:
            return self._get(key, node.left)
        elif key > node.key:
            return self._get(key, node.right)
        else:
            return node.val

    def get(self, key):
        """
        Return the value paired with 'key'
        Worst Case Complexity: O(N)
        Balanced Tree Complexity: O(lg N)
        """
        return self._get(key, self.root)
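Because each recursive call is the last thing the search does, the same lookup can also be written as a loop, which avoids deep recursion on tall trees. The following is a minimal sketch under assumptions: `Node` and `get_iterative` are hypothetical names introduced here and are not part of the class shown in the post.

```python
class Node:
    """Hypothetical key-value node, for illustration only."""

    def __init__(self, key, val):
        self.key = key
        self.val = val
        self.left = None
        self.right = None


def get_iterative(root, key):
    """Walk down from the root, going left or right by comparison.
    Returns the stored value, or None if the key is absent."""
    node = root
    while node is not None:
        if key < node.key:
            node = node.left
        elif key > node.key:
            node = node.right
        else:
            return node.val
    return None


# Small hand-built tree:   5
#                         / \
#                        3   8
#                         \
#                          4
root = Node(5, "five")
root.left = Node(3, "three")
root.right = Node(8, "eight")
root.left.right = Node(4, "four")

print(get_iterative(root, 4))
print(get_iterative(root, 7))
```

The first lookup follows the path 5, 3, 4 and finds the key; the second runs off the end of a branch and returns `None`.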
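Insertion
Insertion and deletion are slightly more involved than search, because they change the size of the tree and therefore modify the structure of the dynamic set it represents. Insertion is the easier of the two to implement. It consists of two steps:
- Search for the position where the new node belongs
- Update the links around that position to attach the new node
The insertion process is illustrated below:
The corresponding pseudocode follows. The input is a node z with z.key = v, z.left = NULL, and z.right = NULL.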
TREE-INSERT(T, z)
    y = NULL
    x = T.root              # start at the root
    while x != NULL
        y = x               # remember the previous node
        if z.key < x.key    # go left
            x = x.left
        else                # go right
            x = x.right
    z.p = y                 # y becomes the parent of z
    if y == NULL            # tree T was empty
        T.root = z
    else if z.key < y.key
        y.left = z
    else
        y.right = z
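The pseudocode can be transcribed into Python almost line for line. The sketch below is an illustration rather than the post's implementation: `Node`, `Tree`, and `tree_insert` are hypothetical names chosen to mirror the pseudocode, including the parent link `p`.

```python
class Node:
    """Hypothetical node with a parent link, mirroring z.p in the pseudocode."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.p = None


class Tree:
    """Hypothetical container holding only the root reference (T.root)."""

    def __init__(self):
        self.root = None


def tree_insert(T, z):
    y = None          # trailing pointer: parent of x
    x = T.root        # start at the root
    while x is not None:
        y = x
        x = x.left if z.key < x.key else x.right
    z.p = y
    if y is None:     # the tree was empty
        T.root = z
    elif z.key < y.key:
        y.left = z
    else:
        y.right = z


T = Tree()
for k in [12, 5, 18, 2, 9]:
    tree_insert(T, Node(k))

print(T.root.key, T.root.left.key, T.root.right.key)
```

Note that this version sends equal keys to the right subtree, as the pseudocode does, instead of overwriting an existing entry.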
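Running time depends on the shape of the tree
Insertion walks a single path from the root down to a leaf position, so it runs in O(h) time, where h is the height of the binary search tree. The shape of the tree therefore directly determines how fast the algorithm runs.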
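To make the effect of shape concrete, the sketch below inserts the same seven keys in two different orders and compares the resulting number of levels. It is an illustration under assumptions: `Node`, `insert`, `height`, and `build` are hypothetical helpers, and `height` here counts levels of nodes rather than edges.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Iterative insert; returns the root (a new one if the tree was empty)."""
    new = Node(key)
    if root is None:
        return new
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = new
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = new
                return root
            node = node.right


def height(root):
    """Number of levels in the tree, counted level by level (0 for an empty tree)."""
    level = [] if root is None else [root]
    levels = 0
    while level:
        levels += 1
        level = [c for n in level for c in (n.left, n.right) if c is not None]
    return levels


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


sorted_keys = [1, 2, 3, 4, 5, 6, 7]      # every key goes right: a chain
balanced_order = [4, 2, 6, 1, 3, 5, 7]   # medians first: a full tree

print(height(build(sorted_keys)))     # 7
print(height(build(balanced_order)))  # 3
```

With sorted input the tree degenerates into a linked list, so each insertion costs time proportional to the number of keys; with the second order the height grows only logarithmically.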
A Python implementation:
    def _put(self, key, val, node):
        # If we hit the end of a branch, create a new node
        if node is None:
            return Node(key, val)
        # Follow left branch
        if key < node.key:
            node.left = self._put(key, val, node.left)
        # Follow right branch
        elif key > node.key:
            node.right = self._put(key, val, node.right)
        # Overwrite value
        else:
            node.val = val
        # Refresh the cached subtree size on the way back up
        node.size_of_subtree = self._size(node.left) + self._size(node.right) + 1
        return node

    def put(self, key, val):
        """
        Add a new key-value pair.
        Worst Case Complexity: O(N)
        Balanced Tree Complexity: O(lg N)
        """
        self.root = self._put(key, val, self.root)
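The methods above rely on a `Node` class and a `_size` helper that the post does not show. One plausible shape for them is sketched below; the field names follow the code above, but the class layout itself (including the `BST` name and the public `size` method) is an assumption.

```python
class Node:
    """Assumed node layout: key, value, two children, cached subtree size."""

    def __init__(self, key, val):
        self.key = key
        self.val = val
        self.left = None
        self.right = None
        self.size_of_subtree = 1  # a fresh node is a subtree of size 1


class BST:
    """Assumed container class; only the size bookkeeping is shown here."""

    def __init__(self):
        self.root = None

    def _size(self, node):
        # An empty subtree has size 0; otherwise use the cached count
        return 0 if node is None else node.size_of_subtree

    def size(self):
        return self._size(self.root)


# Hand-built two-node tree to show how the cached size is refreshed
tree = BST()
tree.root = Node("m", 1)
tree.root.left = Node("c", 2)
tree.root.size_of_subtree = (
    tree._size(tree.root.left) + tree._size(tree.root.right) + 1
)

print(tree.size())
```

Treating `None` as size 0 inside `_size` is what lets the update formula run without special cases for missing children.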
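Deletion
Deletion falls into three cases:
- If the node x to delete has no children, simply remove it.
- If x has one child, that child takes x's place.
- If x has two children, the case is more involved. The key is to find x's successor, which is the node with the smallest key in x's right subtree. The steps are:
  - Take the node x to delete and the binary search tree T as input.
  - Search x's right subtree: step right once, then keep going left until reaching the minimum node H.
  - Move H into x's position: H's own right child takes H's old place, x's right subtree becomes H's right child, and x's left child becomes H's left child.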
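The diagram below should make this clear:
Based on the description above, the deletion pseudocode can be split into two parts.
The first part moves subtrees around: it replaces one subtree with another, making the replacement a child of the original subtree's parent.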
TRANSPLANT(T, u, v)
    if u.p == NULL
        T.root = v
    else if u == u.p.left
        u.p.left = v
    else
        u.p.right = v
    if v != NULL
        v.p = u.p
The second part uses TRANSPLANT to carry out the deletion itself:
TREE-DELETE(T, z)
    if z.left == NULL
        TRANSPLANT(T, z, z.right)
    else if z.right == NULL
        TRANSPLANT(T, z, z.left)
    else
        y = TREE-MINIMUM(z.right)
        if y.p != z
            TRANSPLANT(T, y, y.right)
            y.right = z.right
            y.right.p = y
        TRANSPLANT(T, z, y)
        y.left = z.left
        y.left.p = y
A Python implementation:
    def _delete(self, key, node):
        if node is None:
            return None
        if key < node.key:
            node.left = self._delete(key, node.left)
        elif key > node.key:
            node.right = self._delete(key, node.right)
        else:
            if node.right is None:
                return node.left
            elif node.left is None:
                return node.right
            else:
                # Two children: replace the node with its successor,
                # the smallest node in the right subtree
                old_node = node
                node = self._ceiling_node(key, node.right)
                node.right = self._delete_min(old_node.right)
                node.left = old_node.left
        node.size_of_subtree = self._size(node.left) + self._size(node.right) + 1
        return node

    def _delete_min(self, node):
        if node.left is None:
            return node.right
        node.left = self._delete_min(node.left)
        node.size_of_subtree = self._size(node.left) + self._size(node.right) + 1
        return node

    def _ceiling_node(self, key, node):
        """
        Returns the node with the smallest key that is greater than or
        equal to the given value 'key'
        """
        if node is None:
            return None
        if key < node.key:
            # Ceiling is either in left subtree or is this node
            attempt_in_left = self._ceiling_node(key, node.left)
            if attempt_in_left is None:
                return node
            else:
                return attempt_in_left
        elif key > node.key:
            # Ceiling must be in right subtree
            return self._ceiling_node(key, node.right)
        else:
            # Keys are equal so ceiling is node with this key
            return node
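The recursive version above never touches parent links. For comparison, the TRANSPLANT and TREE-DELETE pseudocode can also be transcribed directly. The following is a sketch under assumptions: `Node`, `Tree`, `tree_insert`, `tree_minimum`, and `inorder` are hypothetical helpers written for this example, and the sample keys are made up.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.p = None


class Tree:
    def __init__(self):
        self.root = None


def tree_insert(T, z):
    y, x = None, T.root
    while x is not None:
        y = x
        x = x.left if z.key < x.key else x.right
    z.p = y
    if y is None:
        T.root = z
    elif z.key < y.key:
        y.left = z
    else:
        y.right = z


def tree_minimum(x):
    # The leftmost node of a subtree holds its smallest key
    while x.left is not None:
        x = x.left
    return x


def transplant(T, u, v):
    # Replace the subtree rooted at u with the subtree rooted at v
    if u.p is None:
        T.root = v
    elif u is u.p.left:
        u.p.left = v
    else:
        u.p.right = v
    if v is not None:
        v.p = u.p


def tree_delete(T, z):
    if z.left is None:
        transplant(T, z, z.right)
    elif z.right is None:
        transplant(T, z, z.left)
    else:
        y = tree_minimum(z.right)      # successor of z
        if y.p is not z:
            transplant(T, y, y.right)  # splice y out of its old position
            y.right = z.right
            y.right.p = y
        transplant(T, z, y)
        y.left = z.left
        y.left.p = y


def inorder(x):
    return [] if x is None else inorder(x.left) + [x.key] + inorder(x.right)


T = Tree()
nodes = {k: Node(k) for k in [15, 6, 18, 3, 7, 17, 20, 13, 9]}
for n in nodes.values():
    tree_insert(T, n)

tree_delete(T, nodes[6])    # two children; successor 7 is its right child
tree_delete(T, nodes[15])   # two children; successor 17 is deeper in the subtree
print(inorder(T.root))
```

An in-order walk after both deletions still yields the remaining keys in sorted order, which is a quick sanity check that the search-tree property survived.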
References
- Introduction to Algorithms, Third Edition (Cormen, Leiserson, Rivest, Stein)
- http://algs4.cs.princeton.edu/32bst/