红黑树

用途和好处[编辑]

红黑树和AVL树一样都对插入时间、删除时间和查找时间提供了最好可能的最坏情况担保。这不只是使它们在时间敏感的应用如实时应用(real time application)中有价值,而且使它们有在提供最坏情况担保的其他数据结构中作为建造板块的价值;例如,在计算几何中使用的很多数据结构都可以基于红黑树。

红黑树在函数式编程中也特别有用,在这里它们是最常用的持久数据结构(persistent data structure)之一,它们用来构造关联数组集合,每次插入、删除之后它们能保持为以前的版本。除了O(log n)的时间之外,红黑树的持久版本对每次插入或删除需要O(log n)的空间。

红黑树是 2-3-4树的一种等同。换句话说,对于每个 2-3-4 树,都存在至少一个数据元素是同样次序的红黑树。在 2-3-4 树上的插入和删除操作也等同于在红黑树中颜色翻转和旋转。这使得 2-3-4 树成为理解红黑树背后的逻辑的重要工具,这也是很多介绍算法的教科书在红黑树之前介绍 2-3-4 树的原因,尽管 2-3-4 树在实践中不经常使用。

性质[编辑]

红黑树是每个节点都带有颜色属性的二叉查找树,颜色为红色黑色。在二叉查找树强制一般要求以外,对于任何有效的红黑树我们增加了如下的额外要求:

性质1. 节点是红色或黑色。

性质2. 根是黑色。

性质3. 所有叶子都是黑色(叶子是NIL节点)。

性质4. 每个红色节点必须有两个黑色的子节点。(从每个叶子到根的所有路径上不能有两个连续的红色节点。)

性质5. 从任一节点到其每个叶子的所有简单路径都包含相同数目的黑色节点。

下面是一个具体的红黑树的图例:

An example of a red-black tree

这些约束确保了红黑树的关键特性: 从根到叶子的最长的可能路径不多于最短的可能路径的两倍长。结果是这个树大致上是平衡的。因为操作比如插入、删除和查找某个值的最坏情况时间都要求与树的高度成比例,这个在高度上的理论上限允许红黑树在最坏情况下都是高效的,而不同于普通的二叉查找树

要知道为什么这些性质确保了这个结果,注意到性质4导致了路径不能有两个毗连的红色节点就足够了。最短的可能路径都是黑色节点,最长的可能路径有交替的红色和黑色节点。因为根据性质5所有最长的路径都有相同数目的黑色节点,这就表明了没有路径能多于任何其他路径的两倍长。



Read-only operations on a red–black tree require no modification from those used for binary search trees, because every red–black tree is a special case of a simple binary search tree. However, the immediate result of an insertion or removal may violate the properties of a red–black tree. Restoring the red–black properties requires a small number (O(log n) or amortized O(1)) of color changes (which are very quick in practice) and no more than three tree rotations (two for insertion). Although insert and delete operations are complicated, their times remain O(log n).

Insertion[edit]

Insertion begins by adding the node as any binary search tree insertion does and by coloring it red. Whereas in the binary search tree, we always add a leaf, in the red–black tree, leaves contain no information, so instead we add a red interior node, with two black leaves, in place of an existing black leaf.

What happens next depends on the color of other nearby nodes. The term uncle node will be used to refer to the sibling of a node's parent, as in human family trees. Note that:

  • property 3 (all leaves are black) always holds.
  • property 4 (both children of every red node are black) is threatened only by adding a red node, repainting a black node red, or a rotation.
  • property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes) is threatened only by adding a black node, repainting a red node black (or vice versa), or a rotation.
Notes
  1. The label N will be used to denote the current node (colored red). In the diagrams N carries a blue contour. At the beginning, this is the new node being inserted, but the entire procedure may also be applied recursively to other nodes (see case 3). P will denote N's parent node, G will denote N's grandparent, and U will denote N's uncle. In between some cases, the roles and labels of the nodes are exchanged, but in each case, every label continues to represent the same node it represented at the beginning of the case.
  2. If a node in the right (target) half of a diagram carries a blue contour it will become the current node in the next iteration and there the other nodes will be newly assigned relative to it. Any color shown in the diagram is either assumed in its case or implied by those assumptions.
  3. A numbered triangle represents a subtree of unspecified depth. A black circle atop a triangle means that subtree has one more black node than a triangle without this circle..

There are several cases of red–black tree insertion to handle:

  • N is the root node, i.e., first node of red–black tree
  • N's parent (P) is black
  • N's parent (P) and uncle (U) are red
  • N is added to right of left child of grandparent, or N is added to left of right child of grandparent (P is red and U is black)
  • N is added to left of left child of grandparent, or N is added to right of right child of grandparent (P is red and U is black)

Each case will be demonstrated with example C code. The uncle and grandparent nodes can be found by these functions:

struct node *grandparent(struct node *n)
{
 if ((n != NULL) && (n->parent != NULL))
  return n->parent->parent;
 else
  return NULL;
}

struct node *uncle(struct node *n)
{
 struct node *g = grandparent(n);
 if (g == NULL)
  return NULL; // No grandparent means no uncle
 if (n->parent == g->left)
  return g->right;
 else
  return g->left;
}

Case 1: The current node N is at the root of the tree. In this case, it is repainted black to satisfy property 2 (the root is black). Since this adds one black node to every path at once, property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes) is not violated.

void insert_case1(struct node *n)
{
 if (n->parent == NULL)
  n->color = BLACK;
 else
  insert_case2(n);
}

Case 2: The current node's parent P is black, so property 4 (both children of every red node are black) is not invalidated. In this case, the tree is still valid. Property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes) is not threatened, because the current node N has two black leaf children, but because N is red, the paths through each of its children have the same number of black nodes as the path through the leaf it replaced, which was black, and so this property remains satisfied.

void insert_case2(struct node *n)
{
 if (n->parent->color == BLACK)
  return; /* Tree is still valid */
 else
  insert_case3(n);
}
Note: In the following cases it can be assumed that  N has a grandparent node  G, because its parent  P is red, and if it were the root, it would be black. Thus,  N also has an uncle node  U, although it may be a leaf in cases 4 and 5.
Diagram of case 3

Case 3: If both the parent P and the uncle U are red, then both of them can be repainted black and the grandparent G becomes red (to maintain property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes)). Now, the current red node N has a black parent. Since any path through the parent or uncle must pass through the grandparent, the number of black nodes on these paths has not changed. However, the grandparent G may now violate properties 2 (The root is black) or 4 (Both children of every red node are black) (property 4 possibly being violated since G may have a red parent). To fix this, the entire procedure is recursively performed on G from case 1. Note that this is a tail-recursive call, so it could be rewritten as a loop; since this is the only loop, and any rotations occur after this loop, this proves that a constant number of rotations occur.

void insert_case3(struct node *n)
{
 struct node *u = uncle(n), *g;

 if ((u != NULL) && (u->color == RED)) {
  n->parent->color = BLACK;
  u->color = BLACK;
  g = grandparent(n);
  g->color = RED;
  insert_case1(g);
 } else {
  insert_case4(n);
 }
}
Note: In the remaining cases, it is assumed that the parent node  P is the left child of its parent. If it is the right child,  left and  right should be reversed throughout cases 4 and 5. The code samples take care of this.
Diagram of case 4

Case 4: The parent P is red but the uncle U is black; also, the current node N is the right child of P, and P in turn is the left child of its parent G. In this case, a left rotation on P that switches the roles of the current node N and its parent P can be performed; then, the former parent node P is dealt with using case 5 (relabeling N and P) because property 4 (both children of every red node are black) is still violated. The rotation causes some paths (those in the sub-tree labelled "1") to pass through the node N where they did not before. It also causes some paths (those in the sub-tree labelled "3") not to pass through the node P where they did before. However, both of these nodes are red, so property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes) is not violated by the rotation. After this case has been completed, property 4 (both children of every red node are black) is still violated, but now we can resolve this by continuing to case 5.

void insert_case4(struct node *n)
{
 struct node *g = grandparent(n);

 if ((n == n->parent->right) && (n->parent == g->left)) {
  rotate_left(n->parent);

 /*
 * rotate_left can be the below because of already having *g =  grandparent(n) 
 *
 * struct node *saved_p=g->left, *saved_left_n=n->left;
 * g->left=n; 
 * n->left=saved_p;
 * saved_p->right=saved_left_n;
 * 
 * and modify the parent's nodes properly
 */

  n = n->left;

 } else if ((n == n->parent->left) && (n->parent == g->right)) {
  rotate_right(n->parent);

 /*
 * rotate_right can be the below to take advantage of already having *g =  grandparent(n) 
 *
 * struct node *saved_p=g->right, *saved_right_n=n->right;
 * g->right=n; 
 * n->right=saved_p;
 * saved_p->left=saved_right_n;
 * 
 */

  n = n->right; 
 }
 insert_case5(n);
}
Diagram of case 5

Case 5: The parent P is red but the uncle U is black, the current node N is the left child of P, and P is the left child of its parent G. In this case, a right rotation on G is performed; the result is a tree where the former parent P is now the parent of both the current node N and the former grandparent GG is known to be black, since its former child P could not have been red otherwise (without violating property 4). Then, the colors of P and G are switched, and the resulting tree satisfies property 4 (both children of every red node are black). Property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes) also remains satisfied, since all paths that went through any of these three nodes went through G before, and now they all go through P. In each case, this is the only black node of the three.

void insert_case5(struct node *n)
{
 struct node *g = grandparent(n);

 n->parent->color = BLACK;
 g->color = RED;
 if (n == n->parent->left)
  rotate_right(g);
 else
  rotate_left(g);
}

Note that inserting is actually in-place, since all the calls above use tail recursion.

Removal[edit]

In a regular binary search tree when deleting a node with two non-leaf children, we find either the maximum element in its left subtree (which is the in-order predecessor) or the minimum element in its right subtree (which is the in-order successor) and move its value into the node being deleted (as shown here). We then delete the node we copied the value from, which must have fewer than two non-leaf children. (Non-leaf children, rather than all children, are specified here because unlike normal binary search trees, red–black trees can have leaf nodes anywhere, so that all nodes are either internal nodes with two children or leaf nodes with, by definition, zero children. In effect, internal nodes having two leaf children in a red–black tree are like the leaf nodes in a regular binary search tree.) Because merely copying a value does not violate any red–black properties, this reduces to the problem of deleting a node with at most one non-leaf child. Once we have solved that problem, the solution applies equally to the case where the node we originally want to delete has at most one non-leaf child as to the case just considered where it has two non-leaf children.

Therefore, for the remainder of this discussion we address the deletion of a node with at most one non-leaf child. We use the label M to denote the node to be deleted;C will denote a selected child of M, which we will also call "its child". If M does have a non-leaf child, call that its child, C; otherwise, choose either leaf as its child, C.

If M is a red node, we simply replace it with its child C, which must be black by property 4. (This can only occur when M has two leaf children, because if the red node Mhad a black non-leaf child on one side but just a leaf child on the other side, then the count of black nodes on both sides would be different, thus the tree would violate property 5.) All paths through the deleted node will simply pass through one fewer red node, and both the deleted node's parent and child must be black, so property 3 (all leaves are black) and property 4 (both children of every red node are black) still hold.

Another simple case is when M is black and C is red. Simply removing a black node could break Properties 4 (“Both children of every red node are black”) and 5 (“All paths from any given node to its leaf nodes contain the same number of black nodes”), but if we repaint C black, both of these properties are preserved.

The complex case is when both M and C are black. (This can only occur when deleting a black node which has two leaf children, because if the black node M had a black non-leaf child on one side but just a leaf child on the other side, then the count of black nodes on both sides would be different, thus the tree would have been an invalid red–black tree by violation of property 5.) We begin by replacing M with its child C. We will relabel this child C (in its new position) N, and its sibling (its new parent's other child) S. (S was previously the sibling of M.) In the diagrams below, we will also use P for N's new parent (M's old parent), SL for S's left child, and SR forS's right child (S cannot be a leaf because if M and C were black, then P's one subtree which included M counted two black-height and thus P's other subtree which includes S must also count two black-height, which cannot be the case if S is a leaf node).

Notes
  1. The label N will be used to denote the current node (colored black). In the diagrams N carries a blue contour. At the beginning, this is the replacement node and a leaf, but the entire procedure may also be applied recursively to other nodes (see case 3). In between some cases, the roles and labels of the nodes are exchanged, but in each case, every label continues to represent the same node it represented at the beginning of the case.
  2. If a node in the right (target) half of a diagram carries a blue contour it will become the current node in the next iteration and there the other nodes will be newly assigned relative to it. Any color shown in the diagram is either assumed in its case or implied by those assumptions. White represents an arbitrary color (either red or black), but the same in both halves of the diagram.
  3. A numbered triangle represents a subtree of unspecified depth. A black circle atop a triangle means that subtree has one more black node than a triangle without this circle.

We will find the sibling using this function:

struct node *sibling(struct node *n)
{
 if (n == n->parent->left)
  return n->parent->right;
 else
  return n->parent->left;
}
Note: In order that the tree remains well-defined, we need that every null leaf remains a leaf after all transformations (that it will not have any children). If the node we are deleting has a non-leaf (non-null) child  N, it is easy to see that the property is satisfied. If, on the other hand,  N would be a null leaf, it can be verified from the diagrams (or code) for all the cases that the property is satisfied as well.

We can perform the steps outlined above with the following code, where the function replace_node substitutes child into n's place in the tree. For convenience, code in this section will assume that null leaves are represented by actual node objects rather than NULL (the code in the Insertion section works with either representation).

void delete_one_child(struct node *n)
{
 /*
  * Precondition: n has at most one non-null child.
  */
 struct node *child = is_leaf(n->right) ? n->left : n->right;

 replace_node(n, child);
 if (n->color == BLACK) {
  if (child->color == RED)
   child->color = BLACK;
  else
   delete_case1(child);
 }
 free(n);
}
Note: If  N is a null leaf and we do not want to represent null leaves as actual node objects, we can modify the algorithm by first calling delete_case1() on its parent (the node that we delete,  n in the code above) and deleting it afterwards. We can do this because the parent is black, so it behaves in the same way as a null leaf (and is sometimes called a 'phantom' leaf). And we can safely delete it at the end as  n will remain a leaf after all operations, as shown above.

If both N and its original parent are black, then deleting this original parent causes paths which proceed through N to have one fewer black node than paths that do not. As this violates property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes), the tree must be rebalanced. There are several cases to consider:

Case 1: N is the new root. In this case, we are done. We removed one black node from every path, and the new root is black, so the properties are preserved.

void delete_case1(struct node *n)
{
 if (n->parent != NULL)
  delete_case2(n);
}
Note: In cases 2, 5, and 6, we assume  N is the left child of its parent  P. If it is the right child,  left and  right should be reversed throughout these three cases. Again, the code examples take both cases into account.
Diagram of case 2

Case 2: S is red. In this case we reverse the colors of P and S, and then rotate left at P, turning S into N's grandparent. Note that P has to be black as it had a red child. The resulting subtree has a path short one black node so we are not done. Now N has a black sibling and a red parent, so we can proceed to step 4, 5, or 6. (Its new sibling is black because it was once the child of the red S.) In later cases, we will relabel N's new sibling asS.

void delete_case2(struct node *n)
{
 struct node *s = sibling(n);

 if (s->color == RED) {
  n->parent->color = RED;
  s->color = BLACK;
  if (n == n->parent->left)
   rotate_left(n->parent);
  else
   rotate_right(n->parent);
 }
 delete_case3(n);
}
Diagram of case 3

Case 3: PS, and S's children are black. In this case, we simply repaint S red. The result is that all paths passing through S, which are precisely those paths not passing through N, have one less black node. Because deleting N's original parent made all paths passing through N have one less black node, this evens things up. However, all paths through P now have one fewer black node than paths that do not pass through P, so property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes) is still violated. To correct this, we perform the rebalancing procedure on P, starting at case 1.

void delete_case3(struct node *n)
{
 struct node *s = sibling(n);

 if ((n->parent->color == BLACK) &&
     (s->color == BLACK) &&
     (s->left->color == BLACK) &&
     (s->right->color == BLACK)) {
  s->color = RED;
  delete_case1(n->parent);
 } else
  delete_case4(n);
}
Diagram of case 4

Case 4: S and S's children are black, but P is red. In this case, we simply exchange the colors of S and P. This does not affect the number of black nodes on paths going through S, but it does add one to the number of black nodes on paths going through N, making up for the deleted black node on those paths.

void delete_case4(struct node *n)
{
 struct node *s = sibling(n);

 if ((n->parent->color == RED) &&
     (s->color == BLACK) &&
     (s->left->color == BLACK) &&
     (s->right->color == BLACK)) {
  s->color = RED;
  n->parent->color = BLACK;
 } else
  delete_case5(n);
}
Diagram of case 5

Case 5: S is black, S's left child is red, S's right child is black, and N is the left child of its parent. In this case we rotate right atS, so that S's left child becomes S's parent and N's new sibling. We then exchange the colors of S and its new parent. All paths still have the same number of black nodes, but now N has a black sibling whose right child is red, so we fall into case 6. Neither N nor its parent are affected by this transformation. (Again, for case 6, we relabel N's new sibling as S.)

void delete_case5(struct node *n)
{
 struct node *s = sibling(n);

 if  (s->color == BLACK) { /* this if statement is trivial,
due to case 2 (even though case 2 changed the sibling to a sibling's child,
the sibling's child can't be red, since no red parent can have a red child). */
/* the following statements just force the red to be on the left of the left of the parent,
   or right of the right, so case six will rotate correctly. */
  if ((n == n->parent->left) &&
      (s->right->color == BLACK) &&
      (s->left->color == RED)) { /* this last test is trivial too due to cases 2-4. */
   s->color = RED;
   s->left->color = BLACK;
   rotate_right(s);
  } else if ((n == n->parent->right) &&
             (s->left->color == BLACK) &&
             (s->right->color == RED)) {/* this last test is trivial too due to cases 2-4. */
   s->color = RED;
   s->right->color = BLACK;
   rotate_left(s);
  }
 }
 delete_case6(n);
}
Diagram of case 6

Case 6: S is black, S's right child is red, and N is the left child of its parent P. In this case we rotate left at P, so that S becomes the parent of P and S's right child. We then exchange the colors of P and S, and make S's right child black. The subtree still has the same color at its root, so Properties 4 (Both children of every red node are black) and 5 (All paths from any given node to its leaf nodes contain the same number of black nodes) are not violated. However, N now has one additional black ancestor: either P has become black, or it was black and Swas added as a black grandparent. Thus, the paths passing through N pass through one additional black node.

Meanwhile, if a path does not go through N, then there are two possibilities:

  1. It goes through N's new sibling SL, a node with arbitrary color and the root of the subtree labeled 3 (s. diagram). Then, it must go through S and P, both formerly and currently, as they have only exchanged colors and places. Thus the path contains the same number of black nodes.
  2. It goes through N's new uncle, S's right child. Then, it formerly went through SS's parent, and S's right child SR (which was red), but now only goes through S, which has assumed the color of its former parent, and S's right child, which has changed from red to black (assuming S's color: black). The net effect is that this path goes through the same number of black nodes.

Either way, the number of black nodes on these paths does not change. Thus, we have restored Properties 4 (Both children of every red node are black) and 5 (All paths from any given node to its leaf nodes contain the same number of black nodes). The white node in the diagram can be either red or black, but must refer to the same color both before and after the transformation.

void delete_case6(struct node *n)
{
 struct node *s = sibling(n);

 s->color = n->parent->color;
 n->parent->color = BLACK;

 if (n == n->parent->left) {
  s->right->color = BLACK;
  rotate_left(n->parent);
 } else {
  s->left->color = BLACK;
  rotate_right(n->parent);
 }
}

Again, the function calls all use tail recursion, so the algorithm is in-place. In the algorithm above, all cases are chained in order, except in delete case 3 where it can recurse to case 1 back to the parent node: this is the only case where an in-place implementation will effectively loop (after only one rotation in case 3).

Additionally, no tail recursion ever occurs on a child node, so the tail recursion loop can only move from a child back to its successive ancestors. No more than O(log n)loops back to case 1 will occur (where n is the total number of nodes in the tree before deletion). If a rotation occurs in case 2 (which is the only possibility of rotation within the loop of cases 1–3), then the parent of the node N becomes red after the rotation and we will exit the loop. Therefore, at most one rotation will occur within this loop. Since no more than two additional rotations will occur after exiting the loop, at most three rotations occur in total.


  • 1
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值