Naive algorithm:
A disjoint-set data structure supports the following operations:
MakeSet(x) creates a singleton set {x}
Find(x) returns the ID of the set containing x:
  if x and y lie in the same set, then Find(x) = Find(y)
  otherwise, Find(x) ≠ Find(y)
Union(x, y) merges the two sets containing x and y
Preprocess(maze)
  for each cell c in maze:
    MakeSet(c)
  for each cell c in maze:
    for each neighbor n of c:
      Union(c, n)
IsReachable(A, B)
  return Find(A) = Find(B)
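The maze procedure above can be sketched in Python as follows. This is only an illustration under stated assumptions: the maze is taken to be a list of strings where '.' is an open cell and '#' is a wall, neighbors are the four adjacent open cells, and the helper names (make_sets, find, union, preprocess, is_reachable) are hypothetical. The disjoint-set part is a deliberately simple placeholder; any implementation of the three operations would do.

```python
def make_sets(cells):
    # MakeSet for every cell: each cell starts as its own representative.
    return {c: c for c in cells}

def find(parent, x):
    # Follow parent pointers up to the representative of x's set.
    while parent[x] != x:
        x = parent[x]
    return x

def union(parent, x, y):
    # Merge the sets containing x and y (no balancing; placeholder only).
    rx, ry = find(parent, x), find(parent, y)
    if rx != ry:
        parent[rx] = ry

def preprocess(maze):
    # Assumption: maze is a list of strings, '.' = open cell, '#' = wall.
    cells = [(r, c) for r, row in enumerate(maze)
             for c, ch in enumerate(row) if ch == '.']
    parent = make_sets(cells)
    for (r, c) in cells:
        # Checking only the cells below and to the right visits every
        # adjacent pair once, which is enough because Union is symmetric.
        for n in ((r + 1, c), (r, c + 1)):
            if n in parent:
                union(parent, (r, c), n)
    return parent

def is_reachable(parent, a, b):
    return find(parent, a) == find(parent, b)

maze = [
    "..#.",
    ".##.",
    "...#",
]
sets = preprocess(maze)
print(is_reachable(sets, (0, 0), (2, 2)))  # True: connected along the left column
print(is_reachable(sets, (0, 0), (0, 3)))  # False: the right column is walled off
```

After preprocessing, each reachability query costs only two Find calls.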
Use the smallest element of a set as its ID.
Use array smallest[1 . . . n]:
smallest[i] stores the smallest element in the set i belongs to.
Example
Sets: {9,3,2,4,7} {5} {6,1,8}
i:           1 2 3 4 5 6 7 8 9
smallest[i]: 1 2 2 2 5 1 2 1 2
MakeSet(i)
  smallest[i] ← i
Find(i)
  return smallest[i]
Running time: O(1)
Union(i, j)
  i_id ← Find(i)
  j_id ← Find(j)
  if i_id = j_id:
    return
  m ← min(i_id, j_id)
  for k from 1 to n:
    if smallest[k] in {i_id, j_id}:
      smallest[k] ← m
Running time: O(n)
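The naive scheme can be tried out with a short Python sketch. The class name NaiveDisjointSet is hypothetical, and index 0 of the array is left unused so that elements run from 1 to n as in the pseudocode.

```python
class NaiveDisjointSet:
    def __init__(self, n):
        self.n = n
        # MakeSet(i) for every i in 1..n; index 0 is unused.
        self.smallest = list(range(n + 1))

    def find(self, i):
        # O(1): the ID is stored directly.
        return self.smallest[i]

    def union(self, i, j):
        i_id, j_id = self.find(i), self.find(j)
        if i_id == j_id:
            return
        m = min(i_id, j_id)
        # O(n): every element of both sets must be relabelled.
        for k in range(1, self.n + 1):
            if self.smallest[k] in (i_id, j_id):
                self.smallest[k] = m

ds = NaiveDisjointSet(9)
for a, b in [(9, 3), (3, 2), (2, 4), (4, 7), (6, 1), (1, 8)]:
    ds.union(a, b)
print(ds.smallest[1:])  # [1, 2, 2, 2, 5, 1, 2, 1, 2]
```

The unions here build the sets {9,3,2,4,7}, {5} and {6,1,8}; the full scan inside union is what makes it the bottleneck.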
Current bottleneck: Union
What basic data structure allows for efficient merging?
Linked list!
Idea: represent a set as a linked list, use the list tail as the ID of the set
Efficient implementations:
Represent each set as a rooted tree
ID of a set is the root of the tree
Use array parent[1 . . . n]: parent[i] is the parent of i, or i if it is the root
MakeSet(i)
  parent[i] ← i
Running time: O(1)
Find(i) |
while i ̸= parent[i]: i ← parent[i] return i |
Running time: O(tree height)
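As a small illustration of the parent-array representation, the sketch below runs Find on a hand-built forest. The particular forest is an assumption chosen for the example (one possible set of trees for {9,3,2,4,7}, {5} and {6,1,8}), and index 0 is unused.

```python
def make_set(parent, i):
    # A new element is the root of its own one-node tree.
    parent[i] = i

def find(parent, i):
    # Walk up until reaching a node that is its own parent (the root).
    while i != parent[i]:
        i = parent[i]
    return i

# Hypothetical forest: roots are 4, 5 and 6.
#   4 <- 3, 4 <- 9, 9 <- 2, 9 <- 7;   5;   6 <- 1, 1 <- 8
parent = [None, 6, 9, 4, 4, 5, 6, 9, 1, 4]

print(find(parent, 8))                     # 6 (path 8 -> 1 -> 6)
print(find(parent, 2))                     # 4 (path 2 -> 9 -> 4)
print(find(parent, 5))                     # 5 (already a root)
print(find(parent, 2) == find(parent, 7))  # True: same tree
```

Each call takes as many steps as the depth of the starting node, which is why keeping trees shallow matters.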
How to merge two trees?
Hang one of the trees under the root of the other one
Which one to hang?
A shorter one, since we would like to keep the trees shallow
1. When merging two trees, we hang the shorter one under the root of the taller one
2. To quickly find the height of a tree, we keep the height of each subtree in an array rank[1 . . . n]: rank[i] is the height of the subtree whose root is i
3. (The reason we call it rank, but not height, will become clear later)
4. Hanging a shorter tree under a taller one is called the union by rank heuristic
MakeSet(i)
  parent[i] ← i
  rank[i] ← 0
Find(i)
  while i ≠ parent[i]:
    i ← parent[i]
  return i
Union(i, j)
  i_id ← Find(i)
  j_id ← Find(j)
  if i_id = j_id:
    return
  if rank[i_id] > rank[j_id]:
    parent[j_id] ← i_id
  else:
    parent[i_id] ← j_id
    if rank[i_id] = rank[j_id]:
      rank[j_id] ← rank[j_id] + 1
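A minimal Python sketch of union by rank follows. The class name DisjointSet is hypothetical and index 0 is unused. The sketch compares the ranks of the two roots before hanging one tree under the other, and increments the rank only when the two ranks are equal.

```python
class DisjointSet:
    def __init__(self, n):
        # MakeSet(i) for every i in 1..n; index 0 is unused.
        self.parent = list(range(n + 1))
        self.rank = [0] * (n + 1)

    def find(self, i):
        while i != self.parent[i]:
            i = self.parent[i]
        return i

    def union(self, i, j):
        i_id, j_id = self.find(i), self.find(j)
        if i_id == j_id:
            return
        if self.rank[i_id] > self.rank[j_id]:
            # j's tree is shorter: hang it under i's root.
            self.parent[j_id] = i_id
        else:
            self.parent[i_id] = j_id
            # Equal heights: the merged tree grows by one level.
            if self.rank[i_id] == self.rank[j_id]:
                self.rank[j_id] += 1

ds = DisjointSet(6)
ds.union(2, 4)  # equal ranks: 2 hangs under 4, rank[4] becomes 1
ds.union(5, 2)  # 5 (rank 0) hangs under 4 (rank 1), no rank change
ds.union(3, 1)  # equal ranks: 3 hangs under 1, rank[1] becomes 1
ds.union(2, 3)  # roots 4 and 1 both have rank 1: 4 hangs under 1, rank[1] becomes 2
print(ds.parent[1:])  # [1, 4, 1, 1, 4, 6]
print(ds.rank[1])     # 2
```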
Lemma
The height of any tree in the forest is at most log₂ n.
Follows from the following lemma.
Lemma
Any tree of height k in the forest has at least 2^k nodes.
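The reasoning behind both lemmas can be written out as a short induction on k. This is a proof sketch, and it assumes union by rank without path compression, so that the rank of a root equals the height of its tree.

```latex
% Claim: a tree of height k built by union by rank has at least 2^k nodes.
\begin{align*}
&\textbf{Base } (k = 0):\ \text{a singleton tree has } 1 = 2^0 \text{ node.}\\
&\textbf{Step } (k \ge 1):\ \text{a tree of height } k \text{ first appears when two trees of height } k-1 \text{ are merged,}\\
&\qquad \text{so it has at least } 2^{k-1} + 2^{k-1} = 2^k \text{ nodes; later unions only add nodes.}\\
&\textbf{Corollary}:\ \text{a tree with at most } n \text{ nodes and height } k \text{ satisfies } 2^k \le n,\ \text{hence } k \le \log_2 n.
\end{align*}
```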
Summary
The union by rank heuristic guarantees that Union and Find work in time O(log n).
Path compression:
Height ≤ Rank
1. When using path compression, rank[i] is no longer equal to the height of the subtree rooted at i.
2. Still, the height of the subtree rooted at i is at most rank[i].
3. And it is still true that a root node of rank k has at least 2^k nodes in its subtree: a root node is not affected by path compression.
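The effect of path compression on height and rank can be observed with the sketch below. It is one possible way to write Find with path compression (an iterative two-pass variant, chosen here to avoid recursion). The names DisjointSet and tree_height are hypothetical, and tree_height is a brute-force helper used only for illustration.

```python
class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n + 1))  # index 0 is unused
        self.rank = [0] * (n + 1)

    def find(self, i):
        # First pass: locate the root.
        root = i
        while root != self.parent[root]:
            root = self.parent[root]
        # Second pass: reattach every node on the path directly to the root.
        while i != root:
            self.parent[i], i = root, self.parent[i]
        return root

    def union(self, i, j):
        i_id, j_id = self.find(i), self.find(j)
        if i_id == j_id:
            return
        if self.rank[i_id] > self.rank[j_id]:
            self.parent[j_id] = i_id
        else:
            self.parent[i_id] = j_id
            if self.rank[i_id] == self.rank[j_id]:
                self.rank[j_id] += 1

def tree_height(ds, root):
    # Longest parent-pointer path from any node of the tree up to `root`.
    best = 0
    for v in range(1, len(ds.parent)):
        depth = 0
        while v != ds.parent[v]:
            v = ds.parent[v]
            depth += 1
        if v == root:
            best = max(best, depth)
    return best

ds = DisjointSet(8)
for a, b in [(1, 2), (3, 4), (5, 6), (7, 8), (2, 4), (6, 8), (4, 8)]:
    ds.union(a, b)
before = tree_height(ds, 8)
ds.find(1)                      # compresses the path 1 -> 2 -> 4 -> 8
after = tree_height(ds, 8)
print(before, after, ds.rank[8])  # 3 2 3
```

After the call to find(1), the tree rooted at 8 has height 2 while rank[8] is still 3, so the rank is only an upper bound on the height.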
Summary
Represent each set as a rooted tree
Use the root of the tree as the ID of the set
Union by rank heuristic: hang a shorter tree under the root of a taller one
Path compression heuristic: when finding the root of a tree for a particular node, reattach each node from the traversed path to the root
Amortized running time: O(log* n) (constant for practical values of n)
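To see why O(log* n) is effectively constant, recall that the iterated logarithm log* n is the number of times log₂ must be applied to n before the result drops to at most 1. The function name log_star below is hypothetical; the sketch only evaluates this definition for a few values.

```python
import math

def log_star(n):
    # Count how many times log2 must be applied until the value is <= 1.
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count

print(log_star(2))           # 1
print(log_star(16))          # 3
print(log_star(65536))       # 4
print(log_star(2 ** 65536))  # 5
```

Since 2^65536 is far larger than any input size that arises in practice, log* n is at most 5 for all practical values of n.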