Chapter 7
Table of Contents
- Chapter 7
- 1. Priority Queue
- 2. Heaps and Binary Heaps
- 3. Priority Queues & Heaps: Problems & Solutions
- Pro 1: Find the Maximum in Min-Heap
- Pro 2: Delete i-th Indexed Element in Min-Heap
- Pro 3: Find the k-th Smallest Element in Min-Heap
- Pro 4.1: Implement Stack using Heap
- Pro 4.2: Implement Queue using Heap
- Pro 5: Top k Numbers
- Pro 6: Merge k Sorted Lists with Total of n Element
- Pro 7: Find largest N pairs
- Pro 8: Min-Max Heap
- Pro 9: Find Median
- Pro 10: Sliding Window
- Pro 11: the Maximum Depth
- Pro 12: d-ary heap Expression
- 4. Mergeable Heap
1. Priority Queue
A priority queue is an abstract data type similar to a regular queue or stack, except that each element additionally has a “priority” associated with it. An element with high priority is served before an element with low priority.
Applications
- Data compression: Huffman Coding algorithm
- Shortest path algorithms: Dijkstra’s algorithm
- Minimum spanning tree algorithm: Prim’s algorithm
- Event-driven simulation: customers in a line
- Selection problem: finding kth smallest element.
Implementations
Implementation | Insertion | Deletion (DeleteMax) | Search (FindMax) |
---|---|---|---|
Unordered Array / List | O(1) | O(n) | O(n) |
Ordered Array / List | O(n) | O(1) | O(1) |
Binary Search Tree | O(logn) (average) | O(logn) (average) | O(logn) (Average) |
Balanced Binary Search Tree | O(logn) | O(logn) | O(logn) |
Binary Heap | O(logn) | O(logn) | O(1) |
2. Heaps and Binary Heaps
A heap is a complete binary tree with the basic requirement that the value of a node must be $\geq$ (or $\leq$) the values of its children.
Representing a heap: one possibility is to use an array. Since a heap is a complete binary tree, no array slots are wasted.
Declaration
struct Heap{
    int *array;
    int count; //number of elements in heap
    int capacity; //size of the allocated array
    int heap_type; //Min heap or Max heap
};
Creation
#include <stdlib.h> //malloc, free
struct Heap *CreateHeap(int capacity, int heap_type){
    struct Heap *h=(struct Heap*)malloc(sizeof(struct Heap));
    if(h==NULL) return NULL; //memory error
    h->heap_type=heap_type;
    h->count=0;
    h->capacity=capacity;
    h->array=(int*)malloc(sizeof(int) * h->capacity);
    if(h->array==NULL){ //memory error
        free(h);
        return NULL;
    }
    return h;
}
Parent of a Node
For a node at the ith location, its parent is at location $\lfloor \frac{i-1}{2} \rfloor$, assuming indices start at 0.
int Parent(struct Heap *h, int i){
    if(i<=0 || i>=h->count) return -1;
    return (i-1)/2;
}
Children of a Node
For a node at the ith location, its children are at locations $(2\times i+1)$ and $(2\times i+2)$.
int LeftChild(struct Heap *h, int i){
    return (2*i+1)>=h->count ? -1 : (2*i+1);
}
int RightChild(struct Heap *h, int i){
    return (2*i+2)>=h->count ? -1 : (2*i+2);
}
Get the Maximum Element
int GetMax(struct Heap *h){
    return h->count==0 ? -1 : h->array[0];
}
Heapifying an Element
After an element is inserted or replaced, the heap property may no longer hold. In that case we adjust positions until the array is a heap again. To heapify an element in a max-heap, find the larger of its children, swap it with the current element if that child is larger, and continue until the offending element has dropped (percolated down) to its correct position and the heap property holds at every node.
/*heapify the element at location 'i' (max-heap)*/
void PercolateDown(struct Heap *h, int i){
    int l, r, max, temp;
    l=LeftChild(h, i);
    r=RightChild(h, i);
    if(l!=-1 && h->array[l]>h->array[i])
        max=l;
    else
        max=i;
    if(r!=-1 && h->array[r]>h->array[max]) //compare with the larger value found so far
        max=r;
    if(max!=i){ //swap, then continue from the child's position
        temp=h->array[i];
        h->array[i]=h->array[max];
        h->array[max]=temp;
        PercolateDown(h, max);
    }
}
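- Time Complexity: O(logn). Space Complexity: O(1) for an iterative version; the recursive version uses O(logn) stack space unless the compiler eliminates the tail call.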
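The percolate-down step can also be written as a loop, which avoids recursion altogether. The following is a minimal sketch under that assumption; the name `PercolateDownIter` is hypothetical, and the struct is repeated only so that the fragment compiles on its own.

```c
/* Same layout as the struct Heap declared earlier; repeated so the sketch compiles alone. */
struct Heap {
    int *array;
    int count;     /* number of elements in the heap */
    int capacity;  /* allocated size */
    int heap_type; /* unused in this sketch */
};

/* Iterative sift-down for a max-heap: O(log n) time, O(1) extra space.
   PercolateDownIter is a hypothetical name, not part of the text above. */
void PercolateDownIter(struct Heap *h, int i) {
    for (;;) {
        int l = 2 * i + 1, r = 2 * i + 2, max = i;
        if (l < h->count && h->array[l] > h->array[max]) max = l;
        if (r < h->count && h->array[r] > h->array[max]) max = r;
        if (max == i) break;          /* heap property holds at this node */
        int temp = h->array[i];       /* swap with the larger child */
        h->array[i] = h->array[max];
        h->array[max] = temp;
        i = max;                      /* continue from the child's position */
    }
}
```

Each pass of the loop moves one level down the tree, so the number of iterations is bounded by the height of the heap.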
Delete an Element
Deleting the root (the maximum element) is the only deletion supported by a standard heap. Replace the root with the last element of the heap (tree), shrink the heap by one, and heapify from the root.
int DeleteMax(struct Heap *h){
int maxData;
if(h->count==0) return -1;
maxData=h->array[0];
h->array[0]=h->array[h->count-1];
h->count--; //reduce the heap size
PercolateDown(h, 0);
return maxData;
}
- Time Complexity: O(logn).
Insertion
void ResizeHeap(struct Heap *h){
    int *array_old=h->array;
    h->array=(int*)malloc(sizeof(int) * h->capacity * 2);
    if(h->array==NULL){ //memory error: keep the old array
        h->array=array_old;
        return;
    }
    for(int i=0; i<h->capacity; i++)
        h->array[i]=array_old[i];
    h->capacity*=2;
    free(array_old);
}
int Insert(struct Heap *h, int data){
    if(h->count==h->capacity){
        ResizeHeap(h);
        if(h->array==NULL || h->count==h->capacity) return -1; //resize failed
    }
    h->count++;
    int i=h->count-1;
    while(i>0 && data>h->array[(i-1)/2]){ //stop at the root (i==0)
        h->array[i]=h->array[(i-1)/2];
        i=(i-1)/2; //bottom-up approach -- percolate up to find a proper position
    }
    h->array[i]=data;
    return 0;
}
- Time Complexity: O(logn)
Destroy Heap
void DestroyHeap(struct Heap *h){
if(h==NULL) return;
free(h->array);
free(h);
return;
}
Heapify the Array
- One simple approach to building a heap is to take the n input items and insert them one at a time into an empty heap. This takes n successive inserts and O(nlogn) time in the worst case.
- Leaf nodes always satisfy the heap property, so it is enough to heapify the non-leaf nodes. Applying PercolateDown() to the non-leaf nodes in reverse level order builds the heap in O(n) time; the linear bound follows from summing the heights of all nodes.
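void BuildHeap(struct Heap *h, int A[], int n){
    if(h==NULL) return;
    while(n>h->capacity){
        int old_capacity=h->capacity;
        ResizeHeap(h);
        if(h->array==NULL || h->capacity==old_capacity) return; //resize failed
    }
    for(int i=0; i<n; i++)
        h->array[i]=A[i];
    h->count=n;
    for(int i=n/2-1; i>=0; i--) //start at the last non-leaf node
        PercolateDown(h, i);
}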
- The following shows that for a complete binary tree of height $h$ the sum of the heights of all nodes is $O(n-h)$.
$$S = \sum_{i=0}^{h} 2^i(h-i)$$
$$S = h + 2(h-1) + 4(h-2) + \dots + 2^{h-1}(1)$$
Multiplying both sides by 2 gives: $$2S = 2h + 4(h-1) + 8(h-2) + \dots + 2^{h}(1)$$
Subtracting the first equation from the second,
$$S = -h + 2 + 4 + \dots + 2^{h} = (2^{h+1}-1)-(h+1)$$
The total number of nodes $n$ in a complete binary tree of height $h$ whose last level is full is $n=2^{h+1}-1$, which gives $h=\log_2(n+1)-1$.
Finally, replacing $2^{h+1}-1$ with $n$ gives: $$S = n-(h+1) = O(n-\log n) = O(n-h) = O(n)$$
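The sum of heights can also be checked numerically. The sketch below compares the level-by-level sum with the closed form $n-(h+1)$, where $n=2^{h+1}-1$, for a tree whose levels are all full. The helper names `SumOfHeights` and `ClosedForm` are hypothetical and introduced only for this check.

```c
/* Sum of node heights in a binary tree of height h with every level full:
   level i holds 2^i nodes, each of height h - i. */
long SumOfHeights(int h) {
    long sum = 0;
    for (int i = 0; i <= h; i++)
        sum += (1L << i) * (h - i);
    return sum;
}

/* Closed form n - (h + 1), with n = 2^(h+1) - 1 nodes. */
long ClosedForm(int h) {
    long n = (1L << (h + 1)) - 1;
    return n - (h + 1);
}
```

For example, with $h=3$ there are 15 nodes and the heights sum to $3+2\cdot 2+4\cdot 1=11=15-4$.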
Heapsort
The heap sort algorithm inserts all elements from an unsorted array into a heap, then removes them from the root of a heap until the heap is empty.
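void Heapsort(int A[], int n){
    struct Heap *h=CreateHeap(n, 1); //second argument is the heap type; 1 for max-heap is an assumed encoding
    int i, temp;
    if(h==NULL) return; //memory error
    BuildHeap(h, A, n);
    for(i=n-1; i>0; i--){
        temp=h->array[0]; //swap the root (maximum) with the last element
        h->array[0]=h->array[h->count-1];
        h->array[h->count-1]=temp;
        h->count--;
        PercolateDown(h, 0); //heapify
    }
    for(i=0; i<n; i++) //copy the sorted values back into A
        A[i]=h->array[i];
    DestroyHeap(h);
}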
- Time Complexity: O(nlogn).
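Heapsort does not need a separate heap object: the input array itself can serve as the heap. The following is a minimal in-place sketch of that variant, not the routine above; the names `SiftDown` and `HeapsortInPlace` are hypothetical.

```c
/* Sift the element at index i down within a[0..n-1], treated as a max-heap. */
static void SiftDown(int a[], int n, int i) {
    for (;;) {
        int l = 2 * i + 1, r = 2 * i + 2, max = i;
        if (l < n && a[l] > a[max]) max = l;
        if (r < n && a[r] > a[max]) max = r;
        if (max == i) return;
        int temp = a[i]; a[i] = a[max]; a[max] = temp;
        i = max;
    }
}

/* Sort a[0..n-1] in ascending order, in place. */
void HeapsortInPlace(int a[], int n) {
    for (int i = n / 2 - 1; i >= 0; i--)       /* build the max-heap in O(n) */
        SiftDown(a, n, i);
    for (int end = n - 1; end > 0; end--) {
        int temp = a[0]; a[0] = a[end]; a[end] = temp;  /* move the max to its final slot */
        SiftDown(a, end, 0);                   /* restore the heap on the shorter prefix */
    }
}
```

The sorted suffix grows from the right while the heap shrinks from the left, so no extra array is required.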
3. Priority Queues & Heaps: Problems & Solutions
Pro 1: Find the Maximum in Min-Heap
Solution :
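int FindMaxinMinHeap(struct Heap *h){
    int Max=-1; //returned as-is when the heap is empty
    for(int i=h->count/2; i<h->count; i++) //the leaves occupy indices count/2 .. count-1
        if(i==h->count/2 || h->array[i]>Max)
            Max=h->array[i];
    return Max;
}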
- Time Complexity: O(n).
Pro 2: Delete i-th Indexed Element in Min-Heap
Solution :
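int Delete(struct Heap *h, int i){
    if(i<0 || i>=h->count) return -1; //wrong position
    int key=h->array[i];
    h->array[i]=h->array[h->count-1];
    h->count--;
    if(i==h->count) return key; //the last element was deleted
    while(i>0 && h->array[i]<h->array[(i-1)/2]){ //the moved element may be smaller than its new parent
        int temp=h->array[i];
        h->array[i]=h->array[(i-1)/2];
        h->array[(i-1)/2]=temp;
        i=(i-1)/2;
    }
    PercolateDown(h, i); //assumes a min-heap version of PercolateDown
    return key;
}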
- Time Complexity: O(logn).
Pro 3: Find the k-th Smallest Element in Min-Heap
Solution 1:
- Perform deletion k times from min-heap.
- Time Complexity: O(klogn).
Solution 2: reduce complexity
- Call the original min-heap Ori and the auxiliary min-heap Aux. Initially, the element at the top of Ori, the minimum one, is inserted into Aux. We never call DeleteMin() on Ori; instead we call DeleteMin() on Aux.
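/*pseudocode: each Aux entry is assumed to remember its position in Ori,
  so that LeftChild() and RightChild() can be looked up there*/
Heap Ori;
Heap Aux;
int FindKthSmallest(int k){
    int heapElement; //assume data type is integer
    int count=0; //number of elements deleted from Aux so far
    Aux.Insert(Ori.GetMin()); //return root in O(1)
    while(true){
        /*delete min from Aux*/
        heapElement=Aux.DeleteMin(); //O(logk)
        if(++count==k)
            return heapElement;
        else{
            /*insert the left & right children (in Ori) of the deleted element into Aux, if they exist*/
            Aux.Insert(heapElement.LeftChild()); //LeftChild()-->O(1), Insert()-->O(logk)
            Aux.Insert(heapElement.RightChild());
        }
    }
}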
- Every while-loop iteration yields the next smallest element from the auxiliary heap, and we need k iterations to get the result. The auxiliary heap never holds more than k elements (each iteration removes one element and inserts at most two, so its size grows by at most one), and Ori is never modified, so the running time is O(klogk).
- Note: the above algorithm is useful when k is small compared to n. If k is approximately equal to n, we can simply sort the array (with a linear-time sort such as counting sort, if the keys allow it) and read off the k-th element.
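The auxiliary-heap idea can be sketched in plain C by letting Aux store indices into Ori rather than values, so the children of a deleted element are found by index arithmetic. This is a minimal sketch under that assumption; the array-based signature of `FindKthSmallest` below is hypothetical and differs from the pseudocode above.

```c
#include <stdlib.h>

/* Return the k-th smallest (1-based) value of the min-heap ori[0..n-1]
   without modifying it, or -1 if k is out of range or allocation fails.
   The auxiliary heap aux stores indices into ori, ordered by ori[index]. */
int FindKthSmallest(const int ori[], int n, int k) {
    if (k < 1 || k > n) return -1;
    int *aux = malloc(sizeof(int) * (k + 1));
    if (aux == NULL) return -1;
    int size = 0, result = -1;
    aux[size++] = 0;                       /* start with the root of ori */
    for (int count = 1; ; count++) {
        int top = aux[0];                  /* index of the current minimum */
        if (count == k) { result = ori[top]; break; }
        /* delete-min on aux: move the last index to the root, sift down */
        aux[0] = aux[--size];
        for (int i = 0; ; ) {
            int l = 2 * i + 1, r = l + 1, min = i;
            if (l < size && ori[aux[l]] < ori[aux[min]]) min = l;
            if (r < size && ori[aux[r]] < ori[aux[min]]) min = r;
            if (min == i) break;
            int t = aux[i]; aux[i] = aux[min]; aux[min] = t;
            i = min;
        }
        /* insert the children (in ori) of the removed node, sifting up */
        for (int c = 2 * top + 1; c <= 2 * top + 2 && c < n; c++) {
            int i = size++;
            while (i > 0 && ori[c] < ori[aux[(i - 1) / 2]]) {
                aux[i] = aux[(i - 1) / 2];
                i = (i - 1) / 2;
            }
            aux[i] = c;
        }
    }
    free(aux);
    return result;
}
```

Storing indices keeps Ori untouched and bounds the auxiliary heap by k entries, which is where the O(klogk) running time comes from.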
Pro 4.1: Implement Stack using Heap
Solution :
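- To implement a stack using a priority queue PQ (using a min-heap), assume one extra integer variable c that serves as the priority when inserting elements into PQ. Because c decreases with every push, the most recently pushed element always has the smallest priority and is therefore the first to be deleted.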
/*initialize c to any known value*/
void push(int element){
PQ.Insert(c,element);
c--;
}
int pop(){
return PQ.DeleteMin(); //we could also increment c back when popping
}
int top(){
return PQ.GetMin();
}
int size(){
return PQ.Size();
}
int isempty(){
return PQ.IsEmpty();
}
/*we could use the negative of the current system time instead of c*/
void push(int element){
PQ.Insert(-gettime(), element);
}
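The counter-based scheme can be made concrete with a small array-backed min-heap of (priority, value) entries. This is a minimal sketch with a fixed capacity; the names `HeapStack`, `Entry`, `StackInit`, `StackPush`, `StackPop`, and `STACK_CAPACITY` are hypothetical and stand in for the PQ interface used above.

```c
#define STACK_CAPACITY 64   /* fixed capacity, chosen only for this sketch */

struct Entry { int priority; int value; };

struct HeapStack {
    struct Entry heap[STACK_CAPACITY]; /* min-heap ordered by priority */
    int size;
    int c;                             /* next priority; decreases on every push */
};

void StackInit(struct HeapStack *s) { s->size = 0; s->c = 0; }

/* Returns 0 on success, -1 if the stack is full. */
int StackPush(struct HeapStack *s, int value) {
    if (s->size == STACK_CAPACITY) return -1;
    int i = s->size++;
    struct Entry e = { s->c--, value };
    while (i > 0 && e.priority < s->heap[(i - 1) / 2].priority) {
        s->heap[i] = s->heap[(i - 1) / 2];   /* percolate up */
        i = (i - 1) / 2;
    }
    s->heap[i] = e;
    return 0;
}

/* Stores the most recently pushed value in *out; returns -1 if empty. */
int StackPop(struct HeapStack *s, int *out) {
    if (s->size == 0) return -1;
    *out = s->heap[0].value;
    struct Entry last = s->heap[--s->size];
    int i = 0;
    for (;;) {                               /* percolate down */
        int child = 2 * i + 1;
        if (child >= s->size) break;
        if (child + 1 < s->size &&
            s->heap[child + 1].priority < s->heap[child].priority)
            child++;
        if (last.priority <= s->heap[child].priority) break;
        s->heap[i] = s->heap[child];
        i = child;
    }
    s->heap[i] = last;
    return 0;
}
```

Push and pop each cost O(logn) here, compared with O(1) for an ordinary array stack, so this construction is mainly of theoretical interest.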
Pro 4.2: Implement Queue using Heap
Solution :
void push(int element){
PQ.Insert(c,element); //PQ.Insert(gettime(), element);
c++;
}
int pop(){
return PQ.DeleteMin();
}
Pro 5: Top k Numbers
Given a big file containing billions of numbers, how can you find the 10 largest numbers in that file?
Solution : Priority Queues
- Always remember that when you need the top k elements of a large input, a priority queue is usually the right data structure.
- One solution is to divide the data into sets of, say, 1000 elements, build a heap from each set, and take the top 10 elements from each heap. Finally, heap sort all the sets of 10 elements and take the top 10 among those. The problem with this approach is where to store the 10 elements from each heap, which may need a large amount of memory. The answer is to reuse the top 10 elements from the earlier heap when processing the subsequent sets.
- Supplement:
- Q: To find the k largest numbers, should we build a max-heap or a min-heap?
A: A min-heap. First read k numbers (here 10) and build a min-heap from them. Then traverse the remaining numbers and compare each with the root of the heap. If the incoming number is greater than the root, replace the root with it and heapify; otherwise discard it. When the input is exhausted, the numbers remaining in the heap are the top k. Because only the top node of a heap is directly accessible, a max-heap would not tell us whether an incoming number should be kept or discarded.
- Note that building a heap of m elements takes O(m) (bottom-up approach) and each modification takes O(logm). With n numbers in total, the overall time complexity is O(m+nlogm)=O(nlogm), which is O(n) when m is a small constant such as 10.
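The min-heap approach can be sketched as follows, with an in-memory array standing in for the file. This is a minimal illustration; the names `TopK` and `MinSiftDown` are hypothetical, and reading the numbers from disk is left out.

```c
/* Restore the min-heap property below index i in heap[0..size-1]. */
static void MinSiftDown(int heap[], int size, int i) {
    for (;;) {
        int l = 2 * i + 1, r = 2 * i + 2, min = i;
        if (l < size && heap[l] < heap[min]) min = l;
        if (r < size && heap[r] < heap[min]) min = r;
        if (min == i) return;
        int temp = heap[i]; heap[i] = heap[min]; heap[min] = temp;
        i = min;
    }
}

/* Keep the k largest of data[0..n-1] in heap[0..k-1] (a min-heap on return).
   Returns the number of values stored, min(n, k), or 0 if k <= 0. */
int TopK(const int data[], int n, int k, int heap[]) {
    if (k <= 0 || n <= 0) return 0;
    int size = n < k ? n : k;
    for (int i = 0; i < size; i++) heap[i] = data[i];        /* first k values */
    for (int i = size / 2 - 1; i >= 0; i--) MinSiftDown(heap, size, i);
    for (int i = size; i < n; i++) {
        if (data[i] > heap[0]) {       /* beats the smallest value kept so far */
            heap[0] = data[i];
            MinSiftDown(heap, size, 0);
        }
    }
    return size;
}
```

Only k values are held in memory at any time, and the root of the heap is always the smallest of the current top k, which is exactly the value a new number has to beat.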
Pro 6: Merge k Sorted Lists with Total of n Element
We are given k sorted lists with a total of n elements across all the lists. Give an algorithm to merge them into one single sorted list.
Solution 1: one by one
- Take the first list and merge it with the second by scanning the two lists once (two-pointer approach). Then merge the output with the third list. Continue until all the lists are merged into one.
- Total time complexity, assuming each list holds about $n/k$ elements: $\frac{2n}{k}+\frac{3n}{k}+\frac{4n}{k}+\dots+\frac{kn}{k}=\sum_{i=2}^{k}\frac{in}{k}\approx\frac{n}{k}\cdot\frac{k^2}{2}=O(nk)$. Space Complexity: O(1).
Solution 2: divide into pairs
- Divide the lists into pairs and merge each pair, so that the total number of elements scanned in one round is O(n). One round outputs k/2 lists. Repeat until only one list remains.
- Time Complexity: O(nlogk), Space Complexity: O(n).
Solution 3: heap (similar to k pointers)
- Build a min-heap from the first elements of all the lists. In each step, extract the minimum element of the heap and append it to the end of the output, then insert the next element from the list that the extracted element came from. Repeat until all the elements of all the lists are consumed.
- Time Complexity: O(nlogk), Space Complexity: O(k)
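The heap-based merge can be sketched with arrays in place of linked lists. In this minimal sketch the heap stores list numbers, ordered by each list's current front value; the names `MergeKSorted` and `Front` and the array-based interface are assumptions made for illustration.

```c
#include <stdlib.h>

/* Current front value of list j. */
static int Front(const int *lists[], const int pos[], int j) {
    return lists[j][pos[j]];
}

/* Merge k ascending arrays lists[0..k-1] (lengths in lens[]) into out[].
   Returns the number of values written, or -1 if allocation fails. */
int MergeKSorted(const int *lists[], const int lens[], int k, int out[]) {
    int *heap = malloc(sizeof(int) * (k > 0 ? k : 1));
    int *pos = calloc(k > 0 ? k : 1, sizeof(int));  /* next unread index per list */
    if (heap == NULL || pos == NULL) { free(heap); free(pos); return -1; }
    int size = 0, written = 0;
    for (int j = 0; j < k; j++) {              /* insert each non-empty list */
        if (lens[j] == 0) continue;
        int i = size++;
        while (i > 0 && Front(lists, pos, j) < Front(lists, pos, heap[(i - 1) / 2])) {
            heap[i] = heap[(i - 1) / 2];
            i = (i - 1) / 2;
        }
        heap[i] = j;
    }
    while (size > 0) {
        int j = heap[0];
        out[written++] = lists[j][pos[j]++];   /* append the minimum to the output */
        if (pos[j] == lens[j]) heap[0] = heap[--size];  /* list j is exhausted */
        for (int i = 0; ; ) {                  /* sift the root down */
            int l = 2 * i + 1, r = l + 1, min = i;
            if (l < size && Front(lists, pos, heap[l]) < Front(lists, pos, heap[min])) min = l;
            if (r < size && Front(lists, pos, heap[r]) < Front(lists, pos, heap[min])) min = r;
            if (min == i) break;
            int t = heap[i]; heap[i] = heap[min]; heap[min] = t;
            i = min;
        }
    }
    free(heap);
    free(pos);
    return written;
}
```

Advancing the front of the root's list and sifting down replaces a separate delete-then-insert, so each output element costs a single O(logk) heap adjustment.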
Pro 7: Find largest N pairs
Given two arrays A and B, each with n elements, give an algorithm for finding the largest n pairs (A[i], B[j]).
Solution :
- Algorithm:
1. Heapify A and B. This takes O(n).
2. Keep deleting elements from both heaps. Each step takes O(logn), so the total is O(nlogn).
Pro 8: Min-Max Heap
Design a data structure which supports the following operation:
Operation | Init() | Insert() | GetMin() | GetMax() | DeleteMin() | DeleteMax() |
---|---|---|---|---|---|---|
Complexity | O(n) | O(logn) | O(1) | O(1) | O(logn) | O(logn) |
Solution : two heaps
- Use two heaps, Hmin and Hmax, over the same elements, with mutual pointers so that each element in one heap points to its copy in the other.
Pro 9: Find Median
Design a heap data structure that supports finding the median.
Solution : two heaps
- A median heap is a heap variant that gives access to the median element. It can be implemented using two heaps, each containing half the elements: a max-heap holding the smaller half and a min-heap holding the larger half. If the total number of elements is even, the two heaps have equal size; if it is odd, the max-heap contains one more element than the min-heap.
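The two-heap scheme can be sketched with two small fixed-capacity array heaps. This is a minimal illustration; `MedianHeap`, `IntHeap`, `MedianInit`, `MedianInsert`, `MedianGet`, and `MEDIAN_CAPACITY` are hypothetical names, and the capacity limit is an assumption of the sketch.

```c
#define MEDIAN_CAPACITY 128  /* per-heap capacity, chosen only for this sketch */

struct IntHeap { int data[MEDIAN_CAPACITY]; int size; int is_max; };

/* Should a sit above b in heap h? */
static int Before(const struct IntHeap *h, int a, int b) {
    return h->is_max ? a > b : a < b;
}

static void Push(struct IntHeap *h, int x) {
    int i = h->size++;
    while (i > 0 && Before(h, x, h->data[(i - 1) / 2])) {
        h->data[i] = h->data[(i - 1) / 2];
        i = (i - 1) / 2;
    }
    h->data[i] = x;
}

static int Pop(struct IntHeap *h) {
    int top = h->data[0], last = h->data[--h->size], i = 0;
    for (;;) {
        int c = 2 * i + 1;
        if (c >= h->size) break;
        if (c + 1 < h->size && Before(h, h->data[c + 1], h->data[c])) c++;
        if (!Before(h, h->data[c], last)) break;
        h->data[i] = h->data[c];
        i = c;
    }
    h->data[i] = last;
    return top;
}

struct MedianHeap { struct IntHeap low, high; };  /* low: max-heap, high: min-heap */

void MedianInit(struct MedianHeap *m) {
    m->low.size = 0;  m->low.is_max = 1;
    m->high.size = 0; m->high.is_max = 0;
}

/* The caller must not insert more than MEDIAN_CAPACITY values in this sketch. */
void MedianInsert(struct MedianHeap *m, int x) {
    if (m->low.size == 0 || x <= m->low.data[0]) Push(&m->low, x);
    else Push(&m->high, x);
    if (m->low.size > m->high.size + 1) Push(&m->high, Pop(&m->low));
    else if (m->high.size > m->low.size) Push(&m->low, Pop(&m->high));
}

/* Median of the values inserted so far; requires at least one value. */
double MedianGet(const struct MedianHeap *m) {
    if (m->low.size > m->high.size) return m->low.data[0];
    return ((double)m->low.data[0] + m->high.data[0]) / 2.0;
}
```

Insertion costs O(logn) because at most one element moves between the heaps, and the median is read from the two roots in O(1).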
Pro 10: Sliding Window
Given an array A[] and a sliding window of size w that moves from the very left of the array to the very right, assume that we can only see the w numbers in the window. Each time, the window moves rightwards by one position. Find an efficient way to compute B[i], the maximum value among A[i] to A[i+w-1].
Solution 1: brute force solution
- Every time the window moves, we search all w elements in the window. Time complexity: O(nw).
Solution 2: heap
- Keep the window's elements in a max-heap. As the window slides to the right, some elements in the heap may no longer be valid. To remove only the elements that are out of the window's range, we need to keep track of the elements' indices too. Time Complexity: O(nlogw).
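One simple way to realize the heap solution is lazy deletion: store indices in a max-heap and discard a stale index only when it reaches the root. The sketch below takes that approach; note that with lazy deletion the heap can grow to n entries, so its worst case is O(nlogn) rather than O(nlogw), which would need deletion of arbitrary entries. The name `MaxSlidingWindowHeap` is hypothetical.

```c
#include <stdlib.h>

/* B[i] = max(A[i..i+w-1]) for 0 <= i <= n-w, using a max-heap of indices
   with lazy deletion: stale indices are dropped only when they reach the root.
   Returns 0 on success, -1 on bad arguments or allocation failure. */
int MaxSlidingWindowHeap(const int A[], int n, int w, int B[]) {
    if (w < 1 || w > n) return -1;
    int *heap = malloc(sizeof(int) * n);   /* each index is inserted once */
    if (heap == NULL) return -1;
    int size = 0;
    for (int j = 0; j < n; j++) {
        int i = size++;                    /* insert index j, sifting up */
        while (i > 0 && A[j] > A[heap[(i - 1) / 2]]) {
            heap[i] = heap[(i - 1) / 2];
            i = (i - 1) / 2;
        }
        heap[i] = j;
        while (heap[0] <= j - w) {         /* the root fell out of the window */
            heap[0] = heap[--size];
            for (int p = 0; ; ) {          /* sift the new root down */
                int l = 2 * p + 1, r = l + 1, max = p;
                if (l < size && A[heap[l]] > A[heap[max]]) max = l;
                if (r < size && A[heap[r]] > A[heap[max]]) max = r;
                if (max == p) break;
                int t = heap[p]; heap[p] = heap[max]; heap[max] = t;
                p = max;
            }
        }
        if (j >= w - 1) B[j - w + 1] = A[heap[0]];
    }
    free(heap);
    return 0;
}
```

The index j that was just inserted is always inside the window, so the heap never becomes empty while stale roots are being removed.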
Solution 3: double-ended queue
- The trick is to maintain the queue so that the largest element in the window always appears at the front. Break away from the idea that the queue must be the same size as the window: the queue stores indices, and an index is dropped from the back as soon as a later element is at least as large.
void MaxSlidingWindow(int A[], int n, int w, int B[]){
    struct DoubleEndQueue *q=CreateDoubleEndQueue(); //the queue stores indices into A
    for(int i=0; i<w; i++){
        while(!IsEmptyQueue(q) && A[i]>=A[QBack(q)])
            PopBack(q);
        PushBack(q, i);
    }
    for(int i=w; i<n; i++){
        B[i-w]=A[QFront(q)];
        while(!IsEmptyQueue(q) && A[i]>=A[QBack(q)])
            PopBack(q);
        while(!IsEmptyQueue(q) && QFront(q)<=i-w)
            PopFront(q);
        PushBack(q, i);
    }
    B[n-w]=A[QFront(q)];
}
Pro 11: the Maximum Depth
A complete binary min-heap is made by including each integer in $[1,1023]$ exactly once. The depth of a node in the heap is the length of the path from the root of the heap to that node. Thus, the root is at depth 0. What is the maximum depth at which integer 9 can appear?
Solution :
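- Counting levels from 1, fix element i at the ith level and place the numbers 1 to i-1 on the levels above it, one per level along the path from the root. Integer 9 can therefore sit at level 9 at the deepest, which is depth 8. Also remember that a heap forms a complete binary tree; with 1023 elements the tree has 10 levels, so level 9 exists.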
Pro 12: d-ary heap Expression
How do you represent a d-ary heap with n elements in an array? What are the expressions for determining the parent of a given element, Parent(i), and the jth child of a given element, Child(i, j), where $1\leq j\leq d$?
Solution :
- The following expressions determine the parent and jth child of element i, with elements numbered from 1:
$$Parent(i)=\left\lfloor \frac{i+d-2}{d} \right\rfloor \qquad Child(i,j)=(i-1)\times d+j+1$$
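The two expressions can be checked against each other: the parent of the jth child of i should be i again. The sketch below encodes them directly, using 1-based element numbers as in the formulas; the function names `DaryParent` and `DaryChild` are hypothetical.

```c
/* 1-based numbering: the root is element 1. Integer division gives the floor
   because all operands are positive. */
int DaryParent(int i, int d) { return (i + d - 2) / d; }

/* j-th child of element i, for 1 <= j <= d. */
int DaryChild(int i, int j, int d) { return (i - 1) * d + j + 1; }
```

For d = 3, the children of element 1 are elements 2, 3, and 4, and element 5 is the first child of element 2. For the root, the parent expression evaluates to 0, which can serve as a "no parent" marker.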
4. Mergeable Heap
If we have two min-heaps, there is no efficient way to combine them into a single min-heap. For solving this problem efficiently, we can use mergeable heaps which support efficient union operation. It is a data structure that supports the following operations:
create_heap() //create an empty heap
insert(H, X, K) //insert an item X with key K into a heap H
find_min(H) //return item with min key
delete_min(H) //return and remove
union(H1, H2) //merge heaps H1 and H2
decrease_key(H, X, K) //assign item X with a smaller key K
delete(H, X) //remove item X
Example of mergeable heaps:
- Binomial Heaps
- Fibonacci Heaps