Priority Queues
heap and priority queue
A priority queue is a queue in which every object carries a priority. Each dequeue (pop) operation removes the object with the lowest (or, in the max variant, the highest) priority from the queue.
Operations:
- push(priority, object), or insertion
- pop()
- find_min() / find_max()
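As a quick illustration of the three operations, here is a minimal sketch using Python's standard heapq module, which treats a plain list as a min-heap. The task names and priorities are made up for the example.

```python
import heapq

pq = []  # heapq keeps a plain list arranged as a min-heap

# push(priority, object): entries are (priority, object) tuples
heapq.heappush(pq, (2, "write report"))
heapq.heappush(pq, (1, "fix bug"))
heapq.heappush(pq, (3, "lunch"))

smallest = pq[0]            # find_min(): peek at the root without removing it
first = heapq.heappop(pq)   # pop(): remove the entry with the lowest priority
second = heapq.heappop(pq)
```

For a max variant, one common trick is to push negated priorities.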
Naïve implementation: an array plus sorting. push is O(N) and pop is O(N); pushing N items in a row and sorting them once is O(N log N).
We will see a better structure that brings push and pop down to O(log N) each.
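One possible reading of the naïve array approach is sketched below: the array is kept sorted, so push has to shift elements and pop removes from the front. The class name SortedListPQ is hypothetical, and the sketch assumes the stored objects are comparable when priorities tie.

```python
import bisect


class SortedListPQ:
    """Naive priority queue: a Python list kept in ascending order."""

    def __init__(self):
        self._items = []

    def push(self, priority, obj):
        # Binary search finds the slot, but inserting shifts elements: O(N)
        bisect.insort(self._items, (priority, obj))

    def pop(self):
        # Removing the front element shifts everything left: O(N)
        return self._items.pop(0)

    def find_min(self):
        # The smallest entry is always at index 0: O(1)
        return self._items[0]


naive = SortedListPQ()
for p, name in [(5, "e"), (1, "a"), (3, "c")]:
    naive.push(p, name)
naive_order = [naive.pop() for _ in range(3)]
```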
Smarter implementation: Heap
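An almost complete binary tree data structure: every level is full except possibly the last, which is filled from left to right.
Max-heap: the data stored in every child is smaller than or equal to the data in its parent
Min-heap: every child is larger than or equal to its parent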
Properties
- For a max-heap, the largest number is at the root
- Depth is O(log N)
Operations
Insertion: put the element at the end of the heap, then bubble it up (swap it with its parent) until the heap property holds again. Worst case O(log N).
Pop: remove the top item, put the last element at the top, then fix the heap structure from the top down by comparing the parent with its children (for a max-heap, the largest of parent, left child, and right child moves up). The update takes O(log N) time.
Implementation
Can be implemented with an array
If x is the index of a node (counting from 0), then its left child is at index 2x + 1 and its right child is at index 2x + 2
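The insertion and pop procedures, together with the array layout above, can be sketched as follows. This is a minimal max-heap under the 0-based indexing of these notes; the class name MaxHeap is hypothetical, and the parent index (x - 1) // 2 is simply the inverse of the child formulas.

```python
class MaxHeap:
    """Array-backed max-heap: children of index x live at 2x + 1 and 2x + 2."""

    def __init__(self):
        self._a = []

    def push(self, value):
        a = self._a
        a.append(value)              # put the element at the end
        i = len(a) - 1
        while i > 0:                 # bubble up until the parent is not smaller
            parent = (i - 1) // 2
            if a[parent] >= a[i]:
                break
            a[parent], a[i] = a[i], a[parent]
            i = parent

    def pop(self):
        a = self._a
        if not a:
            raise IndexError("pop from empty heap")
        top = a[0]
        last = a.pop()               # take the last element off the array
        if a:
            a[0] = last              # move it to the top, then sift down
            i, n = 0, len(a)
            while True:
                left, right = 2 * i + 1, 2 * i + 2
                largest = i          # pick the largest of parent, left, right
                if left < n and a[left] > a[largest]:
                    largest = left
                if right < n and a[right] > a[largest]:
                    largest = right
                if largest == i:
                    break
                a[i], a[largest] = a[largest], a[i]
                i = largest
        return top


heap = MaxHeap()
for v in [3, 1, 4, 1, 5, 9, 2, 6]:
    heap.push(v)
drained = [heap.pop() for _ in range(8)]
```

Each loop walks one root-to-leaf path at most, which is where the O(log N) bound comes from.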
Streaming algorithms
Algorithms that deal with data of unknown length
Typically, you are only allowed a fixed amount of memory, i.e., you can only remember a very limited number of items and their arrival order
e.g. For a stream of numbers, design an algorithm that, whenever a number arrives, outputs the top 10 largest numbers seen so far. You only have, say, 1K bytes of memory.
Problem: the input is a stream of numbers, read one number at a time. After each number, output the k-th largest number seen so far
k = 3
stream: 1,2,3,4,5,1,2,3,6,…
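One way to approach this problem with a heap is sketched below: keep a min-heap holding only the k largest numbers seen so far, so its root is the k-th largest. The function name kth_largest_stream is hypothetical, duplicates are counted as separate numbers, and None is reported until k numbers have arrived; both are assumptions, since the problem statement does not say.

```python
import heapq


def kth_largest_stream(stream, k):
    """Yield the k-th largest number seen so far after each arrival."""
    heap = []  # min-heap of the k largest numbers seen so far
    for x in stream:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            # x displaces the current k-th largest
            heapq.heapreplace(heap, x)
        # Fewer than k numbers seen: no answer yet (assumed convention)
        yield heap[0] if len(heap) == k else None


results = list(kth_largest_stream([1, 2, 3, 4, 5, 1, 2, 3, 6], 3))
# results == [None, None, 1, 2, 3, 3, 3, 3, 4]
```

Memory stays at k numbers regardless of the stream length, and each arrival costs O(log k).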
Selection Tree (Tournament Tree): winner tree
(almost) complete binary tree
N items as leaves
N-1 items as internal nodes (the winners of the matches)
Build the tree in O(N) time
Replacing an item takes O(log N) time: each replacement needs a rematch along the path from that leaf to the root
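A winner tree along these lines might be sketched as below. The class name WinnerTree and the flat-array layout (internal nodes at 1..N-1, leaves at N..2N-1) are choices made for this example, and the sketch assumes the smaller key wins each match and that there is at least one item.

```python
class WinnerTree:
    """Tournament tree over N items; each internal node stores the winner's index."""

    def __init__(self, items):
        self.keys = list(items)
        self.n = n = len(self.keys)
        # tree[1..n-1] are internal nodes, tree[n..2n-1] are leaves
        self.tree = [0] * (2 * n)
        for i in range(n):
            self.tree[n + i] = i
        # Play the N-1 matches bottom-up: O(N) build
        for node in range(n - 1, 0, -1):
            self.tree[node] = self._play(self.tree[2 * node], self.tree[2 * node + 1])

    def _play(self, i, j):
        # Assumption: the smaller key wins the match
        return i if self.keys[i] <= self.keys[j] else j

    def winner(self):
        """Index of the overall winner."""
        return self.tree[1]

    def replace(self, i, key):
        """Replace item i and replay the matches on its path to the root: O(log N)."""
        self.keys[i] = key
        node = (self.n + i) // 2
        while node >= 1:
            self.tree[node] = self._play(self.tree[2 * node], self.tree[2 * node + 1])
            node //= 2


wt = WinnerTree([5, 3, 8, 1, 7])
w0 = wt.winner()      # index 3 (key 1)
wt.replace(3, 10)
w1 = wt.winner()      # index 1 (key 3)
wt.replace(1, 9)
w2 = wt.winner()      # index 0 (key 5)
```

Note that each rematch reads both children of a node, which is the cost the loser tree avoids.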
Loser tree
(almost) complete binary tree
N items as leaves
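N-1 items as internal nodes; the bigger number wins the match
We store the loser of each match in its internal node
We store the final winner separately
The final winner can be updated after a replacement by replaying only the matches on the path from the changed leaf to the root
A loser tree is useful when following an edge means communication, e.g. when the nodes live on 2 different computers: a rematch only compares the incoming item with the loser stored at each parent, so it never has to read the sibling node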
Usage: merging M sorted arrays
Note: its importance is partly historical. In the past, data were stored on disks, so using a loser tree could save a lot of time.
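A merge of M sorted arrays with a loser tree could look like the sketch below. The function name merge_sorted is hypothetical. Two assumptions differ from the notes: the smaller key wins each match (so the output is ascending), and exhausted arrays are represented by an infinite sentinel, which requires numeric keys smaller than infinity.

```python
def merge_sorted(arrays):
    """Merge ascending numeric arrays using a loser tree."""
    k = len(arrays)
    if k == 0:
        return []
    INF = float("inf")   # sentinel key for an exhausted array (assumption)
    pos = [0] * k        # next unread position in each array

    def key(i):
        return arrays[i][pos[i]] if pos[i] < len(arrays[i]) else INF

    # Build: play every match once, keeping winners only temporarily
    loser = [0] * k      # loser[1..k-1]: match losers; loser[0]: final winner
    winner = [0] * (2 * k)
    for i in range(k):
        winner[k + i] = i
    for node in range(k - 1, 0, -1):
        a, b = winner[2 * node], winner[2 * node + 1]
        if key(a) <= key(b):
            winner[node], loser[node] = a, b
        else:
            winner[node], loser[node] = b, a
    loser[0] = winner[1]

    out = []
    for _ in range(sum(len(a) for a in arrays)):
        w = loser[0]
        out.append(arrays[w][pos[w]])
        pos[w] += 1
        # Rematch: compare only with the loser stored at each ancestor
        node = (k + w) // 2
        while node >= 1:
            if key(loser[node]) < key(w):
                loser[node], w = w, loser[node]
            node //= 2
        loser[0] = w
    return out


merged = merge_sorted([[1, 4, 7], [2, 5, 8], [3, 6, 9, 10], []])
```

Each output element costs one walk up the tree, so merging N elements in total takes O(N log M).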