堆定义
根据维基百科:
堆是一种特别的完全二叉树。若是满足以下特性,即可称为堆:“给定堆中任意节点P和C,若P是C的父节点,那么P的值会小于等于(或大于等于)C的值”。若父节点的值恒小于等于子节点的值,此堆称为最小堆(min heap);反之,若父节点的值恒大于等于子节点的值,此堆称为最大堆(max heap)。
在队列中,调度程序反复提取队列中第一个作业并运行,因为实际情况中某些时间较短的任务将等待很长时间才能结束,或者某些不短小,但具有重要性的作业,同样应当具有优先权。堆即为解决此类问题设计的一种数据结构。
堆分为最大堆和最小堆。
堆的常用方法:
- 构建优先队列
- 支持堆排序
- 快速找出一个集合中的最小值(或者最大值)
完全二叉树和满二叉树区别如下:
由于堆是完全二叉树,所以可以用顺序数组来表示,如下图。
container/heap
定义
heap定义了heap.Interface,并且提供堆方法:Init/Push/Pop/Remove/Fix。
container/heap
为最小堆,即每个节点的值都小于它的子树的所有元素的值(A heap is a tree with the property that each node is the minimum-valued node in its subtree)。
Package heap provides heap operations for any type that implements heap.Interface. A heap is a tree with the property that each node is the minimum-valued node in its subtree. The minimum element in the tree is the root, at index 0. A heap is a common way to implement a priority queue. To build a priority queue, implement the Heap interface with the (negative) priority as the ordering for the Less method, so Push adds items while Pop removes the highest-priority item from the queue. The Examples include such an implementation; the file example_pq_test.go has the complete source.
heap的interface如下所示:
// heap
type Interface interface {
sort.Interface
Push(x interface{}) // add x as element Len()
Pop() interface{} // remove and return element Len() - 1.
}
//sort
type Interface interface {
// Len is the number of elements in the collection.
Len() int
// Less reports whether the element with
// index i should sort before the element with index j.
Less(i, j int) bool
// Swap swaps the elements with indexes i and j.
Swap(i, j int)
}
使用(优先级队列)
container/heap包可以用来构造优先级队列。
go源码里面实现了优先级队列,路径参考:go/1.14/libexec/src/container/heap/example_pq_test.go
代码如下:
package heap_test
import (
"container/heap"
"fmt"
)
// An Item is something we manage in a priority queue.
// 优先级队列中管理的Item
type Item struct {
value string // The value of the item; arbitrary.
priority int // The priority of the item in the queue.
// The index is needed by update and is maintained by the heap.Interface methods.
index int // The index of the item in the heap.
}
// A PriorityQueue implements heap.Interface and holds Items.
type PriorityQueue []*Item
func (pq PriorityQueue) Len() int { return len(pq) }
func (pq PriorityQueue) Less(i, j int) bool {
// We want Pop to give us the highest, not lowest, priority so we use greater than here.
// 因为想要从高到低输出,所以这边反着写
return pq[i].priority > pq[j].priority
}
func (pq PriorityQueue) Swap(i, j int) {
pq[i], pq[j] = pq[j], pq[i]
pq[i].index = i
pq[j].index = j
}
func (pq *PriorityQueue) Push(x interface{}) {
// 增加元素
n := len(*pq)
item := x.(*Item)
item.index = n
*pq = append(*pq, item)
}
func (pq *PriorityQueue) Pop() interface{} {
// pop出元素
old := *pq
n := len(old)
item := old[n-1]
old[n-1] = nil // avoid memory leak
item.index = -1 // for safety
*pq = old[0 : n-1]
return item
}
// update modifies the priority and value of an Item in the queue.
// 更新Item的值和优先级,同时更新堆
func (pq *PriorityQueue) update(item *Item, value string, priority int) {
item.value = value
item.priority = priority
heap.Fix(pq, item.index)
}
// This example creates a PriorityQueue with some items, adds and manipulates an item,
// and then removes the items in priority order.
func Example_priorityQueue() {
// Some items and their priorities.
items := map[string]int{
"banana": 3, "apple": 2, "pear": 4,
}
// Create a priority queue, put the items in it, and
// establish the priority queue (heap) invariants.
pq := make(PriorityQueue, len(items))
i := 0
for value, priority := range items {
pq[i] = &Item{
value: value,
priority: priority,
index: i,
}
i++
}
heap.Init(&pq)
// Insert a new item and then modify its priority.
item := &Item{
value: "orange",
priority: 1,
}
heap.Push(&pq, item)
pq.update(item, item.value, 5)
// Take the items out; they arrive in decreasing priority order.
for pq.Len() > 0 {
item := heap.Pop(&pq).(*Item)
fmt.Printf("%.2d:%s ", item.priority, item.value)
}
// Output:
// 05:orange 04:pear 03:banana 02:apple
}
heap源码分析
heap.init
// Init establishes the heap invariants required by the other routines in this package.
// Init is idempotent with respect to the heap invariants
// and may be called whenever the heap invariants may have been invalidated.
// The complexity is O(n) where n = h.Len().
// 初始化堆
func Init(h Interface) {
// heapify
n := h.Len()
for i := n/2 - 1; i >= 0; i-- {
down(h, i, n)
}
}
Init方法初始化一个堆。关键在于down函数。
func down(h Interface, i0, n int) bool {
i := i0
for {
j1 := 2*i + 1
if j1 >= n || j1 < 0 { // j1 < 0 after int overflow
break
}
j := j1 // left child
if j2 := j1 + 1; j2 < n && h.Less(j2, j1) {
j = j2 // = 2*i + 2 // right child
}
if !h.Less(j, i) {
break
}
h.Swap(i, j)
i = j
}
return i > i0
}
down函数的功能:给定类型,需要down(下沉)的元素在数组中的索引,heap的长度,将该元素下沉到该元素对应的子树合适的位置,从而满足该子树为最小堆的要求。
down函数的实现:任选一个元素 i ,将其与它的子节点 2i+1 和 2i+2比较,如果元素 i 比它的子节点小,则将元素 i 与两个子节点中较小的节点交换(j),从而保证满足最小树的要求(第一次down);子节点 j 可能也有它的子节点,继续比较、交换,直到数组末尾,或者元素 i 比它的两个子节点都小,跳出循环()。
为什么元素 i 比它的两个子节点都小,就可以跳出循环,不再继续下去呢?这是由于,在Init函数中,第一个开始down(下沉)的元素是第 n/2 - 1 个,可以保证总是从最后一棵子树开始down(如前图,n=8或者n=9, n/2-1总是为4),因此可以保证Init->down时,如果元素 i 比它的两个子节点都小,那么该元素对应的子树,就是最小堆。
Init在遍历完毕后,可以保证,数组是一个最小堆。
heap.push
// Push pushes the element x onto the heap.
// The complexity is O(log n) where n = h.Len().
func Push(h Interface, x interface{}) {
h.Push(x)
up(h, h.Len()-1)
}
可以看到会先调用用户实现的Push接口,即把数据放在数组最后。然后再调用up,把这个数”上浮“,调整为最小堆。
参考文章:
1. https://www.jianshu.com/p/6b526aa481b1
2. https://ieevee.com/tech/2018/01/29/go-heap.html
3. https://zh.wikipedia.org/wiki/%E5%A0%86%E7%A9%8D