算法导论第六章 堆排序
一 概念
1.(二叉)堆是数据结构的一种数组对象(堆是数组而不是一般的树)
2.但它可以视为一棵完全二叉树,二叉树的层次遍历对就数组元素的顺序对应,树根对应A[1],
对于第i个元素,有以下主要关系:
PARENT(i) return i/2 // its parent
LEFT(i) return 2*i // its left child
RIGHT(i) return 2*i+1
3.最大堆(根结点最大):
A[PARENT(i)] >= A[i]; // 父结点不小于子结点
最小堆:
A[PARENT(i)] <= A[i]; // 父结点不大于子结点
4.堆的高度:
从本结点到叶子结点的最长简单下降路径上边的数目,定义堆的高度为树根的高度,叶子结点高度为0,对于n个结点的堆,高度为 floor(lgn)即lgn向下取整数
二。堆的数据结构
1. A[N]: 堆数组
length[A]: 数组中元素的个数
heap-size[A]: 存放在A中的堆的元素个数
2. 操作:
(1)MAX—HEAPIFY(A,i)过程O(lgn),保持最大堆性质的关键
(2)BUILD—MAX—HEAP 过程O(n),构造最大堆
(3)HEAPSORT 运行时间O(n*lgn),对数组进行排序
(4)堆运用于优先级队列:
MAX-HEAP-INSERT(A, val) // insert a element to A
HEAP-EXTRACT-MAX(A) // remove and return the max element of A
HEAP-INCREASE-KEY(A, i, ival) // increase the ith element's val to ival
MAXIMUM(A) // return the element which has the greatest val
三,程序实现
#include <stdio.h>
#include <stdlib.h>
#define PARENT(i) (i) >> 1 // parent of i
#define LEFT(i) (i) << 1 // left child of i
#define RIGHT(i) ((i)<<1) + 1 // right child of i
#define MINVAL (signed)(1 << 31)
static int heap_size = 0;
static int length = 0;
/*
* 以LEFT(i)和RIGHT(i)为根的两棵二叉树都是已建好的最大堆
* 但A[i]可能小于其孩子,此时对其"下降"调整使满足最大堆性质
* recusion Version
*/
void max_heapify(int *heap, int i)
{
int l, r, largest;
l = LEFT(i);
r = RIGHT(i);
if (l <= heap_size && heap[l] > heap[i])
largest = l;
else
largest = i;
if (r <= heap_size && heap[r] > heap[largest])
largest = r;
if (largest != i) { // need to be adjusted
int t = heap[i];
heap[i] = heap[largest];
heap[largest] = t;
max_heapify(heap, largest);
}
}
/* not recursion version */
void max_heapify02(int *heap, int i)
{
int l, r, largest;
int t;
while (1) {
l = LEFT(i);
r = RIGHT(i);
if (l <= heap_size && heap[l] > heap[i])
largest = l;
else
largest = i;
if (r <= heap_size && heap[r] > heap[largest])
largest = r;
if (largest != i) {
t = heap[largest];
heap[largest] = heap[i];
heap[i] = t;
i = largest;
} else
break;
}
}
void build_max_heap(int *heap)
{
int i;
for (i = heap_size / 2; i >= 1; i--)
max_heapify(heap, i);
}
void heapsort(int *heap)
{
int i, t;
build_max_heap(heap);
for (i = length; i >= 2; i--) {
t = heap[i];
heap[i] = heap[1];
heap[1] = t;
heap_size--;
max_heapify(heap, 1);
}
heap_size = length;
}
int heap_maximum(int *heap)
{
return heap[1];
}
int heap_extract_max(int *heap)
{
int max;
if (heap_size < 1) {
fprintf(stderr, "heap underflow\n");
exit(0);
}
max = heap[1];
heap[1] = heap[heap_size];
heap_size--;
max_heapify(heap, 1);
return max;
}
/* the key must not less then the original val heap[i] */
void heap_increase_key(int *heap, int i, int key)
{
int t;
if (i < 1 || i > heap_size) {
fprintf(stderr, "invalid position of %d\n", i);
exit(-1);
}
if (key < heap[i]) {
fprintf(stderr, "new key is smaller than current key\n");
exit(-1);
}
while (i > 1 && key > heap[PARENT(i)]) {
heap[i] = heap[PARENT(i)];
i = PARENT(i);
}
heap[i] = key;
}
void max_heap_insert(int *heap, int key)
{
heap_size++;
heap[heap_size] = MINVAL;
heap_increase_key(heap, heap_size, key);
}
void print_heap(int *heap)
{
int i;
for (i = 1; i <= heap_size; i++)
printf("%3d ", heap[i]);
printf("\n"); } int main() { //int arr[] = { 0, 10, 9, 2, 100, 300, 56, 67, 3, 2, 2, 20, 10 }; int arr[] = { 0, 100, 2, 20, 10 }; int i, x; length = sizeof(arr) / sizeof(*arr) - 1; heap_size = length; printf("original array:\n"); print_heap(arr); build_max_heap(arr); printf("max heap:\n"); print_heap(arr); /* printf("after sorted:\n"); heapsort(arr); print_heap(arr); */ printf("after extract max:\n"); heap_extract_max(arr); print_heap(arr); x = 33; printf("insert %d to the heap:\n", x); max_heap_insert(arr, x); print_heap(arr); return 0; }
四,部分习题解答:
6.1-1: 2^(h+1) 2^h
6.1-2: 含n个元素的堆的高度为 lgn 取下整数
1+2+4+...+2^(h-1) < n <= 1+2+4+..2^h
(1-2^h)/(1-2) < n <= (1-2^(h+1))/(1-2)
2^h-1 < n <= 2^(h+1)-1 --> 2^h <= n < 2^(h+1)
--> h <= lgn < h+1 --> h = floor(lgn)
6.1-3: 由堆的定义可得
6.1-4:叶子结点
6.1-5: No
6.1-6:no
6.1-7: 最后一个结点n的父结点为n/2,其父结点之后为叶子结点
6.2-2:minmum heap
MIN-HEAPIFY(A, i)
l = LEFT(i)
r = RIGHT(i)
if l <= heap-size[A] and A[l] < A[i]
smallest = l;
else
smallest = i;
if r <= heap-size[A] and A[r] < A[smallest]
smallest = r
if smallest != i
A[i] <-> A[smallest]
MIN-HEAPIFY(A, smallest)
6.2-3:
程序执行一次测试立刻退出,不改变堆
6.2-4:
一次测试后立刻退出,堆不变
6.2-5:
见上面代码 not recursion ersion
6.2-6:
一直不知道证明???
6.3-2:
从下至上建堆,可以利用建好的堆调用MAX-HEAPIFY
6.3-3:
n <= 2^(h+1)
第0层最多 2^h
第1层最多 2^(h-1)
...
第h层(根层) 1 >= n / (2^(h+1))
6.5-7:
因为insert程序调用HEAP—INCREASE—KEY(),而increase函数要求插入的key大于原来的值,所以先把原来的值设为最小值。
6.5-8:
见堆排序k路合并
思考题
6-1: 用插入法建堆
BUILD-MAX-HEAP02(A)
heap-size[A] = 1
for i = 2 to length[A]
MAX-HEAP-INSERT(A, A[i])
MAX-HEAP-INSERT(A, A[i])
同上
a) 不一样 A = { 1, 2, 3 }
6-2: d叉堆的分析
d叉堆与二叉堆类似,只是其中每一个非叶子结点有d个子女
PARENT(i) = (i-2)/d + 1
LEFTMOST(i) = d*(i-1) + 2
//RIGHTMOST(i) = d*i+1
CHILD(i, j) = d * (i-1) + j + 1
a) 根结点为A[1],根结点的孩子为A[2],A[3]...A[d+1]
b) lgn / lgd
c) O(lgn/lgd * d)
EXTRACT-MAX(A)
max = A[1]
A[1] = A[heap-size[A]]
heap-size[A] = heap-size[A]-1
MAX-HEAPIFY(A, 1)
其中:
MAX-HEAPIFY(A, i)
l = LEFTMOST(i)
r = RIGHTMOST(i)
largest = i;
while (l <= r && l <= heap-size[A])
if (A[l] > A[largest])
largest = l
l <- l+1
if largest != i
A[i] <-> A[largest]
MAX-HEAPIFY(A, largest)
d) lgn / lgd
MAX-HEAP-INSERT(A, key)
heap-size[A] <- heap-size[A] + 1
A[heap-size[A]] <- 无穷小
HEAP-INCREASE-KEY(A, heap-size[A], key)
e) lgn / lg d
HEAP-INCREASE-KEY(A, i, key)
//if key < A[i]
// then error "new key is samller than current key"
// A[i] <- key
//while i > 1 and A[PARENT(i)] < A[i]
// do exchange A[i] <-> A[PARENT(i)]
// i <- PARENT(i)
key <- MAX(A[i], key)
while i > 1 and A[PARENT(i)] < key
do A[i] <- A[PARENT(i)]
i = PARENT(i)
A[i] = key