[Data Structures] Heap

Protein_zmm

Article link: https://blog.csdn.net/weixin_51304981/article/details/125744129

1. Heap Concept and Structure

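A heap is a complete binary tree, which makes it a natural fit for array storage.
Max-heap: in the tree and in every one of its subtrees, each parent is greater than or equal to its children. The root holds the largest value, so this is also called a max heap or big-root heap.
Min-heap: in the tree and in every one of its subtrees, each parent is less than or equal to its children. The root holds the smallest value, so this is also called a min heap or small-root heap.
Heap property:
The value of every node is never greater than its parent's value (max-heap), or never less than its parent's value (min-heap).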

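Logical structure and storage (physical) structure of a min-heap:
[Figure: a min-heap shown as a tree and as an array]
Logical structure: the one we picture in our heads, a complete binary tree.
Physical structure: what is actually stored in memory, an array.
A heap is not necessarily sorted: the left child may be smaller than the right child, or larger.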
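
The link between the two structures is plain index arithmetic on a 0-based array. The sketch below spells it out and uses it to check the min-heap property. The helper names (ParentIndex, LeftChildIndex, RightChildIndex, IsMinHeap) are illustrative and do not come from the article's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Index arithmetic for a heap stored in a 0-based array. */
int ParentIndex(int child)      { return (child - 1) / 2; }
int LeftChildIndex(int parent)  { return parent * 2 + 1; }
int RightChildIndex(int parent) { return parent * 2 + 2; }

/* Returns true if a[0..n-1] satisfies the min-heap property,
   i.e. no node is smaller than its parent.
   (Hypothetical helper, not part of the article's heap API.) */
bool IsMinHeap(const int* a, int n)
{
	assert(a);
	for (int child = 1; child < n; ++child)
	{
		if (a[child] < a[ParentIndex(child)])
		{
			return false;
		}
	}
	return true;
}
```

For example, {10, 15, 56, 25, 30, 70} passes the check even though it is not sorted, while {10, 15, 56, 5, 30, 70} fails because 5 at index 3 is smaller than its parent 15 at index 1.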

2. Heap Implementation

2.1 Inserting into a Max-Heap

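[Figure: inserting into a max-heap]
When inserting x = 8, the value can simply be appended at the end.

If x = 60, however, appending alone is not enough, because the result would no longer be a max-heap.
[Figure: inserting x = 60 into the max-heap]
Inserting a value does not disturb the rest of the heap; it can only affect the parent-child relationships along the path from the new node up to the root.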

2.2 Sift-Up Algorithm (AdjustUp)

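Adjustment process:
[Figure: sift-up steps]
When does the adjustment end?

Can while (parent >= 0) be used as the loop condition?
// No

Because the update step is:
child = parent;
parent = (child - 1) / 2;
When parent reaches 0 it is assigned to child, so child is 0 as well. Recomputing parent gives (0 - 1) / 2 == 0, because C integer division truncates toward zero. parent therefore never becomes negative and parent >= 0 is always true: the condition can never end the loop, which would exit only through the break (and would spin forever if the comparison also accepted equal values).
So the loop should stop once child reaches 0, i.e. the condition should be child > 0.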

void AdjustUp(int* a, int child) 
{
	assert(a);
	int parent = (child - 1) / 2;
	// while (parent >= 0) would not work: parent never goes below 0
	while (child > 0)
	{
		if (a[child] > a[parent])
		{
			HPDataType tmp = a[child];
			a[child] = a[parent];
			a[parent] = tmp;
			child = parent;
			parent = (child - 1) / 2;
		}
		else
		{
			break;
		}
	}
}
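
To make the two insertion cases from section 2.1 concrete, here is a minimal self-contained sketch. The starting array {70, 56, 30, 25, 15, 10} is an assumed example (the article's figure is not available), and AdjustUpMax and InsertMax are illustrative names for a plain-int version of the sift-up above.

```c
#include <assert.h>

/* Sift-up for a max-heap of ints, same logic as AdjustUp above. */
void AdjustUpMax(int* a, int child)
{
	assert(a);
	int parent = (child - 1) / 2;
	while (child > 0)
	{
		if (a[child] <= a[parent])
		{
			break; /* parent is at least as large: heap restored */
		}
		int tmp = a[child];
		a[child] = a[parent];
		a[parent] = tmp;
		child = parent;
		parent = (child - 1) / 2;
	}
}

/* Appends x to a max-heap holding *size elements and restores the heap.
   The caller must guarantee room for one more element. */
void InsertMax(int* a, int* size, int x)
{
	assert(a && size);
	a[*size] = x;
	(*size)++;
	AdjustUpMax(a, *size - 1);
}
```

With that starting array, inserting 8 leaves it at index 6 because its parent 30 is larger. Inserting 60 afterwards places it at index 7 and moves it up past 25 and 56, stopping below 70, which gives {70, 60, 30, 56, 15, 10, 8, 25}.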

2.3 Deleting from a Min-Heap

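Deleting from a heap means removing the top element, i.e. extracting the extreme value (the minimum or the maximum).
[Figure: removing the top of a min-heap]
Swap the top with the last element, drop the last element, then sift down from the root to turn the array back into a heap.
At each step, swap the parent with the smaller of its two children.
Stop as soon as either condition holds:
1. The parent is less than or equal to its smaller child.
2. The node being adjusted has become a leaf.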
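
The deletion steps can be sketched on a plain int array as follows. This is a minimal illustration under assumed names (SwapInt, AdjustDownMin, PopMin), not the article's HP-based API.

```c
#include <assert.h>

void SwapInt(int* px, int* py)
{
	int tmp = *px;
	*px = *py;
	*py = tmp;
}

/* Sift-down for a min-heap stored in a[0..n-1]. */
void AdjustDownMin(int* a, int n, int parent)
{
	assert(a);
	int child = parent * 2 + 1;
	while (child < n) /* stops when parent has become a leaf */
	{
		if (child + 1 < n && a[child + 1] < a[child])
		{
			child++; /* the right child exists and is smaller */
		}
		if (a[parent] <= a[child])
		{
			break; /* parent <= smaller child: heap restored */
		}
		SwapInt(&a[child], &a[parent]);
		parent = child;
		child = parent * 2 + 1;
	}
}

/* Removes and returns the minimum of a min-heap holding *size elements. */
int PopMin(int* a, int* size)
{
	assert(a && size && *size > 0);
	int top = a[0];
	SwapInt(&a[0], &a[*size - 1]); /* move the last element to the top */
	(*size)--;                     /* the old top is no longer in the heap */
	AdjustDownMin(a, *size, 0);
	return top;
}
```

Starting from the assumed min-heap {10, 15, 56, 25, 30, 70}, one call returns 10 and leaves {15, 25, 56, 70, 30} in the first five slots: 70 moves to the top, then swaps with 15 and with 25 before becoming a leaf.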

2.4 Sift-Down Algorithm (AdjustDown)

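void AdjustDown(int* a, int n, int parent) // n is the number of elements in the array
{
	assert(a);
	int child = parent * 2 + 1; // start with the left child
	while (child < n) // child must stay inside the array
	{
		// This listing is the max-heap version: it picks the larger child.
		// For a min-heap, change both '>' comparisons below to '<'.
		if (child + 1 < n && a[child + 1] > a[child]) // make sure the right child exists
		{
			child++;
		}
		if (a[child] > a[parent]) // '>' gives a max-heap, '<' gives a min-heap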
		{
			Swap(&a[child], &a[parent]);
			parent = child;
			child = parent * 2 + 1;
		}
		else
		{
			break;
		}
	}
}

2.5 Max-Heap Implementation

#include "Heap.h"
void Swap(HPDataType* px, HPDataType* py)
{
	HPDataType tmp = *px;
	*px = *py;
	*py = tmp;
}
void HeapInit(HP* hp)
{
	assert(hp);
	hp->a = NULL;
	hp->size = hp->capacaty = 0;
}
void HeapDestroy(HP* hp)
{
	assert(hp);
	free(hp->a);
	hp->a = NULL; // do not leave a dangling pointer behind
	hp->size = hp->capacaty = 0;
}
void AdjustUp(int* a, int child)
{
	assert(a);
	int parent = (child - 1) / 2;
	// while (parent >= 0) would not work: parent never goes below 0
	while (child > 0)
	{
		if (a[child] > a[parent])
		{
			/*HPDataType tmp = a[child];
			a[child] = a[parent];
			a[parent] = tmp;*/
			Swap(&a[child], &a[parent]);
			child = parent;
			parent = (child - 1) / 2;
		}
		else
		{
			break;
		}
	}
}
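void AdjustDown(int* a, int n, int parent) // n is the number of elements in the array
{
	assert(a);
	int child = parent * 2 + 1;
	while (child < n) // child must stay inside the array
	{
		// Max-heap: pick the larger child (a min-heap uses '<' in both comparisons)
		if (child + 1 < n && a[child + 1] > a[child]) // make sure the right child exists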
		{
			child++;
		}
		if (a[child] > a[parent])
		{
			Swap(&a[child], &a[parent]);
			parent = child;
			child = parent * 2 + 1;
		}
		else
		{
			break;
		}
	}
}
void HeapPush(HP* hp, HPDataType x) 
{
	assert(hp);
	if (hp->size == hp->capacaty)
	{
		size_t newCapacity = hp->capacaty == 0 ? 4 : hp->capacaty * 2;
		HPDataType* tmp = realloc(hp->a, sizeof(HPDataType) * newCapacity);
		if (tmp == NULL)
		{
			printf("realloc fail");
			exit(-1);
		}
		hp->a = tmp;
		hp->capacaty = newCapacity;
	}
	// After pushing x, the array must still be a heap
	hp->a[hp->size] = x; // e.g. with 6 elements at indices 0-5, x goes to index 6
	hp->size++;
	AdjustUp(hp->a, hp->size - 1);
}
void HeapPrint(HP* hp)
{
	for (int i = 0; i < hp->size; ++i)
	{
		printf("%d ", hp->a[i]);
	}
	puts("");
}
bool HeapEmpty(HP* hp)
{
	assert(hp);
	return hp->size == 0;
}
int HeapSize(HP* hp)
{
	assert(hp);
	return hp->size;
}
void HeapPop(HP* hp)
{
	assert(hp);
	assert(!HeapEmpty(hp));
	Swap(&hp->a[0], &hp->a[hp->size - 1]);
	hp->size--;
	AdjustDown(hp->a, hp->size, 0); // sift down starting from index 0
}
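
The listing above includes Heap.h, which the article does not reproduce, and the later PrintTopK function calls HeapTop, which is not defined in it either. The following is an assumed reconstruction that is merely consistent with the code shown: the element type int is inferred from the "%d" in HeapPrint, the int type of size and capacaty is a guess, and the field name capacaty keeps the article's spelling.

```c
/* Heap.h: an assumed reconstruction, not the author's actual header. */
#ifndef HEAP_H
#define HEAP_H

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

typedef int HPDataType; /* assumed from the "%d" in HeapPrint */

typedef struct Heap
{
	HPDataType* a; /* dynamically grown array holding the heap */
	int size;      /* number of elements currently stored */
	int capacaty;  /* allocated slots (spelling as in the article) */
} HP;

void Swap(HPDataType* px, HPDataType* py);
void HeapInit(HP* hp);
void HeapDestroy(HP* hp);
void AdjustUp(int* a, int child);
void AdjustDown(int* a, int n, int parent);
void HeapPush(HP* hp, HPDataType x);
void HeapPop(HP* hp);
void HeapPrint(HP* hp);
bool HeapEmpty(HP* hp);
int HeapSize(HP* hp);
HPDataType HeapTop(HP* hp);

#endif /* HEAP_H */

/* A possible HeapTop: returns the top element without removing it.
   This definition would normally live in the .c file. */
HPDataType HeapTop(HP* hp)
{
	assert(hp);
	assert(hp->size > 0);
	return hp->a[0];
}
```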

2.6 Min-Heap Implementation

#include "Heap.h"
void Swap(HPDataType* px, HPDataType* py)
{
	HPDataType tmp = *px;
	*px = *py;
	*py = tmp;
}
void HeapInit(HP* hp)
{
	assert(hp);
	hp->a = NULL;
	hp->size = hp->capacaty = 0;
}
void HeapDestroy(HP* hp)
{
	assert(hp);
	free(hp->a);
	hp->a = NULL; // do not leave a dangling pointer behind
	hp->size = hp->capacaty = 0;
}
void AdjustUp(int* a, int child)
{
	assert(a);
	int parent = (child - 1) / 2;
	// while (parent >= 0) would not work: parent never goes below 0
	while (child > 0)
	{
		if (a[child] < a[parent])
		{
			/*HPDataType tmp = a[child];
			a[child] = a[parent];
			a[parent] = tmp;*/
			Swap(&a[child], &a[parent]);
			child = parent;
			parent = (child - 1) / 2;
		}
		else
		{
			break;
		}
	}
}
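void AdjustDown(int* a, int n, int parent) // n is the number of elements in the array
{
	assert(a);
	int child = parent * 2 + 1;
	while (child < n) // child must stay inside the array
	{
		// Min-heap: pick the smaller child (a max-heap uses '>' in both comparisons)
		if (child + 1 < n && a[child + 1] < a[child]) // make sure the right child exists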
		{
			child++;
		}
		if (a[child] < a[parent])
		{
			Swap(&a[child], &a[parent]);
			parent = child;
			child = parent * 2 + 1;
		}
		else
		{
			break;
		}
	}
}
void HeapPush(HP* hp, HPDataType x) 
{
	assert(hp);
	if (hp->size == hp->capacaty)
	{
		size_t newCapacity = hp->capacaty == 0 ? 4 : hp->capacaty * 2;
		HPDataType* tmp = realloc(hp->a, sizeof(HPDataType) * newCapacity);
		if (tmp == NULL)
		{
			printf("realloc fail");
			exit(-1);
		}
		hp->a = tmp;
		hp->capacaty = newCapacity;
	}
	// After pushing x, the array must still be a heap
	hp->a[hp->size] = x; // e.g. with 6 elements at indices 0-5, x goes to index 6
	hp->size++;
	AdjustUp(hp->a, hp->size - 1);
}
void HeapPrint(HP* hp)
{
	for (int i = 0; i < hp->size; ++i)
	{
		printf("%d ", hp->a[i]);
	}
	puts("");
}
bool HeapEmpty(HP* hp)
{
	assert(hp);
	return hp->size == 0;
}
int HeapSize(HP* hp)
{
	assert(hp);
	return hp->size;
}
void HeapPop(HP* hp)
{
	assert(hp);
	assert(!HeapEmpty(hp));
	Swap(&hp->a[0], &hp->a[hp->size - 1]);
	hp->size--;
	AdjustDown(hp->a, hp->size, 0); // sift down starting from index 0
}

3. Applications of Heaps

3.1 The Top-k Problem

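Given N numbers, find the K largest (or smallest).
Approach 1: sort everything and take the first K, O(N log N). Every element has to be sorted.
Approach 2: build a heap from all N numbers (O(N) with bottom-up construction; inserting them one at a time costs O(N log N) in the worst case), then pop K times (O(K log N)), taking the top each time. The total is O(N + K log N), and K is usually much smaller than N.
Each sift-up or sift-down takes at most as many steps as the height of the complete binary tree, which lies between log2(N+1) and log2(N)+1, so it is treated as log N.
Approach 3: suppose N is huge, say one billion, so the data does not fit in memory and lives in a file, with K = 100. Approaches 1 and 2 no longer work, since both keep all the data in an array, and one billion 4-byte integers take about 4 × 1 GB = 4 GB.
1 GB = 1024 MB = 1024 × 1024 KB = 1024 × 1024 × 1024 bytes, roughly one billion bytes.
Method: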

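1. Build a min-heap of K elements from the first K numbers.
2. Compare each of the remaining N-K numbers with the top of the heap; if it is larger than the top, replace the top with it and sift down.
3. When all numbers have been processed, the K numbers left in the heap are the K largest.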

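Time complexity:
Building a heap of K numbers: O(K)
Each of the remaining N-K numbers needs at most one sift-down over the heap's height (log K)
O(K + (N - K) log K), approximately O(N log K)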

void PrintTopK(int* a, int n, int k)
{
	HP hp;
	HeapInit(&hp);
	// Build a min-heap from the first K numbers
	for (int i = 0; i < k; ++i)
	{
		HeapPush(&hp, a[i]);
	}
	// Compare each of the remaining N-K numbers with the top; a larger one replaces it in the heap
	for (int i = k; i < n; ++i)
	{
		if (a[i] > HeapTop(&hp))
		{
			HeapPop(&hp);
			HeapPush(&hp, a[i]); 
			/*hp.a[0] = a[i];
			AdjustDown(hp.a, hp.size, 0);*/
		}
	}
	HeapPrint(&hp);
	HeapDestroy(&hp); 
}
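
PrintTopK above depends on the HP type and on HeapTop from Heap.h. As a standalone illustration of the same three steps, the sketch below works directly on int arrays and uses the variant that appears commented out in the listing: overwrite the top and sift down once, instead of a pop followed by a push. The names TopK and AdjustDownMin and the out parameter are assumptions made for this example.

```c
#include <assert.h>

/* Sift-down for a min-heap stored in a[0..n-1]. */
void AdjustDownMin(int* a, int n, int parent)
{
	int child = parent * 2 + 1;
	while (child < n)
	{
		if (child + 1 < n && a[child + 1] < a[child])
		{
			child++; /* pick the smaller child */
		}
		if (a[parent] <= a[child])
		{
			break;
		}
		int tmp = a[child];
		a[child] = a[parent];
		a[parent] = tmp;
		parent = child;
		child = parent * 2 + 1;
	}
}

/* Writes the k largest values of a[0..n-1] into out[0..k-1].
   out ends up in min-heap order (not sorted); requires 0 < k <= n. */
void TopK(const int* a, int n, int k, int* out)
{
	assert(a && out && k > 0 && k <= n);
	/* Step 1: build a min-heap from the first k numbers, O(k). */
	for (int i = 0; i < k; ++i)
	{
		out[i] = a[i];
	}
	for (int i = (k - 1 - 1) / 2; i >= 0; --i)
	{
		AdjustDownMin(out, k, i);
	}
	/* Step 2: a number larger than the top replaces it, O(log k) each. */
	for (int i = k; i < n; ++i)
	{
		if (a[i] > out[0])
		{
			out[0] = a[i];
			AdjustDownMin(out, k, 0);
		}
	}
	/* Step 3: out now holds the k largest values. */
}
```

Only the k-element buffer has to stay in memory, so with file input the numbers could be read one at a time in step 2. For the input {3, 17, 9, 42, 8, 25, 1, 30, 12, 5} and k = 3, the buffer ends as {25, 30, 42}.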

3.2 Heap Sort

3.2.1 Building a Heap with Sift-Up

Idea: treat the first element a[0] as a heap of one element, then add the following elements to the heap one at a time, sifting each one up.

for (int i = 1; i < n; ++i)
{
	AdjustUp(a, i);
}
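
A self-contained version of this loop, shown here for a max-heap, might look as follows. AdjustUpMax and BuildMaxHeapUp are illustrative names. Each element can climb up to the height of the tree, so this way of building costs O(N log N) in the worst case.

```c
#include <assert.h>

/* Sift-up for a max-heap of ints. */
void AdjustUpMax(int* a, int child)
{
	int parent = (child - 1) / 2;
	while (child > 0 && a[child] > a[parent])
	{
		int tmp = a[child];
		a[child] = a[parent];
		a[parent] = tmp;
		child = parent;
		parent = (child - 1) / 2;
	}
}

/* Turns a[0..n-1] into a max-heap in place by "inserting" a[1], a[2], ...
   into the heap formed by the elements before them. */
void BuildMaxHeapUp(int* a, int n)
{
	assert(a);
	for (int i = 1; i < n; ++i)
	{
		AdjustUpMax(a, i);
	}
}
```

For the assumed input {3, 9, 2, 7, 10} the result is {10, 9, 2, 3, 7}.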

3.2.2 Building a Heap with Sift-Down

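Precondition for sift-down: both the left and the right subtree must already be heaps.
Idea:
Leaves need no adjustment, because a single node is already a heap.
Start from the last non-leaf node (the parent of the last element) and work backwards to the root.

Example: building a min-heap
[Figure: building a min-heap with sift-down]
The first non-leaf node visited is 15; 15 < 69, so no adjustment is needed...
This continues until 30: 30 > 10, so it has to be sifted down.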

for (int i = ((n - 1) - 1) / 2; i >= 0; --i) // n-1 is the last index; (n-1-1)/2 is its parent
{
	// sift down
	AdjustDown(a, n, i);
}
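
The same loop as a self-contained min-heap sketch; AdjustDownMin and BuildMinHeapDown are illustrative names, and the sample array in the note below is an assumption rather than the data from the article's figure. Most nodes sit near the bottom of the tree and move only a few levels, which is why bottom-up construction costs O(N) overall.

```c
#include <assert.h>

/* Sift-down for a min-heap stored in a[0..n-1]. */
void AdjustDownMin(int* a, int n, int parent)
{
	int child = parent * 2 + 1;
	while (child < n)
	{
		if (child + 1 < n && a[child + 1] < a[child])
		{
			child++; /* pick the smaller child */
		}
		if (a[parent] <= a[child])
		{
			break;
		}
		int tmp = a[child];
		a[child] = a[parent];
		a[parent] = tmp;
		parent = child;
		child = parent * 2 + 1;
	}
}

/* Turns a[0..n-1] into a min-heap in place, bottom-up. */
void BuildMinHeapDown(int* a, int n)
{
	assert(a);
	/* Start at the parent of the last element; every node after it is a leaf. */
	for (int i = ((n - 1) - 1) / 2; i >= 0; --i)
	{
		AdjustDownMin(a, n, i);
	}
}
```

For {30, 69, 15, 10, 56, 25} the loop visits index 2 (15, no change), index 1 (69 swaps with 10) and index 0 (30 swaps with 10), giving {10, 30, 15, 69, 56, 25}.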

3.2.3 Ascending Sort: Max-Heap or Min-Heap?

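To sort an array in ascending order, should we build a max-heap or a min-heap?

Building a min-heap:
1. Select the smallest number and put it in the first position.
2. How do we select the second smallest? We would have to treat the remaining positions as a heap, but then all the parent-child relationships built earlier are broken, so the only option is to rebuild the heap before the next smallest can be selected.
Building a heap costs O(N), so the total is N + (N-1) + (N-2) + ... + 1, which is O(N^2).
So sorting in ascending order with a min-heap is possible, but it is too slow and makes no use of the heap's strengths.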

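Building a max-heap:
1. Build a max-heap, which selects the largest number.
2. Swap the largest number with the last one.
3. How do we select the second largest? Stop treating the last position as part of the heap and sift down from the root; the next largest is now on top. Repeat the process for the remaining elements.
[Figure: heap sort steps]
Ascending sort: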

void HeapSort(int* a, int n)
{
	assert(a);
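	// Build the heap: O(N)
	for (int i = ((n - 1) - 1) / 2; i >= 0; --i) // n-1 is the last index; (n-1-1)/2 is its parent
	{
		// sift down (AdjustDown must be the max-heap version for ascending order)
		AdjustDown(a, n, i);
	}
	// Sort: O(N*logN), N selections with one logN sift-down each
	// Ascending order uses a max-heap
	for (int end = n - 1; end > 0; --end)
	{
		Swap(&a[end], &a[0]);
		// Restore the heap so that the next largest value moves to the top
		AdjustDown(a, end, 0); // end is the index just filled, and also the size of the remaining heap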
	}
}
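
HeapSort above relies on Swap and on the max-heap version of AdjustDown from section 2.5. For a version that compiles on its own, here is a minimal sketch with the helpers inlined; SwapInt, SiftDownMax and HeapSortAscending are illustrative names.

```c
#include <assert.h>

void SwapInt(int* px, int* py)
{
	int tmp = *px;
	*px = *py;
	*py = tmp;
}

/* Sift-down for a max-heap stored in a[0..n-1]. */
void SiftDownMax(int* a, int n, int parent)
{
	int child = parent * 2 + 1;
	while (child < n)
	{
		if (child + 1 < n && a[child + 1] > a[child])
		{
			child++; /* pick the larger child */
		}
		if (a[child] <= a[parent])
		{
			break;
		}
		SwapInt(&a[child], &a[parent]);
		parent = child;
		child = parent * 2 + 1;
	}
}

/* Sorts a[0..n-1] in ascending order, in place. */
void HeapSortAscending(int* a, int n)
{
	assert(a);
	/* Build a max-heap bottom-up: O(N). */
	for (int i = ((n - 1) - 1) / 2; i >= 0; --i)
	{
		SiftDownMax(a, n, i);
	}
	/* Move the current maximum to the end, shrink the heap, repeat: O(N log N). */
	for (int end = n - 1; end > 0; --end)
	{
		SwapInt(&a[0], &a[end]);
		SiftDownMax(a, end, 0); /* the heap now covers a[0..end-1] */
	}
}
```

Sorting {5, 2, 9, 1, 5, 6} this way yields {1, 2, 5, 5, 6, 9}. Swapping the two '>' / '<=' comparisons in SiftDownMax would build a min-heap instead and produce descending order.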