小鑫的算法之路：堆

最新推荐文章于 2024-06-14 09:55:36 发布

m0_55372289

最新推荐文章于 2024-06-14 09:55:36 发布

阅读量100

点赞数

分类专栏：数据结构和算法文章标签：算法数据结构

本文链接：https://blog.csdn.net/m0_55372289/article/details/124957697

版权

数据结构和算法专栏收录该内容

12 篇文章 0 订阅

订阅专栏

背景

队列的特点是先进先出（FIFO），在日常生活中很常见，比如说食堂中就餐的队列，基本都是谁先排队，谁就先打餐吃饭。

然而，还有其他一些场景在生活中也很常见，比如说在医院排队求诊的病人，如果来了一位严重的病人，那么就需要急诊，否者如果按照普通队列排队等待。如果其他病人求诊完，那就会危及这位病人的生命，因此这位病人需要立即就医，优先级很高。就这引出了优先队列，优先队列的特点不是先进先出，而是基于优先级出队。

对于优先队列，在数据结构上不能依赖普通队列进行实现，而需要基于堆实现，才能高效实现入队和基于优先级出队的时间复杂度均为O(logn)。

二叉堆

特点

为了实现优先队列出队和入列的O(logn)的时间复杂度，往往通过二叉堆实现。

二叉堆是一颗完全二叉树，完全二叉树中的元素按照层级依次从左到右进行排列。

二叉堆中某个节点的优先级不能高于其父节点的优先级。在最大堆中，节点的值不大于父节点的值。在最小堆中，节点的值不小于父节点的值。

存储结构

由于二叉堆是一颗完全二叉树，可通过数组而非二叉树来存储二叉堆。在数组中，如果根节点的起始索引为0，节点的索引为i，那么其父节点和子节点的索引如下：

parent(i) = (i - 1) / 2 
left_child(i) = 2 * i + 1
right_child(i) = 2 * i + 2

代码示例

在二叉堆中，最主要的操作是节点上移和节点下移操作。以最大堆为例，代码如下：

template<typename T>
class maxheap {
public:
    maxheap() {}

    maxheap(size_t capacity) 
    {
        data_.reserve(capacity);
    }

    ~maxheap() = default;

    size_t size() const 
    {
        return data_.size();
    }

    bool empty() const
    {
        return data_.empty();
    }

    void add(const T& element) 
    {
        data_.emplace_back(element);
        sift_up(data_.size() - 1);
    }

    // 注意：如果二叉堆中无数据，获取max()是未定义的
    const T& max() const
    {
        return data_.front();
    }

    void pop()
    {
        if (data_.empty()) {
            return;
        }

        std::swap(data_.front(), data_.back());
        data_.pop_back();

        sift_down(0);
    }

private:
    void sift_up(std::ptrdiff_t index)
    {
        auto parent_index = parent(index);
        // 如果当前节点大于父亲节点的值，需要进行上移操作
        while (parent_index.has_value() && parent_index.value() >= 0 && (data_[index] > data_[parent_index.value()])) {
            std::swap(data_[parent_index.value()], data_[index]);
            index = parent_index.value();
            parent_index = parent(index);   
        } 
    }

    void sift_down(std::ptrdiff_t index)
    {
        auto left_child_index = left_child(index);
        // 如果当前节点小于其左右孩子节点的最大值，需要进行下移操作
        while (left_child_index.has_value() && left_child_index.value() < data_.size()) {
            std::ptrdiff_t max_child_index = left_child_index.value();
            if ((max_child_index + 1 < data_.size()) && (data_[max_child_index] < data_[max_child_index + 1])) {
                ++max_child_index;
            }
            if (data_[index] >= data_[max_child_index]) {
                break;
            }
            std::swap(data_[max_child_index], data_[index]);
            index = max_child_index;
            left_child_index = left_child(index);
        }
    }

    std::optional<std::ptrdiff_t> parent(std::ptrdiff_t index) const
    {
        if (index <= 0) {
            return std::nullopt;
        }

        return (index - 1) / 2;
    }

    std::optional<std::ptrdiff_t> left_child(std::ptrdiff_t index) const
    {
        if (index < 0) {
            return std::nullopt;
        }

        return 2 * index + 1;
    }

    std::optional<std::ptrdiff_t> right_child(std::ptrdiff_t index) const
    {
        if (index < 0) {
            return std::nullopt;
        }

        return 2 * index + 2;
    }
   

private:
    std::vector<T> data_;  // 存储堆内部数据
};

堆排序

在最大堆中，数组的首个元素为整个数组中的最大值。根据这个性质，不断将最大值交换到数组尾部，然后将除开尾部已排序的数据进行堆化，继续将最大值交换到尾部，可以实现堆排序。

堆排序是原地排序，时间复杂度为O(nlogn)，空间复杂度为O(1)。代码如下：

template<typename T>
class heapSort {
public:
    void sort(std::vector<T>& data)
    {
        // 如果元素个数不超过1，那么无需处理
        if (data.size() <= 1u) {
            return;
        }

        heapify(data); // 将数据进行堆化处理
        for (int i = static_cast<int>(data.size()) - 1; i > 0; --i) {
            std::swap(data[i], data[0]);  // 交换最大值到数组尾部  
            sift_down(data, 0, i);  // data[0, i)区间进行堆化
        }
    }

private:
    void heapify(std::vector<T>& data)
    {
        for (int i = (static_cast<int>(data.size()) - 2) / 2; i >= 0; --i) {
            sift_down(data, i, static_cast<int>(data.size()));
        }
    }

    void sift_down(std::vector<T>& data, int index, int size)
    {
        int left_child = 2 * index + 1;
        while (left_child < size) {
            int max_child = left_child;
            if ((left_child + 1 < size) && (data[left_child] < data[left_child + 1])) {
                ++max_child;
            }
            if (data[index] >= data[max_child]) {
                break;
            }
            std::swap(data[index], data[max_child]);
            index = max_child;
            left_child = 2 * index + 1;
        }
    }
};

优先队列

堆排序应用不是很多，堆应用最多的场景是优先队列。

例如在规模很大的数据集合中，如果要找到前K个最大的元素，那么无需对所有元素进行排序（某些场景下内存不足无法支持排序），此时只需要一个能存储K个元素的优先队列即可解决问题。

m0_55372289

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
小鑫的算法之路：堆

背景队列的特点是先进先出（FIFO），在日常生活中很常见，比如说食堂中就餐的队列，基本都是谁先排队，谁就先打餐吃饭。然而，还有其他一些场景在生活中也很常见，比如说在医院排队求诊的病人，如果来了一位严重的病人，那么就需要急诊，否者如果按照普通队列排队等待。如果其他病人求诊完，那就会危及这位病人的生命，因此这位病人需要立即就医，优先级很高。就这引出了优先队列，优先队列的特点不是先进先出，而是基于优先级出队。对于优先队列，在数据结构上不能依赖普通队列进行实现，而需要基于堆实现，才能高效实现入队和基于优先级出
复制链接

扫一扫