LeetCode 347: Top K Frequent Elements (Solution Notes)

computer初学者

Published 2023-03-30 21:29:32

Tags: leetcode, algorithms, data structures

Article link: https://blog.csdn.net/weixin_52869970/article/details/129669815

LeetCode 347: Top K Frequent Elements

Problem Description

Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.

Example 1:

Input: nums = [1,1,1,2,2,3], k = 2
Output: [1,2]

Example 2:

Input: nums = [1], k = 1
Output: [1]

Constraints:

1 <= nums.length <= 10^5
-10^4 <= nums[i] <= 10^4
k is in the range [1, the number of unique elements in the array].
It is guaranteed that the answer is unique.

Follow up: Your algorithm’s time complexity must be better than O(n log n), where n is the array’s size.

Source: LeetCode
Link: https://leetcode.cn/problems/top-k-frequent-elements

Analysis

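     The overall idea is as follows.
     First build a table of pairs <value, freq>, where value is a number from nums and freq is how many times it occurs.
     Traverse the array to fill in this table.
     The problem then reduces to taking the value of the top K tuples of this table, ranked by freq.
     There are several ways to solve this reduced problem.
     The first idea that comes to mind is to sort the table (ocurrence in the code below) by freq.
     Comparison-based sorts all take at least O(N log N) time; bucket sort is the exception, because it trades space for time.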

        unordered_map<int ,int> ocurrence;
        for(auto &n : nums){
            ocurrence[n]++;
        }
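
As a baseline for the "sort by freq" idea above, here is a minimal sketch that copies the table into a vector and sorts it. It is not one of the three solutions in this post. The function name topKBySorting is made up for illustration, and sorting M distinct values costs O(M log M), so in the worst case this does not meet the follow-up requirement.

```cpp
#include <algorithm>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper (name chosen for this sketch): sorts the
// <value, freq> table by freq in descending order and keeps the first k.
std::vector<int> topKBySorting(const std::vector<int>& nums, int k) {
    std::unordered_map<int, int> ocurrence;  // value -> freq
    for (int n : nums) {
        ocurrence[n]++;
    }
    // Copy the table into a vector so that it can be sorted.
    std::vector<std::pair<int, int>> table(ocurrence.begin(), ocurrence.end());
    std::sort(table.begin(), table.end(),
              [](const std::pair<int, int>& a, const std::pair<int, int>& b) {
                  return a.second > b.second;  // larger freq first
              });
    std::vector<int> result;
    for (int i = 0; i < k && i < static_cast<int>(table.size()); i++) {
        result.push_back(table[i].first);
    }
    return result;
}
```

The three solutions below all replace the std::sort call with something cheaper.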

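Solution 1: Bucket Sort

Because the time complexity must be better than O(N log N), we give up the conventional sorting methods and use bucket sort, which trades space for time.
Note that 1 <= nums.length <= 10^5, so the number of times any element of nums can occur is bounded by nums.length.
We can therefore set up a series of containers (buckets), use the frequency freq as the bucket index, and put each element into the bucket that matches its frequency.
Then traverse the buckets in reverse order starting from the last one, collecting the elements of each non-empty bucket until K elements have been collected; those K elements are the answer.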

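 /* Method 1: bucket sort */
        // This method needs a small change to the counting code above: also track the maximum freq
         unordered_map<int ,int> ocurrence;
         int maxcnt = 0;
         for(auto &n : nums){
             ocurrence[n]++;
             maxcnt = max(maxcnt, ocurrence[n]);
         }
         vector<int> result; // the result array
         unordered_map<int, vector<int>> tong; // table of <freq, <value1, value2, ...>>: each freq maps to the numbers that occur that many times ("tong" is pinyin for "bucket")
         // a vector<vector<int>> would also work here
         for(auto &ocur: ocurrence){
             tong[ocur.second].push_back(ocur.first);
         }
         for(int i = maxcnt; i>0 ; i--){
             if(tong[i].size() != 0){
                 /* iterator insert(iterator it, const_iterator first, const_iterator last): inserts the elements in [first, last) of another vector of the same type before the element that it points to */
                 result.insert(result.end(), tong[i].begin(), tong[i].end());
             }
             // the answer is guaranteed to be unique, so the count reaches exactly k; >= is only defensive
             if((int)result.size() >= k){
                 break;
             }
         }
         return result;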
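
The comment in the code above notes that a vector<vector<int>> could serve as the buckets. A minimal self-contained sketch of that variant follows. The free-function form and the name topKByBuckets are assumptions made for illustration, and the buckets are sized by nums.size() rather than by a tracked maximum count.

```cpp
#include <unordered_map>
#include <vector>

// Hypothetical free-function version of the bucket approach, using
// vector<vector<int>> indexed by freq instead of an unordered_map.
std::vector<int> topKByBuckets(const std::vector<int>& nums, int k) {
    std::unordered_map<int, int> ocurrence;  // value -> freq
    for (int n : nums) {
        ocurrence[n]++;
    }
    // A value occurs at most nums.size() times, so indices 0..nums.size() suffice.
    std::vector<std::vector<int>> tong(nums.size() + 1);
    for (const auto& ocur : ocurrence) {
        tong[ocur.second].push_back(ocur.first);
    }
    std::vector<int> result;
    // Walk from the highest frequency down and collect until k values are found.
    for (int i = static_cast<int>(nums.size()); i > 0; i--) {
        for (int value : tong[i]) {
            result.push_back(value);
        }
        if (static_cast<int>(result.size()) >= k) {
            break;
        }
    }
    return result;
}
```

Counting and bucketing each take one pass, so the running time is linear in nums.size(), at the cost of one (possibly empty) bucket per possible frequency.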

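Solution 2: Heap

The keyword is "the K largest".
We can maintain a min-heap of capacity K, with the smallest element on top, to store the K largest elements seen so far.
Insert the entries of the ocurrence table into the heap one by one. While the heap is not full, insert directly. Once it is full, compare the freq of the heap top (top) with the freq of the entry being inserted (cur): if top >= cur, discard cur, because it is no larger than the smallest of these K values and so is not needed among the K largest; if top < cur, the heap top no longer belongs among the K largest, so pop it and insert cur.
In the end the heap holds the K entries of ocurrence with the largest freq. By the heap property, the K-th largest entry, which is the smallest one in this heap, sits at the top.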

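    // Comparison function: comparing freq with ">" makes priority_queue behave as a min-heap on freq
    static bool cmp(const pair<int, int>& m, const pair<int, int>& n) {
        return m.second > n.second;
    }
        /* Method 2: heap */
        // Keyword: the K largest elements
        // Build and maintain a min-heap of capacity K that holds the K largest elements seen so far:
        // if the heap holds fewer than K elements, push directly; once it holds K, compare the current element cur with the heap top min:
        // if cur > min, cur belongs among the top K and min leaves the heap; otherwise cur is discarded
        // How to build a min-heap in C++:
        /*
            priority_queue<Type, Container, Functional>;
            Type is the type of the stored elements
            Container is the underlying container; it must offer random access, e.g. vector or deque
            Functional is the comparison (priority) function
        */

    priority_queue<pair<int, int>, vector<pair<int, int>>, decltype(&cmp)> q(cmp);
    for (auto& [num, count] : ocurrence) {
        if (q.size() == k) {
            if (q.top().second < count) {
                q.pop();
                q.emplace(num, count);
            }
        } else {
            q.emplace(num, count);
        }
    }
    // Pop everything left in the heap; values come out in ascending order of freq
    vector<int> ret;
    while (!q.empty()) {
        ret.emplace_back(q.top().first);
        q.pop();
    }
    return ret;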
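
For comparison, here is a minimal self-contained sketch of a common variant of the same idea: store (freq, value) pairs so that std::greater gives a min-heap on freq without a custom comparison function, and push first, then pop whenever the heap grows past K. The name topKByHeap and the free-function form are assumptions made for illustration.

```cpp
#include <functional>
#include <queue>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical variant of the heap approach: entries are (freq, value),
// so std::greater compares freq first and the smallest freq sits on top.
std::vector<int> topKByHeap(const std::vector<int>& nums, int k) {
    std::unordered_map<int, int> ocurrence;  // value -> freq
    for (int n : nums) {
        ocurrence[n]++;
    }
    using Entry = std::pair<int, int>;  // (freq, value)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> q;
    for (const auto& ocur : ocurrence) {
        q.emplace(ocur.second, ocur.first);
        if (static_cast<int>(q.size()) > k) {
            q.pop();  // evict the entry with the smallest freq
        }
    }
    std::vector<int> ret;
    while (!q.empty()) {
        ret.push_back(q.top().second);  // values come out in ascending freq
        q.pop();
    }
    return ret;
}
```

The heap never holds more than K + 1 entries, so with M distinct values the heap work is O(M log K).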

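Solution 3: Quickselect

In fact, the problem has already been reduced to finding the K largest elements of an array. So we can use a quickselect algorithm like the one used earlier for finding the K-th element of an array. Its average time complexity is O(N), which meets the requirement; the worst case is O(N^2), which the random pivot makes unlikely.
The main task is to write a qsort function that adds the K most frequent elements to the ret array.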

void qsort(vector<pair<int, int>>& array, vector<int>& ret, int left, int right, int k){
	/* Function: append the k entries of array[left..right] with the largest freq to ret.
	   Assumes 1 <= k <= right - left + 1. */
	int rd = rand() % (right - left + 1) + left; // pick a random position in [left, right]; rand() is declared in <cstdlib>
	int pivot = array[rd].second; // use this entry's freq as the pivot that partitions the array
	swap(array[rd], array[left]); // for convenience, move the pivot entry to the far left first
	int pivot_index = left; // position where the pivot will end up
	for(int scanner=left+1; scanner<=right; scanner++){
		if(array[scanner].second > pivot){
			swap(array[scanner], array[pivot_index + 1]);
			pivot_index++;
		}
	}
	swap(array[left], array[pivot_index]);
	/* Now every entry in array[left..pivot_index-1] has a freq greater than the pivot, and every entry in array[pivot_index+1..right] has a freq less than or equal to it. */
	/* [left ... pivot_index-1, pivot_index, pivot_index+1 ... right] */
	if(pivot_index - left >= k){
		// the left part alone holds at least k entries, so the answer lies inside it
		qsort(array, ret, left, pivot_index-1, k);
	}
	else{
		// the whole left part and the pivot belong to the answer
		for(int i=left; i<=pivot_index; i++){
			ret.emplace_back(array[i].first);
		}
		if(pivot_index - left + 1 < k){
			// still short of k: take the remaining entries from the right part
			qsort(array, ret, pivot_index+1, right, k - (pivot_index - left + 1));
		}
	}
}
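
The post does not show how the selection routine is called, so the following self-contained sketch shows one possible way to wire it up. The names quickSelect and topKByQuickSelect are made up for illustration; the helper follows the same partition scheme as the qsort above and assumes 1 <= k <= number of distinct values.

```cpp
#include <cstdlib>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: appends to ret the k entries of array[left..right]
// with the largest freq. Assumes 1 <= k <= right - left + 1.
void quickSelect(std::vector<std::pair<int, int>>& array, std::vector<int>& ret,
                 int left, int right, int k) {
    int rd = left + std::rand() % (right - left + 1);  // random pivot position
    std::swap(array[rd], array[left]);
    int pivot = array[left].second;
    int pivot_index = left;
    for (int scanner = left + 1; scanner <= right; scanner++) {
        if (array[scanner].second > pivot) {
            std::swap(array[scanner], array[pivot_index + 1]);
            pivot_index++;
        }
    }
    std::swap(array[left], array[pivot_index]);
    int bigger = pivot_index - left;  // entries with freq strictly above the pivot
    if (bigger >= k) {
        quickSelect(array, ret, left, pivot_index - 1, k);
    } else {
        for (int i = left; i <= pivot_index; i++) {
            ret.push_back(array[i].first);
        }
        if (bigger + 1 < k) {
            quickSelect(array, ret, pivot_index + 1, right, k - (bigger + 1));
        }
    }
}

// Hypothetical driver: builds the <value, freq> table and runs the selection.
std::vector<int> topKByQuickSelect(const std::vector<int>& nums, int k) {
    std::unordered_map<int, int> ocurrence;  // value -> freq
    for (int n : nums) {
        ocurrence[n]++;
    }
    std::vector<std::pair<int, int>> table(ocurrence.begin(), ocurrence.end());
    std::vector<int> ret;
    quickSelect(table, ret, 0, static_cast<int>(table.size()) - 1, k);
    return ret;
}
```

Because the pivot is random, the order of the returned values can differ between runs; the problem accepts any order.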