Leetcode 347: Top K Frequent Elements (The Most Detailed Solution!!!)

Latest recommended article published on 2024-08-13 09:00:00

coordinate_blog

Article link: https://blog.csdn.net/qq_17550379/article/details/80957793

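Given a non-empty array of integers, return the k most frequent elements.

For example,

given the array [1,1,1,2,2,3] and k = 2, return [1,2].

Note:

You may assume k is always valid, 1 ≤ k ≤ number of distinct elements in the array.
Your algorithm's time complexity must be better than O(n log n), where n is the size of the array.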

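Approach

The idea behind this problem is simple: first build a bucket array, then iterate over the input array and drop each value into its bucket.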

1 : 1 1 1
2 : 2 2
3 : 3

Here I implement the bucket array with a dict that maps each element to the number of times it appears.

class Solution:
    def topKFrequent(self, nums, k):
        """
        :type nums: List[int]
        :type k: int
        :rtype: List[int]
        """
        count_list = dict()
        result = list()
        for i in nums:
            count_list[i] = count_list.get(i, 0) + 1
        t = sorted(count_list.items(), key=lambda l: l[1], reverse=True)
        for i in range(k):
            result.append(t[i][0])

        return result

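However, this approach runs in O(n log n) time, which does not meet the problem's requirement. What else can we do? We can use a priority queue to hold the k elements: iterate over the counted elements, compare their frequencies, and enqueue or dequeue accordingly.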

class Solution:
    def topKFrequent(self, nums, k):
        """
        :type nums: List[int]
        :type k: int
        :rtype: List[int]
        """   
        from queue import PriorityQueue
        count_list = dict()
        for i in nums:
            count_list[i] = count_list.get(i, 0) + 1

        p = PriorityQueue()
        for i in count_list.items():
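            if p.qsize() == k:  # check whether the priority queue already holds k items
                if i[1] > p[0]:  # bug: PriorityQueue does not support indexing
                    p.get()
                    p.put((i[1], i[0]))  # stored as (frequency, element)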
            else:
                p.put((i[1], i[0]))

        result = list()
        while not p.empty():
            _, v = p.get()
            result.append(v)

        return result

As you can see, the code above has a problem: we cannot access the elements of a PriorityQueue by index. So we have to change approach, and we can use heapq instead.

class Solution:
    def topKFrequent(self, nums, k):
        """
        :type nums: List[int]
        :type k: int
        :rtype: List[int]
        """   
        import heapq
        count_list = dict()
        for i in nums:
            count_list[i] = count_list.get(i, 0) + 1

        p = list()
        for i in count_list.items():
            if len(p) == k:
                if i[1] > p[0][0]:
                    heapq.heappop(p)
                    heapq.heappush(p, (i[1], i[0]))
            else:
                heapq.heappush(p, (i[1], i[0]))

        return [i[1] for i in p]
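
As a side note, the PriorityQueue attempt can also be made to work if we never peek at the queue. The sketch below is only one possible workaround, not part of the original solution: it puts every candidate first and then calls get() whenever the queue grows beyond k, so the smallest frequency is the one that gets discarded. The function name top_k_frequent_pq is made up for this illustration.

```python
from queue import PriorityQueue


def top_k_frequent_pq(nums, k):
    """Size-k PriorityQueue variant that never indexes into the queue."""
    counts = {}
    for x in nums:
        counts[x] = counts.get(x, 0) + 1

    pq = PriorityQueue()
    for value, freq in counts.items():
        pq.put((freq, value))      # stored as (frequency, element)
        if pq.qsize() > k:
            pq.get()               # drop the entry with the smallest frequency

    result = []
    while not pq.empty():
        _, value = pq.get()
        result.append(value)
    return result                  # ordered from lowest to highest frequency


print(top_k_frequent_pq([1, 1, 1, 2, 2, 3], 2))  # [2, 1]
```

The queue never holds more than k + 1 entries, so each put/get pair costs O(log k), the same as the heapq version.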

With a heap of size k, the time complexity becomes O(n log k). And since we are already using heapq, there is an even more concise way to write it.

class Solution:
    def topKFrequent(self, nums, k):
        """
        :type nums: List[int]
        :type k: int
        :rtype: List[int]
        """   
        import heapq
        count_list = dict()
        for i in nums:
            count_list[i] = count_list.get(i, 0) + 1

        p = list()
        for i in count_list.items():
            heapq.heappush(p, (i[1], i[0]))

        return [i[1] for i in heapq.nlargest(k, p)]

However, this version has a higher time complexity than the previous one, because the heap we maintain now has size n instead of k.
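
If brevity is the goal, one option is to skip the explicit heap entirely. heapq.nlargest accepts any iterable plus a key function, so the counts do not have to be pushed onto a heap first. The following is a minimal sketch under that assumption; the function name top_k_frequent_nlargest is made up for this illustration, and to my understanding CPython's nlargest only keeps about k candidates at a time, which would keep the cost near O(n log k).

```python
import heapq


def top_k_frequent_nlargest(nums, k):
    counts = {}
    for x in nums:
        counts[x] = counts.get(x, 0) + 1
    # Iterating a dict yields its keys; key=counts.get ranks them by frequency.
    return heapq.nlargest(k, counts, key=counts.get)


print(top_k_frequent_nlargest([1, 1, 1, 2, 2, 3], 2))  # [1, 2]
```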

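The heap-based algorithm still has a weakness: when k equals n (taking n here as the number of distinct elements), its time complexity is the same as the first approach. Is there a better solution?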

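We can split the problem into two cases. When k is small relative to n, we keep using the algorithm above. When k is close to n (k can never exceed n), we need something better. It is actually quite simple once we are clear about what the priority queue holds: maintaining a priority queue of size n - k is enough.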

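The idea is to maintain a max-heap of size n - k: whenever a new element pushes the heap past that size, we pop the largest element into result. But Python's heapq, unlike C++, does not let us change the comparison function. So what can we do?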

class Solution:
    def topKFrequent(self, nums, k):
        """
        :type nums: List[int]
        :type k: int
        :rtype: List[int]
        """   
        import heapq
        count_list = dict()
        for i in nums:
            count_list[i] = count_list.get(i, 0) + 1
        
        p = list()
        result = list()
        for i in count_list.items():
            heapq.heappush(p, (-i[1], -i[0]))  # trick: negate both fields so the min-heap acts as a max-heap
            if len(p) > len(count_list) - k:
                _, val = heapq.heappop(p)
                result.append(-val)

        return result
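
Putting the two cases together, a combined version could pick whichever heap is smaller. This is only a sketch of the split described above: the function name top_k_frequent_split and the k <= n - k threshold are my own choices for illustration, and negating the values assumes the elements are integers, as in the problem statement.

```python
import heapq


def top_k_frequent_split(nums, k):
    counts = {}
    for x in nums:
        counts[x] = counts.get(x, 0) + 1
    n = len(counts)  # number of distinct elements

    if k <= n - k:
        # Small k: min-heap holding the k most frequent entries seen so far.
        heap = []
        for value, freq in counts.items():
            if len(heap) < k:
                heapq.heappush(heap, (freq, value))
            elif freq > heap[0][0]:
                heapq.heapreplace(heap, (freq, value))  # pop smallest, push new
        return [value for _, value in heap]

    # Large k: simulated max-heap holding the n - k least frequent entries.
    heap, result = [], []
    for value, freq in counts.items():
        heapq.heappush(heap, (-freq, -value))  # negate to get max-heap order
        if len(heap) > n - k:
            _, neg_value = heapq.heappop(heap)
            result.append(-neg_value)
    return result
```

Either branch keeps the heap at no more than min(k, n - k) + 1 entries, so when k == n every element is popped right after it is pushed and the heap stays trivially small.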

A more Pythonic solution

class Solution:
    def topKFrequent(self, nums, k):
        """
        :type nums: List[int]
        :type k: int
        :rtype: List[int]
        """
        from collections import Counter
        return [item[0] for item in Counter(nums).most_common(k)]
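
To see why the solution indexes item[0], it helps to look at what most_common returns: a list of (element, count) pairs sorted from the most frequent down. A quick check on the example from the problem statement:

```python
from collections import Counter

counts = Counter([1, 1, 1, 2, 2, 3])
pairs = counts.most_common(2)
print(pairs)                         # [(1, 3), (2, 2)]
print([item[0] for item in pairs])   # [1, 2]
```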