说一道排序题

最新推荐文章于 2020-11-21 01:33:53 发布

Python爬虫与算法进阶

最新推荐文章于 2020-11-21 01:33:53 发布

阅读量190

点赞数 1

本文链接：https://blog.csdn.net/m0_37148637/article/details/103461343

版权

很经典的排序问题 (๑• . •๑)

先看题目，“前K个高频元素”

给定一个非空的整数数组，返回其中出现频率前 k 高的元素。

示例 1:

输入: nums = [1,1,1,2,2,3], k = 2
输出: [1,2]

示例 2:

输入: nums = [1], k = 1
输出: [1]

简单解法

这题很简单，两步：

用字典保存数字及其出现的对应频率
排序

那么第一步就不用说了，很简单

m = dict()
for num in nums:
    if num in m:
        m[num] += 1
    else:
        m[num] = 1

有趣的就在第二步，排序。很多人都是使用内置库sorted

class Solution(object):
    def topKFrequent(self, nums, k):
        """
        :type nums: List[int]
        :type k: int
        :rtype: List[int]
        """
        m = dict()
        for num in nums:
            if num in m:
                m[num] += 1
            else:
                m[num] = 1

        output = sorted(m.items(), key=lambda e: e[1], reverse=True)

        final = []
        for i in range(k):
            final.append(output[i][0])
        return final

最简洁的代码是直接使用Python内置的collections

class Solution(object):
    def topKFrequent(self, nums, k):
        """
        :type nums: List[int]
        :type k: int
        :rtype: List[int]
        """
        import collections
        counter = collections.Counter(nums)
        return [item[0] for item in counter.most_common(k)]

这真是是我们想要的吗？并不是。我们的目标是不使用任何内置库。

sorted原理

关于Python的sorted排序算法，这篇文章讲的比较详细：python sort函数内部实现原理，说到Python使用的是著名的Timesort算法。

Timsort是结合了合并排序（merge sort）和插入排序（insertion sort）而得出的排序算法，它在现实中有很好的效率。

Tim Peters在2002年设计了该算法并在Python中使用（TimSort 是 Python 中 list.sort 的默认实现）。该算法找到数据中已经排好序的块-分区，每一个分区叫一个run，然后按规则合并这些run。Pyhton自从2.3版以来一直采用Timsort算法排序，现在Java SE7和Android也采用Timsort算法对数组排序。

如果想自己用Python来写一个排序算法，完成本题要求该如何写？也就是对这个字典进行排序，{5: 1, 1: 3, 4: 1, 2: 2, 3: 1}，有什么好办法。

思路可以是两个指针遍历字典，如果左边大于右边，则替换位置。

更好的方法

class Solution(object):
    def topKFrequent(self, nums, k):
        """
        :type nums: List[int]
        :type k: int
        :rtype: List[int]
        """
        m = dict()
        for num in nums:
            if num in m:
                m[num] += 1
            else:
                m[num] = 1
        bucket = [[] for _ in range(len(nums) + 1)]

        for key, value in m.items():
            bucket[value].append(key)

        result = []
        for i in range(len(nums), 0, -1):
            for n in bucket[i]:
                result.append(n)
                if len(result) == k:
                    return result
        return result