Algorithms - LeetCode - Top K Frequent Words


迷路剑客

Category: Algorithms

Article link: https://blog.csdn.net/baichoufei90/article/details/104554015

1 Overview

1.1 Source

https://leetcode-cn.com/problems/top-k-frequent-words/

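1.2 Problem description

Given a non-empty list of words, return the k most frequent words.

The answer should be sorted by frequency from highest to lowest. If different words have the same frequency, sort them in alphabetical order.

Example 1:
Input: ["i", "love", "leetcode", "i", "love", "coding"], k = 2
Output: ["i", "love"]
Explanation: "i" and "love" are the two most frequent words, each appearing 2 times.
Note that "i" comes before "love" in alphabetical order.
Example 2:
Input: ["the", "day", "is", "sunny", "the", "the", "the", "sunny", "is", "is"], k = 4
Output: ["the", "is", "sunny", "day"]
Explanation: "the", "is", "sunny" and "day" are the four most frequent words,
appearing 4, 3, 2 and 1 times respectively.
Notes:
You may assume k is always valid, 1 ≤ k ≤ number of unique elements.
Input words contain only lowercase letters.
Follow-up:
Try to solve it in O(n log k) time and O(n) extra space.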
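
To make the ordering rule concrete, here is a minimal sort-based reference. It runs in O(n log n), so it does not meet the follow-up target and is not the approach this article takes; it is only a sketch for checking expected outputs. The class name TopKReference and the method name topK are hypothetical and do not come from the problem statement.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the name is an assumption made for this sketch.
class TopKReference {
    // Sort-based reference, O(n log n): illustrates the ordering rule only.
    static List<String> topK(String[] words, int k) {
        // Count how often each word occurs.
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }
        List<String> unique = new ArrayList<>(counts.keySet());
        // Higher frequency first; equal frequency falls back to alphabetical order.
        unique.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        // Copy the first k words so the result does not depend on the sorted list.
        return new ArrayList<>(unique.subList(0, k));
    }
}
```

Calling TopKReference.topK on the inputs of Example 1 and Example 2 yields the outputs listed in those examples.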

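2 Solution

2.1 Approach

Count word frequencies with a Map, then build two Lists that hold the frequencies and the corresponding words in the same order;
Build a min-heap of size k ordered by frequency, updating both Lists in sync whenever the heap is adjusted, so that the heap top is always the lowest-ranked of the k words kept so far;
When comparing two words whose frequencies are equal, also compare the words alphabetically;
Finally, heap-sort the k retained elements and output the sorted heap.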
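
The steps above can be sketched compactly with plain arrays in place of the two Lists. This is a simplified illustration and not the article's code: the class HeapSketch and the methods weaker, siftDown, swap and topK are hypothetical names, and the tie-break uses String.compareTo, which matches alphabetical order for lowercase input.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; all names here are assumptions made for illustration.
class HeapSketch {
    // True when (countA, wordA) ranks below (countB, wordB):
    // lower frequency, or equal frequency and alphabetically later.
    static boolean weaker(int countA, String wordA, int countB, String wordB) {
        if (countA != countB) {
            return countA < countB;
        }
        return wordA.compareTo(wordB) > 0;
    }

    // Swap positions i and j in both parallel arrays.
    static void swap(String[] w, int[] c, int i, int j) {
        String tw = w[i]; w[i] = w[j]; w[j] = tw;
        int tc = c[i]; c[i] = c[j]; c[j] = tc;
    }

    // Sift the element at index start down within the heap occupying [0, size).
    static void siftDown(String[] w, int[] c, int start, int size) {
        int i = start;
        while (true) {
            int lowest = i;
            int left = 2 * i + 1;
            int right = 2 * i + 2;
            if (left < size && weaker(c[left], w[left], c[lowest], w[lowest])) lowest = left;
            if (right < size && weaker(c[right], w[right], c[lowest], w[lowest])) lowest = right;
            if (lowest == i) return;
            swap(w, c, i, lowest);
            i = lowest;
        }
    }

    static List<String> topK(String[] words, int k) {
        // Step 1: count frequencies and copy them into two parallel arrays.
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) counts.merge(word, 1, Integer::sum);
        int n = counts.size();
        String[] w = new String[n];
        int[] c = new int[n];
        int idx = 0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            w[idx] = e.getKey();
            c[idx] = e.getValue();
            idx++;
        }
        // Step 2: build a min-heap over the first k entries (weakest on top).
        for (int i = k / 2 - 1; i >= 0; i--) siftDown(w, c, i, k);
        // Step 3: any later entry that outranks the top replaces it.
        for (int i = k; i < n; i++) {
            if (weaker(c[0], w[0], c[i], w[i])) {
                c[0] = c[i];
                w[0] = w[i];
                siftDown(w, c, 0, k);
            }
        }
        // Step 4: heap sort; the weakest entry moves to the end each round,
        // leaving the k entries ordered from strongest to weakest.
        for (int end = k - 1; end > 0; end--) {
            swap(w, c, 0, end);
            siftDown(w, c, 0, end);
        }
        List<String> result = new ArrayList<>(k);
        for (int i = 0; i < k; i++) result.add(w[i]);
        return result;
    }
}
```

Keeping the weakest of the k candidates on top is what makes each replacement cost O(log k): a new word only needs to be compared against the heap top to decide whether it belongs among the k kept.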

2.2 Code

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Solution {
    public List<String> topKFrequent(String[] words, int k) {
        // Map from each word to its frequency
        Map<String,Integer> countMap = new HashMap<>(words.length);

        // Count word frequencies
        for(String word : words){
            countMap.merge(word, 1, Integer::sum);
        }

        // Frequencies of the distinct words
        List<Integer> countNums = new ArrayList<>(countMap.size());

        // Distinct words, in the same order as the frequency list
        List<String> words2 = new ArrayList<>(countMap.size());

        // Walk the frequency Map to fill the frequency and word lists
        for(Map.Entry<String,Integer> entry : countMap.entrySet()){
            words2.add(entry.getKey());
            countNums.add(entry.getValue());
        }
        
        // Build a min-heap of size k from the first k elements by sifting down from the middle to the root
        int h = (k-1)/2;
        for(int l = h; l >=0; l--){
            adjustMinHeap(countNums,l,k,words2);
        }

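        // Starting from the (k+1)-th element, compare each remaining element with the heap top in turn.
        // If it ranks above the heap top, overwrite the top with it and sift down from the top.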
        for(int l = k;l<countNums.size();l++){
            if(countNums.get(l) > countNums.get(0) || 
                    (countNums.get(l).equals(countNums.get(0)) && leftGreater(words2.get(l),words2.get(0)))){
                countNums.set(0,countNums.get(l));
                words2.set(0,words2.get(l));
                adjustMinHeap(countNums,0,k,words2);
            }
        }

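        // Heap sort: move the heap top (the lowest-ranked element) to the end of the heap, exclude the sorted tail, then sift down from the top again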
        for(int l = k-1;l>=0;l--){
            int tmp = countNums.get(0);
            String tmpS = words2.get(0);
            countNums.set(0, countNums.get(l));
            words2.set(0, words2.get(l));
            countNums.set(l, tmp);
            words2.set(l, tmpS);
            adjustMinHeap(countNums,0,l,words2);
        }
        return words2.subList(0,k);
    }
    
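    // Returns true if left comes before right alphabetically, i.e. left ranks higher when the frequencies are equal
    public boolean leftGreater(String left,String right){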
        int length = Math.min(left.length(),right.length());
        for(int i = 0; i < length; i++){
            char l = left.charAt(i);
            char r = right.charAt(i);
            if(l < r){
                return true;
            }else if (l > r){
                return false;
            }
        }
        return left.length() < right.length();
    }
    // Sift down from the given start position within the first `length` elements
    public void adjustMinHeap(List<Integer> nums, int start, int length, List<String> words2){
        int tmp = nums.get(start);
        String tmpS = words2.get(start);
        for(int j = start*2+1;j<length;j=j*2+1){
            if(j+1<length){
                if(nums.get(j) > nums.get(j+1)){
                    j = j+1;
                }else if(nums.get(j).equals(nums.get(j+1))){ // compare Integer values, not references
                    j = leftGreater(words2.get(j),words2.get(j+1)) ? j+1 : j;
                }
            }
            if(tmp > nums.get(j)){
                nums.set(start,nums.get(j));
                words2.set(start,words2.get(j));
                start = j;
            }else if(tmp == nums.get(j)){
                if(leftGreater(tmpS,words2.get(j))){
                    nums.set(start,nums.get(j));
                    words2.set(start,words2.get(j));
                    start = j;
                }else{
                    break;
                }
            }else{
                break;
            }
        }
        nums.set(start,tmp);
        words2.set(start,tmpS);
    }
}

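2.3 Time complexity

O(n log k) overall

Initial word-frequency counting: O(n)
Building the two Lists: O(n)
Up to n heap adjustments: O(n log k)
Heap sort of the k retained elements: O(k log k)

2.4 Space complexity

O(n) overall

Frequency-count Map: O(n)
Lists holding the frequencies and words: O(n)
Heap: O(k)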

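2.5 Notes

Do not rely on methods or classes such as Arrays.sort(arr), Arrays.copyOf(arr, k), or PriorityQueue:
the interviewer is unlikely to be satisfied, and you may also forget the method names during an interview.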
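
For contrast, the library-based version that the note above advises against in an interview might look roughly like the following. It is a sketch for cross-checking the hand-written heap, not a recommendation; the class name TopKWithPriorityQueue and the method name topK are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical class name; shown only to contrast with the manual heap.
class TopKWithPriorityQueue {
    static List<String> topK(String[] words, int k) {
        // Count how often each word occurs.
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }
        // Min-heap: the lowest-ranked candidate (lowest count, then
        // alphabetically later) sits at the head of the queue.
        PriorityQueue<String> heap = new PriorityQueue<>((a, b) -> {
            int byCount = Integer.compare(counts.get(a), counts.get(b));
            return byCount != 0 ? byCount : b.compareTo(a);
        });
        for (String w : counts.keySet()) {
            heap.offer(w);
            // Keep at most k candidates by evicting the lowest-ranked one.
            if (heap.size() > k) {
                heap.poll();
            }
        }
        // Polling yields lowest-ranked first, so reverse at the end.
        List<String> result = new ArrayList<>(k);
        while (!heap.isEmpty()) {
            result.add(heap.poll());
        }
        Collections.reverse(result);
        return result;
    }
}
```

Running both this version and the manual heap on the same input is a simple way to catch tie-breaking mistakes in the hand-written comparison logic.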
