692. Top K Frequent Words


Given a non-empty list of words, return the k most frequent elements.

Your answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.

Example 1:

Input: ["i", "love", "leetcode", "i", "love", "coding"], k = 2
Output: ["i", "love"]

Explanation: "i" and "love" are the two most frequent words.
    Note that "i" comes before "love" due to a lower alphabetical order.

Example 2:

Input: ["the", "day", "is", "sunny", "the", "the", "the", "sunny", "is", "is"], k = 4
Output: ["the", "is", "sunny", "day"]

Explanation: "the", "is", "sunny" and "day" are the four most frequent words,
    with the number of occurrence being 4, 3, 2 and 1 respectively.

Note:
You may assume k is always valid, 1 ≤ k ≤ number of unique elements.
Input words contain only lowercase letters.

Follow up:
Try to solve it in O(n log k) time and O(n) extra space.

方法1: priority_queue

思路:

如果按照常规的HashMap + PriorityQueue 需要O(nlogn),不符合题目要求。优化成k个元素的PriorityQueue, 在放入O(n)个元素的过程中不断更替掉低频词而达到O(nlogk)

在这里插入图片描述

易错点:

  1. 要会写c++中的comparator。注意上文中的最后一段。因此priority_queue默认的comparator是std::less,从小到大排序但是先pop最大值。模仿这个写法,就应该是1. a.freq < b.freq, 2. a.word > b.word. 这样会优先词频,并在词频相等时先输出字母序靠前的单词。
  2. 但是这道题由于我们想在推堆的过程住不断替换掉最小值,所以应该用的是完全相反来实现让词频小且字母序靠后的在最前pop。也就是comparator定义为 1. a.freq > b.freq, 2. a.word < b.word。
  3. cmp的function定义之后需要有;
  4. priority的定义语法
  5. 最后还需要reverse一次
class Solution {
public:
    vector<string> topKFrequent(vector<string>& words, int k) {
        vector<string> result;
        unordered_map<string, int> freqMap;
        for (string word: words) {
            ++freqMap[word];
        }
                
        // 这里first是word, second是词频
        // true的结果是a排在前面
        auto cmp = [](pair<string, int> & a, pair<string, int> & b){
            if (a.second == b.second){
                return a.first < b.first;
            }
            return a.second > b.second;
            
        };
        
        priority_queue<pair<string, int>, vector<pair<string, int>>, decltype(cmp)> freqHeap(cmp);
        for (auto item: freqMap) {
            freqHeap.push(item);
            if (freqHeap.size() > k){
                freqHeap.pop();
            }
        }
        
        for (int i = 0; i < k; i++){
            result.push_back(freqHeap.top().first);
            freqHeap.pop();
        }
        //先pop出来的小词频现在在前面,需要reverse
        std::reverse(result.begin(), result.end());
        return result;
    }
};

方法2: bucket-sort

用map<int, set<string>>的结构来建立bucket,{freq: {word1, word2…}}。如果使用ordered_map, which is default type of map (in contrast to unordered_map), 就可以直接从map中取最大的词直至取满k个。map中的相同词频的用set<string>来储存,就可以自动排字母序,每次从后面开始取。

易错点:

  1. 从bucket中取,和从bucket中的second,也就是vector<\string> 当中取都应该是reverse。用到rbegin(), rend()
  2. while (!tmp2.empty() && k > 0 ) 这里内部也需要限制k。可以有一种整体判断的方式避免遗忘这个条件:每次先计算当前取出bucket的大小,如果累计不超过k全体推入,如果超过有限制的筛选(grandyang)

在这里插入图片描述


class Solution {
public:
    vector<string> topKFrequent(vector<string>& words, int k) {
        vector<string> result;
        unordered_map<string, int> freqMap;
        map<int, set<string>> bucket;
        for (string word: words) {
            ++freqMap[word];
        }
        
        for (auto item: freqMap){
            bucket[item.second].insert(item.first);
        }
        
        for (auto it = bucket.rbegin(); it != bucket.rend(); it++){
            if (k <= 0 ) break;
            set<string> tmp = it->second;
            vector<string> tmp2(tmp.rbegin(), tmp.rend());
            while (!tmp2.empty() && k > 0 ){
                string a = tmp2.back() ;
                tmp2.pop_back();
                result.push_back(a);
                k--;
            }
        }
        return result;
    }
};

grandyang

class Solution {
public:
    vector<string> topKFrequent(vector<string>& words, int k) {
        vector<string> res;
        unordered_map<string, int> freq;
        vector<set<string>> v(words.size() + 1, set<string>());
        for (string word : words) ++freq[word];
        for (auto a : freq) {
            v[a.second].insert(a.first);
        }
        for (int i = v.size() - 1; i >= 0; --i) {
            if (k <= 0) break;
            vector<string> t(v[i].begin(), v[i].end());
            if (k >= t.size()) {
                res.insert(res.end(), t.begin(), t.end());
            } else {
                res.insert(res.end(), t.begin(), t.begin() + k);
            }
            k -= t.size();
        }
        return res;
    }
};
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值