[LeetCode] 819. Most Common Word
Problem Description
Given a paragraph and a list of banned words, return the most frequent word that is not in the list of banned words. It is guaranteed there is at least one word that isn’t banned, and that the answer is unique.
Words in the list of banned words are given in lowercase, and free of punctuation. Words in the paragraph are not case sensitive. The answer is in lowercase.
Example:
Input:
paragraph = "Bob hit a ball, the hit BALL flew far after it was hit."
banned = ["hit"]
Output: "ball"
Explanation:
"hit" occurs 3 times, but it is a banned word.
"ball" occurs twice (and no other word does), so it is the most frequent non-banned word in the paragraph.
Note that words in the paragraph are not case sensitive,
that punctuation is ignored (even if adjacent to words, such as "ball,"),
and that "hit" isn't the answer even though it occurs more because it is banned.
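In short: given a string that contains punctuation, find the word that occurs most often in the string and does not appear in the given banned list, ignoring case.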
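The case and punctuation rules above can be illustrated in isolation with a minimal sketch. The helper name `tokenize` is hypothetical, and the sketch assumes that every non-letter character acts as a word separator, which is slightly broader than the punctuation listed in the problem.

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the LeetCode interface): lowercases
// letters, turns every non-letter character into a space, then splits
// the result on whitespace.
std::vector<std::string> tokenize(std::string text) {
    for (char& c : text) {
        const unsigned char uc = static_cast<unsigned char>(c);
        c = std::isalpha(uc) ? static_cast<char>(std::tolower(uc)) : ' ';
    }
    std::vector<std::string> words;
    std::istringstream in(text);
    std::string word;
    // operator>> skips runs of whitespace, so no empty tokens appear.
    while (in >> word) {
        words.push_back(word);
    }
    return words;
}
```

Applied to the example paragraph, this should yield 13 lowercase words, with "ball," and "BALL" both normalized to "ball".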
C++ Implementation
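#include <algorithm>
#include <cctype>
#include <cstdint>
#include <sstream>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>
using namespace std;

class Solution {
public:
    string mostCommonWord(string paragraph, vector<string>& banned) {
        unordered_set<char> punctuations = {'!', '?', ',', '\'', ';', '.'};
        // Walk the string: replace punctuation with a space and lowercase
        // everything else. Replacing (rather than erasing) keeps adjacent
        // words apart in input such as "b,b,c".
        for (char& c : paragraph) {
            if (punctuations.count(c)) {
                c = ' ';
            } else {
                c = static_cast<char>(tolower(static_cast<unsigned char>(c)));
            }
        }
        unordered_set<string> banned_set(banned.begin(), banned.end());
        unordered_map<string, uint32_t> word_count_map;
        stringstream ss(paragraph);
        string word;
        // Split the string with stringstream; operator>> skips runs of
        // whitespace, so no empty words are counted.
        // If Boost is available, boost::split can be used instead.
        while (ss >> word) {
            if (!banned_set.count(word)) {
                word_count_map[word]++;
            }
        }
        // Use std::max_element to find the word with the highest count.
        // The first two arguments are the iterator range; the third is the
        // comparator (a lambda). The problem guarantees at least one
        // non-banned word, so the map is never empty here.
        return max_element(word_count_map.begin(), word_count_map.end(),
                           [](const auto& pair1, const auto& pair2) {
                               return pair1.second < pair2.second;
                           })->first;
    }
};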
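
As a possible variation on the solution above, the counting and the search for the maximum can be folded into one pass over the characters, without building a normalized copy of the paragraph. This is only a sketch under stated assumptions: the function name `mostCommonWordSinglePass` is hypothetical, and any non-letter character is treated as a separator.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical single-pass variant: builds each word character by
// character and tracks the best non-banned word while counting.
std::string mostCommonWordSinglePass(const std::string& paragraph,
                                     const std::vector<std::string>& banned) {
    const std::unordered_set<std::string> banned_set(banned.begin(), banned.end());
    std::unordered_map<std::string, int> counts;
    std::string best;
    std::string word;
    int best_count = 0;
    // Iterate one position past the end so the final word is flushed too.
    for (std::size_t i = 0; i <= paragraph.size(); ++i) {
        const char ch = i < paragraph.size() ? paragraph[i] : ' ';
        const unsigned char uc = static_cast<unsigned char>(ch);
        if (std::isalpha(uc)) {
            word.push_back(static_cast<char>(std::tolower(uc)));
            continue;
        }
        // A non-letter ends the current word, if there is one.
        if (!word.empty()) {
            if (!banned_set.count(word)) {
                const int n = ++counts[word];
                if (n > best_count) {
                    best_count = n;
                    best = word;
                }
            }
            word.clear();
        }
    }
    return best;
}
```

Called with the example paragraph and `banned = {"hit"}`, this sketch should return "ball". Because the answer is guaranteed to be unique, updating only on a strictly larger count is sufficient.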