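Interview Question 16.02. Words Frequency (C++)
Design a method that finds how often any given word appears in a book.
Your implementation should support the following operations:
WordsFrequency(book): constructor; the parameter is an array of strings that makes up the book
get(word): returns how many times the given word appears in the book
Example: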
WordsFrequency wordsFrequency = new WordsFrequency({"i", "have", "an", "apple", "he", "have", "a", "pen"});
wordsFrequency.get("you"); // returns 0, "you" never appears
wordsFrequency.get("have"); // returns 2, "have" appears twice
wordsFrequency.get("an"); // returns 1
wordsFrequency.get("apple"); // returns 1
wordsFrequency.get("pen"); // returns 1
At first I thought: isn't this easy? Just loop over the vector and count the matches, the brute-force way.
Brute force (failed)
#include <string>
#include <vector>
using namespace std;

class WordsFrequency {
public:
    vector<string> mybook;  // copy of the whole book

    WordsFrequency(vector<string>& book) {
        mybook = book;
    }

    // Scans every word on each query, so one call costs O(n) comparisons.
    int get(string word) {
        int times = 0;
        for (const string& w : mybook) {
            if (w == word) times++;
        }
        return times;
    }
};
/**
* Your WordsFrequency object will be instantiated and called as such:
* WordsFrequency* obj = new WordsFrequency(book);
* int param_1 = obj->get(word);
*/
Then, once the test cases piled up, it exceeded the time limit...
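To make the cost of the scan concrete, here is a minimal sketch that counts the string comparisons it performs. The helper name countByScan and the comparisons counter are my own illustration, not part of the submitted solution.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: counts occurrences of `word` by scanning the whole
// book, and adds the number of string comparisons to `comparisons`.
int countByScan(const std::vector<std::string>& book, const std::string& word,
                std::size_t& comparisons) {
    int times = 0;
    for (const std::string& w : book) {
        ++comparisons;          // one comparison per word, on every query
        if (w == word) ++times;
    }
    return times;
}
```

With n words in the book and q calls to get, the scan performs n * q comparisons in total, which is why it slows down as both the book and the number of queries grow.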
That reminded me that lookups really call for a map, which led to the second version.
map implementation
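#include <map>
#include <string>
#include <vector>
using namespace std;

class WordsFrequency {
public:
    // Named freq rather than map: a member called map changes the meaning of
    // the type name map inside the class, which some compilers (e.g. GCC) reject.
    map<string, int> freq;

    WordsFrequency(vector<string>& book) {
        for (const string& str : book)  // const reference avoids copying each word
            freq[str]++;
    }

    int get(string word) {
        return freq[word];  // operator[] inserts a zero entry for unseen words
    }
};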
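unordered_map implementation
In the comments I noticed that most people were not using map but unordered_map, a name that was new to me. After some reading I learned that the two offer much the same interface but differ underneath: map is typically implemented as a red-black tree, while unordered_map is implemented as a hash table. Since this problem does not need the keys kept in order, the hash table should in theory be faster for lookups: O(1) on average, versus O(log n) for the tree. (As for why not hash_map: both use a hash table internally. The difference is that unordered_map was added to the standard library in C++11, while hash_map never was and remains a non-standard extension, so unordered_map is the better choice.)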
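#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class WordsFrequency {
public:
    unordered_map<string, int> umap;  // word -> number of occurrences

    WordsFrequency(vector<string>& book) {
        for (const string& str : book)  // const reference avoids copying each word
            umap[str]++;
    }

    int get(string word) {
        return umap[word];  // operator[] inserts a zero entry for unseen words
    }
};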
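This was clearly faster, but memory usage went up as well. The reason comes down to red-black tree versus hash table: overall, unordered_map generally uses more memory, since a hash table typically keeps a bucket array in addition to the per-element nodes.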
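One smaller effect on memory is worth knowing: operator[] on map and unordered_map inserts a zero-valued entry when the key is missing, so every query for an unseen word makes the table a little larger. Below is a minimal sketch of a lookup based on find, which leaves the table untouched. The class name WordsFrequencyFind and the helper distinctWords are hypothetical and only here for illustration.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical variant of the solution: same counting, but get() uses find()
// so that queries for unseen words do not insert new entries.
class WordsFrequencyFind {
public:
    explicit WordsFrequencyFind(const std::vector<std::string>& book) {
        for (const std::string& w : book) ++counts_[w];
    }

    int get(const std::string& word) const {
        auto it = counts_.find(word);
        return it == counts_.end() ? 0 : it->second;
    }

    // Number of distinct words stored (illustration only).
    std::size_t distinctWords() const { return counts_.size(); }

private:
    std::unordered_map<std::string, int> counts_;
};
```

Because get is const here, the compiler also guarantees that a query cannot modify the table. Whether this makes a measurable difference on the judge depends on how many queries miss, which I have not measured.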
Trie implementation
Adapted from a trie-based solution.
#include <string>
#include <vector>
using namespace std;

// Trie node
struct trie {
    int n;          // number of words that end at this node
    trie* son[26];  // one child per lowercase letter
    trie() : n(0) {
        for (int i = 0; i < 26; ++i) {
            son[i] = nullptr;
        }
    }
};

class WordsFrequency {
private:
    trie* root;
public:
    WordsFrequency(vector<string>& book) {
        root = new trie();
        trie* tmp = root;
        for (auto& i : book) {
            tmp = root;
            for (auto& ch : i) {
                int next = ch - 'a';  // assumes ch is in 'a'..'z'
                if (!tmp->son[next]) {
                    tmp->son[next] = new trie();
                }
                tmp = tmp->son[next];
            }
            // End of the word reached: increment the count stored at this node
            ++tmp->n;
        }
    }

    int get(string word) {
        trie* find = root;
        for (auto& i : word) {
            int next = i - 'a';  // assumes i is in 'a'..'z'
            if (find->son[next]) find = find->son[next];
            else return 0;       // path does not exist, so the word never appears
        }
        return find->n;
    }
};
Although this version does not score well on either runtime or memory, it is still a good idea to know.
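The trie above allocates its nodes with new and never frees them. As a possible variation, here is a minimal sketch that holds the children in std::unique_ptr so the whole tree is released when the object goes away. The names TrieNode and TrieCounter are hypothetical, and the sketch assumes, like the version above, that every word in the book consists only of the letters 'a' to 'z'.

```cpp
#include <array>
#include <memory>
#include <string>
#include <vector>

// Hypothetical node type: children are owned by unique_ptr, so destroying a
// node destroys its whole subtree.
struct TrieNode {
    int count = 0;  // number of words that end at this node
    std::array<std::unique_ptr<TrieNode>, 26> son;
};

// Hypothetical counter class. Assumes book words contain only 'a'..'z'.
class TrieCounter {
public:
    explicit TrieCounter(const std::vector<std::string>& book)
        : root_(new TrieNode()) {
        for (const std::string& word : book) {
            TrieNode* node = root_.get();
            for (char ch : word) {
                std::unique_ptr<TrieNode>& child = node->son[ch - 'a'];
                if (!child) child.reset(new TrieNode());
                node = child.get();
            }
            ++node->count;
        }
    }

    int get(const std::string& word) const {
        const TrieNode* node = root_.get();
        for (char ch : word) {
            if (ch < 'a' || ch > 'z') return 0;  // cannot be in the trie
            node = node->son[ch - 'a'].get();
            if (!node) return 0;                 // path does not exist
        }
        return node->count;
    }

private:
    std::unique_ptr<TrieNode> root_;
};
```

Note that a prefix of a stored word is not itself counted: with "have" in the book, get("ha") walks an existing path but ends at a node whose count is 0.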
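Additional notes
auto is a C++11 feature that lets the compiler deduce a variable's type from its initializer.
- for (auto i : v): i is a copy of each element, so modifying i does not change the elements of v
- for (auto &i : v): i is a reference to each element, so modifying i changes the elements of v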
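The difference between the two loop forms can be checked with a small sketch. The function names doubleByCopy and doubleByReference are made up for this illustration.

```cpp
#include <vector>

// Hypothetical helper: the loop variable is a copy, so v keeps its values.
std::vector<int> doubleByCopy(std::vector<int> v) {
    for (auto i : v) {
        i *= 2;  // changes only the local copy
    }
    return v;
}

// Hypothetical helper: the loop variable is a reference, so v is modified.
std::vector<int> doubleByReference(std::vector<int> v) {
    for (auto& i : v) {
        i *= 2;  // changes the element stored in v
    }
    return v;
}
```

Called with {1, 2, 3}, the first function returns {1, 2, 3} and the second returns {2, 4, 6}. When the loop only reads the elements, for (const auto& i : v) avoids the copy without allowing modification, which matters for element types such as string.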