Problem:
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", Return: ["AAAAACCCCC", "CCCCCAAAAA"].
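Approach:
It is not clear to me why this problem is rated medium. It does not seem to need a medium-level algorithm: we simply define a hash table, scan the string, and look up each 10-letter substring in the table. This approach was accepted.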
Code:
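#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        vector<string> ret;
        // A string shorter than 10 characters cannot contain a 10-letter sequence.
        if (s.size() < 10) {
            return ret;
        }
        // Maps each 10-letter substring to the number of times it has been seen.
        unordered_map<string, int> hash;
        for (size_t i = 0; i + 10 <= s.size(); ++i) {
            string str = s.substr(i, 10);
            // Report a sequence only on its second occurrence, so it appears once in the result.
            if (++hash[str] == 2) {
                ret.push_back(str);
            }
        }
        return ret;
    }
};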
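
As a possible refinement (not part of the solution above), the hash table can be keyed on an integer instead of a string. Since there are only four nucleotides, each letter fits in 2 bits and a 10-letter window fits in 20 bits, so the key can be updated in constant time as the window slides. The sketch below assumes the input contains only A, C, G, and T; the names findRepeatedEncoded and encodeBase are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: maps a nucleotide to a 2-bit code.
// Assumption: the input contains only 'A', 'C', 'G', 'T'.
static std::uint32_t encodeBase(char c) {
    switch (c) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        default:  return 3;  // treated as 'T'
    }
}

// Hypothetical variant of findRepeatedDnaSequences using a rolling 20-bit key.
std::vector<std::string> findRepeatedEncoded(const std::string& s) {
    const std::size_t kLen = 10;
    const std::uint32_t kMask = (1u << (2 * kLen)) - 1;  // keeps the low 20 bits
    std::vector<std::string> result;
    if (s.size() < kLen) {
        return result;
    }

    std::unordered_map<std::uint32_t, int> seen;  // window key -> occurrence count
    std::uint32_t key = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Shift in the new letter and drop the one that left the window.
        key = ((key << 2) | encodeBase(s[i])) & kMask;
        if (i + 1 < kLen) {
            continue;  // the first window is not full yet
        }
        // Report a window only the second time its key is seen.
        if (++seen[key] == 2) {
            result.push_back(s.substr(i + 1 - kLen, kLen));
        }
    }
    return result;
}
```

On the example string from the problem statement, this sketch should return the same two sequences in the same order as the string-keyed version. The trade-off is less readable code in exchange for avoiding a string hash on every window.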