All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: “ACGAATTCCG”. When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
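Binary encoding + hash table
Each letter is encoded in 3 bits (the low three bits of its ASCII code), so ten letters need a 30-bit number. A hash table keyed by that number (up to 2^30 possible keys) records the sequences already seen.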
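The encoding described above can be sketched on its own. This is a minimal illustration, not part of the original solution: `encodeWindow` and `demoEncoding` are hypothetical helper names, and the sketch assumes the input contains only the ASCII letters A, C, G, and T.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: pack the 10 letters starting at `start` into one
// integer, 3 bits per letter, using the low 3 bits of each ASCII code.
// 'A' (65) -> 1, 'C' (67) -> 3, 'G' (71) -> 7, 'T' (84) -> 4.
std::uint32_t encodeWindow(const std::string& s, std::size_t start) {
    std::uint32_t code = 0;
    for (std::size_t i = 0; i < 10; ++i) {
        code = (code << 3) | static_cast<std::uint32_t>(s[start + i] & 7);
    }
    return code;  // 10 letters * 3 bits = 30 bits used
}

// Hypothetical check that the four letters map to distinct 3-bit codes.
void demoEncoding() {
    assert(('A' & 7) == 1);
    assert(('C' & 7) == 3);
    assert(('G' & 7) == 7);
    assert(('T' & 7) == 4);
}
```

Because the four codes are distinct, two 10-letter windows get the same 30-bit value only when they are the same sequence, so the integer can stand in for the substring as a key.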
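#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        vector<string> result;
        if (s.size() < 10) return result;   // no 10-letter window exists
        unordered_map<uint32_t, int> seen;  // 30-bit window code -> occurrence count
        uint32_t cur = 0;                   // unsigned, so the left shift cannot overflow a signed int
        // Encode the first 10 letters, 3 bits each (low 3 bits of the ASCII code).
        for (size_t i = 0; i < 10; i++) {
            cur = (cur << 3) | (s[i] & 7);
        }
        seen[cur] = 1;
        // Slide the window one letter at a time.
        for (size_t i = 10; i < s.size(); i++) {
            cur = ((cur << 3) | (s[i] & 7)) & 0x3fffffff;  // keep only the low 30 bits
            // Report a sequence exactly once, when its count reaches 2.
            if (++seen[cur] == 2) result.push_back(s.substr(i - 9, 10));
        }
        return result;
    }
};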
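To sanity-check the bit-encoded solution, it can help to compare it against a direct count of every 10-letter substring. The sketch below is a swapped-in brute-force technique, not the method used above; `repeatedBruteForce` and `checkExample` are hypothetical names. It returns the repeated sequences in sorted order, which may differ from the order produced by the sliding-window code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical reference implementation: count each 10-letter substring
// directly, then collect those that occur more than once.
std::vector<std::string> repeatedBruteForce(const std::string& s) {
    std::map<std::string, int> counts;
    for (std::size_t i = 0; i + 10 <= s.size(); ++i) {
        ++counts[s.substr(i, 10)];
    }
    std::vector<std::string> result;
    for (const auto& entry : counts) {
        if (entry.second > 1) result.push_back(entry.first);
    }
    return result;  // sorted, because std::map iterates in key order
}

// Hypothetical usage check on a small input.
void checkExample() {
    std::vector<std::string> r =
        repeatedBruteForce("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT");
    assert(r.size() == 2);
    assert(r[0] == "AAAAACCCCC");
    assert(r[1] == "CCCCCAAAAA");
}
```

The brute-force version stores every substring as a string key, so it uses more memory than the 30-bit integer keys, but it is simple enough to trust as a reference.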