All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example, given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", return ["AAAAACCCCC", "CCCCCAAAAA"].
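Solution: Slide a 10-letter window across the string and use a dictionary to count how many times each window occurs, then return the sequences whose count is at least 2.
Running Time: O(n), where n is the length of s, since every window has a fixed length of 10.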
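class Solution:
    # @param {string} s
    # @return {string[]}
    def findRepeatedDnaSequences(self, s):
        # Map each 10-letter window to the number of times it occurs.
        resultDict = {}
        # Slide the window across s; the range is empty when len(s) < 10.
        for i in range(len(s) - 9):
            window = s[i:i + 10]
            resultDict[window] = resultDict.get(window, 0) + 1
        # Keep only the windows seen at least twice.
        # items() works in both Python 2 and 3; iteritems() is Python 2 only.
        return [k for k, v in resultDict.items() if v >= 2]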
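
As a quick check of the approach above, the same counting idea can be sketched as a standalone function built on collections.Counter. The name find_repeated and the window parameter are assumptions made for illustration and are not part of the original solution. The result is sorted only so that the output order is deterministic.

```python
from collections import Counter

def find_repeated(s, window=10):
    # Hypothetical helper: count every substring of length `window`.
    # The range is empty when s is shorter than `window`.
    counts = Counter(s[i:i + window] for i in range(len(s) - window + 1))
    # Sort so the output order does not depend on dictionary ordering.
    return sorted(seq for seq, n in counts.items() if n > 1)

print(find_repeated("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"))
# prints ['AAAAACCCCC', 'CCCCCAAAAA']
```

A string shorter than the window yields an empty list, which matches the behavior of the class-based solution.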