Problem: Repeated DNA Sequences
Difficulty: MEDIUM
Problem description:
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", Return: ["AAAAACCCCC", "CCCCCAAAAA"].
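Approach: Slide a 10-character window over s. Use one set (Set1) to store each substring the first time it appears, and a second set (Set2) to store every substring that is already in Set1 when it is encountered again. Because Set2 is a set, each repeated sequence is reported only once, no matter how many times it occurs.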
The code is as follows:
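import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        // first: substrings seen at least once; second: substrings seen more than once.
        Set<String> first = new HashSet<>();
        Set<String> second = new HashSet<>();
        // The loop body never runs when s has fewer than 10 characters.
        for (int i = 0; i < s.length() - 9; i++) {
            String str = s.substring(i, i + 10);
            if (first.contains(str)) {
                second.add(str);
            } else {
                first.add(str);
            }
        }
        // Copy the repeated substrings into the result list.
        return new ArrayList<>(second);
    }
}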
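
As a quick sanity check of the two-set idea, here is a minimal sketch of a slightly more compact variant. It relies on the fact that Set.add returns false when the element is already present, so the contains/add pair collapses into a single call. The class name RepeatedDnaDemo, the method name findRepeated, and the use of LinkedHashSet (chosen only to make the output order predictable) are assumptions made for this illustration and are not part of the original solution.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of the original solution.
class RepeatedDnaDemo {
    static List<String> findRepeated(String s) {
        Set<String> seen = new HashSet<>();
        // LinkedHashSet keeps the order in which repeats are first detected.
        Set<String> repeated = new LinkedHashSet<>();
        for (int i = 0; i + 10 <= s.length(); i++) {
            String window = s.substring(i, i + 10);
            // add() returns false if the window was already in the set.
            if (!seen.add(window)) {
                repeated.add(window);
            }
        }
        return new ArrayList<>(repeated);
    }

    public static void main(String[] args) {
        // Input taken from the example in the problem description.
        System.out.println(findRepeated("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"));
        // prints [AAAAACCCCC, CCCCCAAAAA]
    }
}
```

In the example input, "AAAAACCCCC" occurs at indices 0 and 10 and "CCCCCAAAAA" at indices 5 and 16, which is why exactly those two sequences are returned.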