All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
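My English is poor, so it took me quite a while to understand what the problem is asking.
It asks for every 10-letter substring that appears more than once in the DNA molecule.
Below is StefanPochmann's solution from LeetCode. It uses two hash sets and is simple and easy to follow: seen records every 10-letter substring encountered so far, and repeated collects the ones that were already in seen when they came up again.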
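// Needs java.util.ArrayList, HashSet, List and Set.
public List<String> findRepeatedDnaSequences(String s) {
    // seen: every 10-letter window so far; repeated: windows found more than once
    Set<String> seen = new HashSet<>(), repeated = new HashSet<>();
    for (int i = 0; i + 9 < s.length(); i++) {
        String ten = s.substring(i, i + 10);
        // Set.add returns false when the element is already in the set
        if (!seen.add(ten)) {
            repeated.add(ten);
        }
    }
    return new ArrayList<>(repeated);
}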
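To try the idea outside LeetCode, here is a minimal self-contained sketch of the same two-set approach. The class name RepeatedDnaDemo, the static modifier, and the main method are my own additions for illustration and are not part of the original solution. The result is sorted only to make the printed order predictable, since a HashSet does not guarantee any order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical wrapper class, added only so the method can run on its own.
class RepeatedDnaDemo {

    static List<String> findRepeatedDnaSequences(String s) {
        Set<String> seen = new HashSet<>();
        Set<String> repeated = new HashSet<>();
        // Slide a 10-letter window over the string; the loop does not run
        // at all when the string is shorter than 10 characters.
        for (int i = 0; i + 9 < s.length(); i++) {
            String ten = s.substring(i, i + 10);
            // add() returns false if the window was seen before
            if (!seen.add(ten)) {
                repeated.add(ten);
            }
        }
        return new ArrayList<>(repeated);
    }

    public static void main(String[] args) {
        List<String> result =
                findRepeatedDnaSequences("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT");
        Collections.sort(result); // sort for a stable print order
        System.out.println(result); // prints [AAAAACCCCC, CCCCCAAAAA]
    }
}
```

Because repeated is a set, a sequence that occurs three or more times is still reported only once. The loop extracts one 10-character substring per position, so the work grows linearly with the length of the input.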