# LeetCode Repeated DNA Sequences


Description:

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].
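
Before looking at the bit-packed solution, it can help to see the requirement in its most direct form. The following is a minimal baseline sketch, not the approach used in the solution of this post: it stores every 10-letter window as a `String` in a `HashSet`. The class name `NaiveRepeatedDna` and the method `findRepeated` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class NaiveRepeatedDna {
    // Hypothetical helper: returns every 10-letter substring that occurs more than once.
    static List<String> findRepeated(String s) {
        Set<String> seen = new HashSet<>();      // windows seen at least once
        Set<String> repeated = new HashSet<>();  // windows seen at least twice
        for (int i = 0; i + 10 <= s.length(); i++) {
            String window = s.substring(i, i + 10);
            // add() returns false when the window was already in the set
            if (!seen.add(window)) {
                repeated.add(window);
            }
        }
        return new ArrayList<>(repeated);
    }

    public static void main(String[] args) {
        // Prints the two repeated sequences from the example; order is not guaranteed.
        System.out.println(findRepeated("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"));
    }
}
```

This version is easy to verify by hand, but it keeps one 10-character string per window in memory, which is the cost the integer encoding is meant to reduce.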

Solution:

import java.util.*;

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        List<String> list = new ArrayList<String>();
        // Key: a 10-letter window packed into 20 bits; value: number of occurrences.
        TreeMap<Integer, Integer> map = new TreeMap<Integer, Integer>();

        if (s.length() < 10)
            return list;

        int temp = 0, num;
        // Pack the first 9 letters, 2 bits per letter.
        for (int i = 0; i < 9; i++) {
            temp = temp << 2 | convert(s.charAt(i));
        }
        // Slide the window: append the next letter and keep only the low 20 bits.
        for (int i = 9; i < s.length(); i++) {
            temp = (temp << 2 | convert(s.charAt(i))) & 0xFFFFF;
            if (map.containsKey(temp)) {
                num = map.get(temp);
                map.put(temp, num + 1);
            } else
                map.put(temp, 1);
        }

        String neo;
        Iterator<Integer> ite = map.keySet().iterator();
        while (ite.hasNext()) {
            temp = ite.next();
            num = map.get(temp);
            if (num == 1)
                continue;
            // Decode the 20-bit key back into its 10 letters, last letter first.
            neo = "";
            for (int i = 0; i < 10; i++) {
                neo = (char) convert(temp % 4) + neo;
                temp >>= 2;
            }
            // Record the decoded sequence as a result.
            list.add(neo);
        }

        return list;
    }

    // Two-way mapping: a letter ('A', 'C', 'G', 'T') to its 2-bit code (0..3),
    // and a 2-bit code back to its letter.
    int convert(int ch) {
        switch (ch) {
            case 'A':
                return 0;
            case 'C':
                return 1;
            case 'G':
                return 2;
            case 'T':
                return 3;
            case 0:
                return 'A';
            case 1:
                return 'C';
            case 2:
                return 'G';
            case 3:
                return 'T';
        }
        return 0;
    }
}
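
The solution relies on each nucleotide fitting into 2 bits, so a 10-letter window fits into 20 bits, which is why the mask is `0xFFFFF`. The sketch below isolates that encoding step so the round trip can be checked on its own. It assumes the same letter-to-code mapping as `convert` above; the class `DnaBitPacking` and its methods `code`, `encode`, and `decode` are hypothetical names introduced for illustration, and the input is assumed to contain only A, C, G, and T.

```java
class DnaBitPacking {
    // Map a nucleotide to its 2-bit code (A=0, C=1, G=2, T=3).
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            default:  return 3; // assumption: any other character is 'T'
        }
    }

    // Pack a 10-letter sequence into the low 20 bits of an int.
    static int encode(String tenMer) {
        int key = 0;
        for (int i = 0; i < tenMer.length(); i++) {
            key = (key << 2 | code(tenMer.charAt(i))) & 0xFFFFF;
        }
        return key;
    }

    // Unpack a 20-bit key into its 10 letters, filling from the last letter backwards.
    static String decode(int key) {
        char[] out = new char[10];
        for (int i = 9; i >= 0; i--) {
            out[i] = "ACGT".charAt(key & 3);
            key >>= 2;
        }
        return new String(out);
    }

    public static void main(String[] args) {
        int key = encode("AAAAACCCCC");
        System.out.println(key);          // prints 341 (binary 0101010101)
        System.out.println(decode(key));  // prints AAAAACCCCC
    }
}
```

Because the key is a plain `int`, the map in the solution compares and stores integers instead of 10-character strings.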
