LeetCode - Repeated DNA Sequences

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].

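[Analysis]
The basic idea is easy to come up with: walk over every length-10 substring of the input, use a HashMap to count how often each one occurs, and add those that occur twice or more to the result.
Adding a substring to the result only when it is seen for the second time keeps duplicate strings out of the result. A direct implementation, however, gets Memory Limit Exceeded: the program simply uses too much memory.
The key is to convert the substrings being checked into ints to save memory. How can a substring be encoded efficiently? There are only four characters, A, C, G, and T, so two bits are enough to tell them apart: 00, 01, 10, and 11. A 10-character substring therefore fits in 20 bits.
The masking trick in the reference solution is worth learning. It uses the 18-bit mask 0x3ffff, called eraser. To advance by one character, AND the old code hint with eraser (which clears the two bits of the oldest character), shift the result left by two bits, and add the code of the new character.
This yields the code of the new substring, which is quite neat.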
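The encoding and the eraser step can be checked in isolation. The following is a minimal sketch, not part of the referenced solution: the class name `RollingDnaCode` and the helpers `code`, `encode`, and `slide` are hypothetical names introduced here for illustration. It assumes the input contains only A, C, G, and T.

```java
public class RollingDnaCode {
    // Mask with the low 18 bits set: ANDing with it drops the oldest character's 2 bits.
    static final int ERASER = 0x3ffff;

    // 2-bit code per nucleotide: A=00, C=01, G=10, T=11.
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }

    // Encode the 10 characters starting at offset into a 20-bit integer.
    static int encode(String s, int offset) {
        int h = 0;
        for (int i = 0; i < 10; i++) {
            h = (h << 2) | code(s.charAt(offset + i));
        }
        return h;
    }

    // Slide the window one character to the right: erase the oldest, append the newest.
    static int slide(int h, char next) {
        return ((h & ERASER) << 2) | code(next);
    }

    public static void main(String[] args) {
        String s = "AAAAACCCCCA";
        int rolled = slide(encode(s, 0), s.charAt(10));
        // The rolled code matches a fresh encoding of the window starting at index 1.
        System.out.println(rolled == encode(s, 1)); // prints "true"
    }
}
```

Without the mask, the shifted value would keep growing past 20 bits, and equal substrings at different positions would no longer map to the same int.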

[ref]
[url]http://blog.csdn.net/coderhuhy/article/details/43647731[/url]


import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

public class Solution {
    // Method 2: store int codes in the HashMap instead of Strings to avoid MLE.
    // Mask keeping the low 18 bits, i.e. dropping the oldest character's 2 bits.
    public static final int eraser = 0x3ffff;
    // Maps each nucleotide to its 2-bit code.
    public static HashMap<Character, Integer> ati = new HashMap<Character, Integer>();
    static {
        ati.put('A', 0);
        ati.put('C', 1);
        ati.put('G', 2);
        ati.put('T', 3);
    }

    public List<String> findRepeatedDnaSequences(String s) {
        List<String> result = new ArrayList<String>();
        // A string of length 10 or less cannot contain a repeated 10-letter substring.
        if (s == null || s.length() <= 10)
            return result;
        int N = s.length();
        // Encode the first window s[0..9] into 20 bits.
        int hint = 0;
        for (int i = 0; i < 10; i++) {
            hint = (hint << 2) + ati.get(s.charAt(i));
        }
        // code -> number of occurrences seen so far (capped at 2)
        HashMap<Integer, Integer> checker = new HashMap<Integer, Integer>();
        checker.put(hint, 1);
        for (int i = 10; i < N; i++) {
            // Erase the oldest character, shift, and append the new one.
            hint = ((hint & eraser) << 2) + ati.get(s.charAt(i));
            Integer value = checker.get(hint);
            if (value == null) {
                checker.put(hint, 1);
            } else if (value == 1) {
                // Second occurrence: report the substring exactly once.
                checker.put(hint, value + 1);
                result.add(s.substring(i - 9, i + 1));
            }
        }
        return result;
    }

    // Method 1: String keys in the HashMap; gets Memory Limit Exceeded.
    public List<String> findRepeatedDnaSequences1(String s) {
        List<String> result = new ArrayList<String>();
        if (s == null)
            return result;
        HashMap<String, Integer> map = new HashMap<String, Integer>();
        int last = s.length() - 10;
        // Count every length-10 substring.
        for (int i = 0; i <= last; i++) {
            String key = s.substring(i, i + 10);
            if (map.containsKey(key)) {
                map.put(key, map.get(key) + 1);
            } else {
                map.put(key, 1);
            }
        }
        // Collect the substrings that occurred more than once.
        for (String key : map.keySet()) {
            if (map.get(key) > 1)
                result.add(key);
        }
        return result;
    }
}
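Because every code fits in 20 bits, the HashMap can also be swapped for a flat table indexed by the code. This is a different technique from the referenced solution, sketched here only as an alternative under the same assumption that the input contains just A, C, G, and T. The class name `DnaCountTable` and the method `findRepeated` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class DnaCountTable {
    // 2-bit code per nucleotide: A=00, C=01, G=10, T=11.
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }

    static List<String> findRepeated(String s) {
        List<String> result = new ArrayList<>();
        if (s == null || s.length() <= 10) {
            return result;
        }
        // One byte per possible 20-bit code (1 MiB): 0 = unseen, 1 = seen once, 2 = reported.
        byte[] seen = new byte[1 << 20];
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            // Same rolling update as above; the mask is harmless while the window fills.
            h = ((h & 0x3ffff) << 2) | code(s.charAt(i));
            if (i < 9) {
                continue; // fewer than 10 characters read so far
            }
            if (seen[h] == 0) {
                seen[h] = 1;
            } else if (seen[h] == 1) {
                seen[h] = 2;
                result.add(s.substring(i - 9, i + 1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Example input from the problem statement.
        System.out.println(findRepeated("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"));
        // prints "[AAAAACCCCC, CCCCCAAAAA]"
    }
}
```

The table costs a fixed 1 MiB regardless of input length, whereas the HashMap grows with the number of distinct substrings; which is cheaper depends on the input size.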