LeetCode 187 Repeated DNA Sequences

_我们的存在

Category: leetcode. Tags: leetcode, DNA, repeated

Article link: https://blog.csdn.net/Yano_nankai/article/details/50178827

Problem Description

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: “ACGAATTCCG”. When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].

Analysis

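This problem tests bitmap-style bit manipulation. Each of A, C, G, and T is represented by two bits, matching the values 0 to 3 assigned in the code below:

A = 00, C = 01, G = 10, T = 11

Ten consecutive characters therefore need only 20 bits, which fit in a single 32-bit int. Define a variable hash whose low 20 bits encode the current 10-character sequence, with all higher bits set to 0.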
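
To make the 20-bit encoding concrete, here is a minimal sketch of how a 10-letter window could be packed into an int and then updated one character at a time. The class name DnaEncoding and the helpers code, encode, and roll are hypothetical names chosen for illustration; only the A=0, C=1, G=2, T=3 mapping comes from the article's code.

```java
// Illustrative sketch only; class and method names are hypothetical.
final class DnaEncoding {
    private static final int WINDOW = 10;
    // (1 << 20) - 1: a mask with the low 20 bits set.
    private static final int MASK = (1 << (2 * WINDOW)) - 1;

    // 2-bit code per nucleotide, same mapping as the article's code.
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }

    // Packs a window of up to 10 letters into the low 20 bits of an int.
    static int encode(String window) {
        int hash = 0;
        for (int i = 0; i < window.length(); i++) {
            hash = (hash << 2) | code(window.charAt(i));
        }
        return hash & MASK;
    }

    // Slides the window by one letter: shift in the new code, drop the oldest.
    static int roll(int hash, char next) {
        return ((hash << 2) | code(next)) & MASK;
    }

    public static void main(String[] args) {
        int h = encode("AAAAACCCCC");
        System.out.println(h);                         // 341
        System.out.println(Integer.toBinaryString(h)); // 101010101
        // Rolling in one more 'C' matches encoding the shifted window directly.
        System.out.println(roll(h, 'C') == encode("AAAACCCCCC")); // true
    }
}
```

The mask is what keeps the value at 20 bits: after the shift, the two bits of the character that just left the window sit in bits 20 and 21 and are cleared by the AND.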

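Define a set that stores every hash seen so far. Each time a new hash is computed, check whether it has already appeared; if it has, add the corresponding substring to a result set, so that a sequence occurring three or more times is still reported only once.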

Code

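    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    // Wrapper class added so the method compiles on its own; the name is arbitrary.
    public class Solution {

        public static List<String> findRepeatedDnaSequences(String s) {

            // Two occurrences of a 10-letter window need at least 11 characters.
            if (s == null || s.length() < 11) {
                return new ArrayList<String>();
            }

            // Rolling 20-bit encoding of the last 10 characters.
            int hash = 0;

            // Encodings seen so far, and the windows seen more than once.
            Set<Integer> appear = new HashSet<Integer>();
            Set<String> set = new HashSet<String>();

            // 2-bit code for each nucleotide.
            Map<Character, Integer> map = new HashMap<Character, Integer>();
            map.put('A', 0);
            map.put('C', 1);
            map.put('G', 2);
            map.put('T', 3);

            for (int i = 0; i < s.length(); i++) {

                char c = s.charAt(i);

                // Shift in the new character, then keep only the low 20 bits.
                hash = (hash << 2) + map.get(c);
                hash &= (1 << 20) - 1;

                // The first full 10-letter window ends at index 9.
                if (i >= 9) {
                    if (appear.contains(hash)) {
                        set.add(s.substring(i - 9, i + 1));
                    } else {
                        appear.add(hash);
                    }
                }
            }

            return new ArrayList<String>(set);
        }
    }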
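
As a quick check, the same idea can be run on the example from the problem statement. The sketch below is a compact variant, not the code above: the class name RepeatedDnaDemo and method findRepeated are hypothetical, it looks up the 2-bit code with "ACGT".indexOf instead of a HashMap, and it uses a LinkedHashSet so the output order is deterministic. It assumes the input is non-null and contains only A, C, G, and T.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative variant; names are hypothetical and input is assumed valid.
final class RepeatedDnaDemo {

    static List<String> findRepeated(String s) {
        Set<Integer> seen = new HashSet<>();
        // LinkedHashSet keeps the order in which repeats are first detected.
        Set<String> repeated = new LinkedHashSet<>();
        int hash = 0;
        for (int i = 0; i < s.length(); i++) {
            // indexOf gives 0..3 for A, C, G, T; 0xFFFFF keeps the low 20 bits.
            hash = ((hash << 2) | "ACGT".indexOf(s.charAt(i))) & 0xFFFFF;
            // Set.add returns false when the encoding was already present.
            if (i >= 9 && !seen.add(hash)) {
                repeated.add(s.substring(i - 9, i + 1));
            }
        }
        return new ArrayList<>(repeated);
    }

    public static void main(String[] args) {
        String s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT";
        System.out.println(findRepeated(s)); // [AAAAACCCCC, CCCCCAAAAA]
    }
}
```

Strings shorter than 11 characters fall through the loop without ever finding a repeat, so the variant returns an empty list for them without a separate length check.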
