LeetCode 187. Repeated DNA Sequences | Bit Packing

Z-Pilgrim

Link to this post: https://blog.csdn.net/u011026968/article/details/79356183

https://leetcode.com/problems/repeated-dna-sequences/description/

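This problem is not very interesting. The straightforward solution counts every 10-letter substring in a hash map: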

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        // Count every 10-letter window, keyed by the substring itself.
        unordered_map<string, int> rec;
        for (size_t i = 0; i + 10 <= s.size(); i++) {
            rec[s.substr(i, 10)]++;
        }
        // Collect the windows that were seen more than once.
        vector<string> ans;
        for (const auto& entry : rec) {
            if (entry.second > 1) {
                ans.push_back(entry.first);
            }
        }
        return ans;
    }
};

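The solution from Discuss relies on the ASCII codes of A, T, C, and G differing in their low bits: three bits are enough to identify each letter, so a single int (30 bits used) can hold a whole 10-letter substring.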

https://leetcode.com/problems/repeated-dna-sequences/discuss/53877/I-did-it-in-10-lines-of-C++

The main idea is to store each substring as an int in the map to get around the memory limit.

There are only four possible characters, A, C, G, and T, but I want to use 3 bits per letter instead of 2.

Why? It’s easier to code.

A is 0x41, C is 0x43, G is 0x47, T is 0x54. Still don’t see it? Let me write it in octal.

A is 0101, C is 0103, G is 0107, T is 0124. The last octal digit is different for each of the four letters. That’s all we need!

We can simply use s[i] & 7 to get the last octal digit, which is just the last 3 bits. That is much easier than a lookup table, a switch, or a bunch of if and else, right?
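To make the 3-bit codes concrete, here is a minimal sketch. The helper names `code3` and `pack10` are hypothetical (they do not appear in the quoted post), and the sketch assumes an ASCII-compatible character set and input containing only A, C, G, and T.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: the low three bits of a base letter.
// In ASCII this gives A -> 1, C -> 3, G -> 7, T -> 4.
constexpr std::uint32_t code3(char c) {
    return static_cast<std::uint32_t>(c) & 7u;
}

// Hypothetical helper: packs up to the first 10 letters of `window`,
// 3 bits per letter, so a full window occupies exactly 30 bits.
std::uint32_t pack10(const std::string& window) {
    std::uint32_t packed = 0;
    for (std::size_t i = 0; i < window.size() && i < 10; ++i)
        packed = (packed << 3) | code3(window[i]);
    return packed;
}
```

Since each letter becomes one octal digit, `pack10("AAAAACCCCC")` is the octal number 1111133333, and the largest possible value, ten G's, is 0x3FFFFFFF, the same constant used as the mask in the solution.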

We don’t really need to generate the substring from the int. While counting the number of occurrences, we can push the substring into result as soon as the count becomes 2, so there won’t be any duplicates in the result.
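The "push when the count becomes 2" step can be looked at in isolation. The sketch below uses plain integer keys and a hypothetical helper `repeatedOnce`, which is an illustration and not part of the quoted post.

```cpp
#include <unordered_map>
#include <vector>

// Hypothetical helper: returns each repeated key exactly once, in the order
// in which its second occurrence is seen.
std::vector<int> repeatedOnce(const std::vector<int>& keys) {
    std::unordered_map<int, int> seen;
    std::vector<int> out;
    for (int k : keys)
        if (seen[k]++ == 1)   // the old count was 1, so this is the second occurrence
            out.push_back(k);
    return out;
}
```

For the input {5, 3, 5, 5, 3, 9} this yields {5, 3}: the third 5 is ignored because its old count is already 2. The quoted solution applies the same check to the packed window value: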

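vector<string> findRepeatedDnaSequences(string s) {
    unordered_map<unsigned, int> m;   // packed 30-bit window -> occurrences so far
    vector<string> r;
    int i = 0, ss = s.size();
    if (ss < 10) return r;            // no 10-letter window; also avoids reading past the end
    unsigned t = 0;                   // unsigned, so the left shift cannot overflow a signed int
    while (i < 9)                     // preload the first 9 letters, 3 bits each
        t = t << 3 | (s[i++] & 7);
    while (i < ss)                    // append one letter and keep only the low 30 bits
        if (m[t = (t << 3 & 0x3FFFFFFF) | (s[i++] & 7)]++ == 1)   // second occurrence
            r.push_back(s.substr(i - 10, 10));
    return r;
}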
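The mask 0x3FFFFFFF is what turns the shift into a sliding window: after shifting left by 3 bits, the oldest letter sits above bit 29 and the mask drops it. A minimal sketch of that single step, with hypothetical names `kWindowMask` and `slide` and an unsigned type assumed so the shift is well defined:

```cpp
#include <cstdint>

// Hypothetical constant: the low 30 bits, i.e. ten 3-bit letters.
constexpr std::uint32_t kWindowMask = 0x3FFFFFFF;

// Hypothetical helper: drops the oldest letter of a packed 10-letter window
// and appends `next` as the newest one.
constexpr std::uint32_t slide(std::uint32_t window, char next) {
    return ((window << 3) & kWindowMask) | (static_cast<std::uint32_t>(next) & 7u);
}
```

Written in octal, sliding the window for "AAAAACCCCC" (1111133333) over a following G gives 1111333337, which is the packed form of "AAAACCCCCG".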