Repeated DNA Sequences (Java)

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].

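I could not solve this on my own. The approaches I found online almost all do the same thing: convert each 10-letter substring into a number, then store that number in a hash table to check whether it has been seen before.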
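To make the "string to number" idea concrete, here is a minimal sketch of one possible encoding, assuming 2 bits per nucleotide (A=0, C=1, G=2, T=3) so that a 10-letter window fits in the low 20 bits of an `int`. The class name `DnaEncodingSketch` and the helpers `code` and `encode` are hypothetical names introduced only for illustration; they do not appear in either solution in this post.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, introduced only to illustrate the encoding idea.
final class DnaEncodingSketch {
    // Maps a nucleotide to a 2-bit code: A=0, C=1, G=2, T=3.
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }

    // Packs the 10 letters starting at index `start` into the low 20 bits of an int.
    static int encode(String s, int start) {
        int key = 0;
        for (int j = start; j < start + 10; j++) {
            key = (key << 2) | code(s.charAt(j));
        }
        return key;
    }

    public static void main(String[] args) {
        String s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT";
        // Count how often each encoded window occurs.
        Map<Integer, Integer> counts = new HashMap<>();
        for (int i = 0; i + 10 <= s.length(); i++) {
            counts.merge(encode(s, i), 1, Integer::sum);
        }
        System.out.println(counts.get(encode("AAAAACCCCC", 0))); // prints 2
    }
}
```

Equal windows always produce equal keys and different windows produce different keys, so comparing the small integer keys is equivalent to comparing the substrings themselves.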

Source1 (got "MTL", a limit-exceeded verdict)

    // Requires: java.util.ArrayList, java.util.HashMap, java.util.List
    public List<String> findRepeatedDnaSequences(String s) {
        List<String> res = new ArrayList<String>();
        if (s.length() <= 10) return res; // a repeat needs at least 11 characters

        int[] a = new int['T' + 1]; // sized so that index 'T', the largest ASCII code used, is valid
        a['A'] = 0;
        a['C'] = 1;
        a['G'] = 2;
        a['T'] = 3;

        // Map keys are boxed, so the type parameter must be Long, not the primitive long.
        HashMap<Long, Integer> hm = new HashMap<Long, Integer>();

        for (int i = 0; i < s.length() - 9; i++) {
            // Encode the 10-letter window as a base-10 number with digits 0..3.
            long sum = 0;
            for (int j = i; j < i + 10; j++) {
                sum = sum * 10 + a[s.charAt(j)];
            }

            Integer count = hm.get(sum);
            if (count == null) {
                hm.put(sum, 1);
            } else {
                // Report the window on its second occurrence only, and keep counting
                // so that later occurrences are not reported again.
                if (count == 1) {
                    res.add(s.substring(i, i + 10));
                }
                hm.put(sum, count + 1);
            }
        }
        return res;
    }
    


    

Test

    public static void main(String[] args){
    	String s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"; 
    
    	System.out.println(new Solution().findRepeatedDnaSequences(s));
    }


Source2

    public List<String> findRepeatedDnaSequences(String s) {
    	HashSet<Integer> a = new HashSet<>();
    	HashSet<Integer> b = new HashSet<>();
    	List<String> res = new ArrayList<>();
    	char[] map = new char[26];
    	map['C' - 'A'] = 1;
    	map['G' - 'A'] = 2;
    	map['T' - 'A'] = 3;
    	
    	for(int i = 0; i < s.length() - 9; i++){
    		int sum = 0;
    		for(int j = i; j < i + 10; j++){
    			sum <<= 2; // each code (0..3) needs two bits, so shift by two before adding the next letter
    			sum |= map[s.charAt(j) - 'A'];
    		}
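    		// A HashSet holds no duplicates: add returns false if the element is already present.
    		// So !a.add(sum) is true from the second occurrence onward, and b.add(sum) is true
    		// only the first time that happens. Each repeated sequence is added to res exactly once.
    		if(!a.add(sum) && b.add(sum)){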
    			res.add(s.substring(i, i + 10));
    		}
    	}
    	return res;
    }
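Source2 rebuilds the 20-bit code of every window from scratch. A possible refinement, which is my own sketch and not part of the solutions referenced above, is to update the code incrementally: shift in the next letter and mask off everything above the low 20 bits, so the oldest letter drops out. The class name `RollingDnaSketch`, the helper `code`, and the set names `seen` and `reported` are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class name; a sliding-window variant of the Source2 idea.
final class RollingDnaSketch {
    // Maps a nucleotide to a 2-bit code: A=0, C=1, G=2, T=3.
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }

    static List<String> findRepeatedDnaSequences(String s) {
        List<String> res = new ArrayList<>();
        Set<Integer> seen = new HashSet<>();      // codes seen at least once
        Set<Integer> reported = new HashSet<>();  // codes already added to res
        final int mask = (1 << 20) - 1;           // keeps the low 20 bits, i.e. 10 letters
        int key = 0;
        for (int i = 0; i < s.length(); i++) {
            // Shift in the new letter; the mask drops the letter that left the window.
            key = ((key << 2) | code(s.charAt(i))) & mask;
            if (i < 9) continue; // the first full window ends at index 9
            if (!seen.add(key) && reported.add(key)) {
                res.add(s.substring(i - 9, i + 1));
            }
        }
        return res;
    }

    public static void main(String[] args) {
        String s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT";
        System.out.println(findRepeatedDnaSequences(s)); // prints [AAAAACCCCC, CCCCCAAAAA]
    }
}
```

Each letter is now processed once instead of ten times. The two-set check is the same as in Source2.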


