Rosalind Problem 25: ros_bio25_LONG

If this is your first time reading this series, please see the preface ("写在前面") first.

import re

with open("../examples/ros_bio25_LONG.txt") as f:
    file = f.readlines()

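# Parse the FASTA file: map each header line to its concatenated sequence
table = {}
for line in file:
    line = line.rstrip('\n')
    m = re.match(r'^>.*', line)
    if m:
        name = m.group()
        table[name] = ''
    else:
        table[name] += line
# Keep the sequences in file order (dicts preserve insertion order in Python 3.7+)
seq = list(table.values())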

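# Find the longest substring shared by each pair of adjacent reads
# (only seq[n] and seq[n+1] are compared, so the reads must already be in assembly order)
sequence = []
for n in range(len(seq) - 1):
    front_seq = seq[n]
    rear_seq = seq[n + 1]
    overlap = ['']  # fall back to an empty overlap if nothing matches
    for i in range(len(front_seq)):
        # Extend the substring starting at i until it no longer occurs in rear_seq
        for j in range(i + 1, len(front_seq) + 1):
            if rear_seq.find(front_seq[i:j]) == -1:
                break
            overlap.append(front_seq[i:j])
    sequence.append(max(overlap, key=len))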

# Record the leading and trailing parts that lie outside the overlaps
front = seq[0].replace(sequence[0], '')
rear = seq[-1].replace(sequence[-1], '')

# Drop overlap strings that are contained in a neighboring overlap
i = 0
while i < len(sequence):
    temp = sequence[-1]
    if sequence[i] == temp:
        break
    if sequence[i] in sequence[i+1]:
        sequence.pop(i)
    elif sequence[i+1] in sequence[i]:
        sequence.pop(i+1)
    else:
        i += 1

# Join the pieces into the final contig
contigs = front + ''.join(sequence) + rear
print(contigs)
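
The script above only compares each read with the next one in the file, so it depends on the reads already being listed in assembly order. As an alternative that does not rely on file order, here is a minimal sketch of a greedy merge based on suffix-prefix overlaps. It is not the method used above, and the function names `suffix_prefix_overlap` and `greedy_assemble` are my own, chosen for illustration. It assumes the reads are distinct, that none is fully contained in another, and that true neighbors share a long overlap (as I understand the problem statement, more than half a read's length).

```python
def suffix_prefix_overlap(left, right):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def greedy_assemble(reads):
    """Repeatedly merge the ordered pair of reads with the largest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_len, best_i, best_j = -1, 0, 1
        # Try every ordered pair (left, right) and remember the best overlap
        for i, left in enumerate(reads):
            for j, right in enumerate(reads):
                if i == j:
                    continue
                k = suffix_prefix_overlap(left, right)
                if k > best_len:
                    best_len, best_i, best_j = k, i, j
        # Glue the two reads together, keeping the shared part only once
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads[0] if reads else ''


# The four reads from the Rosalind sample dataset
sample = ["ATTAGACCTG", "CCTGCCGGAA", "AGACCTGCCG", "GCCGGAATAC"]
print(greedy_assemble(sample))  # ATTAGACCTGCCGGAATAC
```

Every round rescans all pairs, so this is slow for large inputs, but it should be fast enough for the small read sets in this exercise. To run it on the real data, pass the `seq` list built by the FASTA-parsing step to `greedy_assemble`.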
