Rosalind Problem 14: Finding a Shared Motif

Problem

A common substring of a collection of strings is a substring of every member of the collection. We say that a common substring is a longest common substring if there does not exist a longer common substring. For example, "CG" is a common substring of "ACGTACGT" and "AACCGTATA", but it is not as long as possible; in this case, "CGTA" is a longest common substring of "ACGTACGT" and "AACCGTATA".

Note that the longest common substring is not necessarily unique; for a simple example, "AA" and "CC" are both longest common substrings of "AACC" and "CCAA".
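
The two examples above can be checked by brute force. The following is only a minimal sketch: the helper name `longest_common_substrings` is hypothetical (it is not part of the Rosalind problem or of the solution in this post), and it collects every longest common substring rather than just one.

```python
def longest_common_substrings(strings):
    """Return the set of all longest common substrings (brute force)."""
    # Any common substring must occur in the shortest string.
    shortest = min(strings, key=len)
    # Try lengths from longest to shortest; stop at the first length with a hit.
    for length in range(len(shortest), 0, -1):
        found = {
            shortest[i:i + length]
            for i in range(len(shortest) - length + 1)
            if all(shortest[i:i + length] in s for s in strings)
        }
        if found:
            return found
    return set()


print(sorted(longest_common_substrings(["ACGTACGT", "AACCGTATA"])))  # ['CGTA']
print(sorted(longest_common_substrings(["AACC", "CCAA"])))           # ['AA', 'CC']
```

The second call returns two strings, which illustrates why the answer is not necessarily unique.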

Given: A collection of k (k ≤ 100) DNA strings of length at most 1 kbp each in FASTA format.

Return: A longest common substring of the collection. (If multiple solutions exist, you may return any single solution.)
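
Since the input arrives in FASTA format, a sequence may be wrapped over several lines under its `>` header. As a hedged sketch of how such input could be read, the function below joins the lines of each record; `parse_fasta` and the wrapped `example` text are illustrative assumptions, not part of the original post.

```python
def parse_fasta(text):
    """Return a dict mapping each record's header to its full sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record begins; the header is everything after '>'.
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence lines belong to the most recent header.
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical input in which the first sequence is wrapped over two lines.
example = """>Rosalind_1
GATT
ACA
>Rosalind_2
TAGACCA
"""
print(parse_fasta(example))
# {'Rosalind_1': 'GATTACA', 'Rosalind_2': 'TAGACCA'}
```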

Sample Dataset

>Rosalind_1
GATTACA
>Rosalind_2
TAGACCA
>Rosalind_3
ATACA

Sample Output

AC

Python solution

fasta = """>Rosalind_1
GATTACA
>Rosalind_2
TAGACCA
>Rosalind_3
ATACA"""

# Parse the FASTA text into a list of sequences: for each record,
# drop the header line and join the remaining lines.
s = ["".join(record.split("\n")[1:]) for record in fasta.split(">")[1:]]

# Any common substring must occur in the shortest string, so search there.
shortest = min(s, key=len)

motif = ''

# Try every start position in the shortest string.
for i in range(len(shortest)):
    # Grow the candidate one letter at a time while every string contains it.
    for n in range(1, len(shortest) - i + 1):
        candidate = shortest[i:i+n]
        if not all(candidate in each for each in s):
            break
        # On a tie in length the newer candidate wins; any longest
        # common substring is an accepted answer.
        motif = max(candidate, motif, key=len)
print(motif)
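
The solution above grows each candidate one letter at a time. A different technique, which is not the author's method, is to binary search on the motif length: if a common substring of some length exists, a common substring of every shorter length exists too. The sketch below assumes the sequences are already parsed into a list; the names `common_substring_of_length` and `shared_motif` are hypothetical.

```python
def common_substring_of_length(strings, length):
    """Return one common substring of exactly `length`, or None."""
    shortest = min(strings, key=len)
    for i in range(len(shortest) - length + 1):
        candidate = shortest[i:i + length]
        if all(candidate in s for s in strings):
            return candidate
    return None


def shared_motif(strings):
    """Return one longest common substring via binary search on the length."""
    lo, hi = 0, min(len(s) for s in strings)
    best = ""
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        candidate = common_substring_of_length(strings, mid)
        if candidate is not None:
            best, lo = candidate, mid  # length mid works; try longer
        else:
            hi = mid - 1               # length mid fails; try shorter
    return best


print(shared_motif(["GATTACA", "TAGACCA", "ATACA"]))  # TA
```

On the sample dataset this prints "TA", which has the same length as the sample output "AC"; both are valid, since any longest common substring may be returned.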

 
