Rosalind Problem 71: Genome Assembly with Perfect Coverage

Problem

A circular string is a string that does not have an initial or terminal element; instead, the string is viewed as a necklace of symbols. We can represent a circular string as a string enclosed in parentheses. For example, consider the circular DNA string (ACGTAC), and note that because the string "wraps around" at the end, this circular string can equally be represented by (CGTACA), (GTACAC), (TACACG), (ACACGT), and (CACGTA). The definitions of substrings and superstrings are easy to generalize to the case of circular strings (keeping in mind that substrings are allowed to wrap around).
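
The equivalence of rotations and the wrap-around rule for substrings can be illustrated with a short Python sketch. The function names `rotations` and `is_circular_substring` are hypothetical helpers introduced here for illustration; they are not part of the Rosalind problem.

```python
def rotations(s):
    """Return every linear representation of the circular string (s)."""
    return [s[i:] + s[:i] for i in range(len(s))]


def is_circular_substring(sub, circ):
    """Check whether sub occurs in the circular string (circ), allowing wrap-around.

    Assumes len(sub) <= len(circ): appending the first len(sub) - 1 symbols
    to the end exposes every substring that crosses the wrap point.
    """
    if len(sub) > len(circ):
        return False
    return sub in circ + circ[:len(sub) - 1]


if __name__ == "__main__":
    print(rotations("ACGTAC"))
    # "ACAC" only appears when reading across the end of "ACGTAC".
    print(is_circular_substring("ACAC", "ACGTAC"))
```

All six rotations describe the same necklace, which is why any one of them is an acceptable way to write the circular string.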

Given: A collection of (error-free) DNA k-mers (k ≤ 50) taken from the same strand of a circular chromosome. In this dataset, all k-mers from this strand of the chromosome are present, and their de Bruijn graph consists of exactly one simple cycle.

Return: A cyclic superstring of minimal length containing the reads (thus corresponding to a candidate cyclic chromosome).
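
To make the "exactly one simple cycle" condition concrete, here is a minimal Python sketch of the de Bruijn graph for such a dataset. It assumes the usual construction in which each k-mer contributes one edge from its (k-1)-mer prefix to its (k-1)-mer suffix, and it assumes no reverse complements are added because all reads come from the same strand. The names `de_bruijn_graph` and `is_single_simple_cycle` are hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(kmers):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes it points to."""
    graph = defaultdict(set)
    for kmer in set(kmers):          # ignore repeated reads
        graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def is_single_simple_cycle(graph):
    """True if following the unique outgoing edges visits every node once and returns."""
    if not graph or any(len(targets) != 1 for targets in graph.values()):
        return False
    start = next(iter(graph))
    node, seen = start, set()
    while node not in seen:
        seen.add(node)
        (node,) = graph[node]        # the single successor
        if node not in graph:        # dead end: no outgoing edge
            return False
    return node == start and len(seen) == len(graph)


if __name__ == "__main__":
    reads = ["ATTAC", "TACAG", "GATTA", "ACAGA", "CAGAT", "TTACA", "AGATT"]
    g = de_bruijn_graph(reads)
    print(g["ATTA"])                  # successor of the node ATTA
    print(is_single_simple_cycle(g))
```

When the check succeeds, every node has exactly one successor, so the chromosome can be read off by walking the cycle once.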

Sample Dataset

ATTAC
TACAG
GATTA
ACAGA
CAGAT
TTACA
AGATT

Sample Output

GATTACA
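
One possible way to obtain such an output is to walk the single cycle and record one symbol per k-mer. The sketch below is an illustration under the problem's guarantees (error-free reads, perfect coverage, one simple cycle), not necessarily the approach the original post used; `assemble_cycle` is a hypothetical name.

```python
def assemble_cycle(kmers):
    """Return one linear representation of the circular superstring.

    Assumes the de Bruijn graph of the k-mers is a single simple cycle,
    so each (k-1)-mer prefix identifies exactly one k-mer.
    """
    by_prefix = {kmer[:-1]: kmer for kmer in kmers}
    current = kmers[0]
    symbols = []
    # One step per distinct k-mer: emit its last symbol, then move to the
    # k-mer whose prefix equals the current k-mer's suffix.
    for _ in range(len(by_prefix)):
        symbols.append(current[-1])
        current = by_prefix[current[1:]]
    return "".join(symbols)


if __name__ == "__main__":
    reads = ["ATTAC", "TACAG", "GATTA", "ACAGA", "CAGAT", "TTACA", "AGATT"]
    result = assemble_cycle(reads)
    print(result)
    # Any rotation of GATTACA represents the same circular string.
    print(result in "GATTACA" * 2)
```

The string returned depends on which read the walk starts from, so it may be a rotation of GATTACA rather than GATTACA itself. Since the answer is a circular string, each rotation represents the same chromosome.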