Rosalind Problem 75: Encoding Suffix Trees

Problem

Figure 1. The suffix tree for s = GTCCGAAGCTCCGG. Note that the dollar sign has been appended to a substring of the tree to mark the end of s. Every path from the root to a leaf corresponds to a unique suffix of GTCCGAAGCTCCGG, and each leaf is labeled with the location in s of the suffix ending at that leaf.

Given a string s having length n, recall that its suffix tree T is defined by the following properties:

  • T is a rooted tree having exactly n leaves.
  • Every edge of T is labeled with a substring of s$, where s$ is the string formed by adding a placeholder symbol $ to the end of s.
  • Every internal node of T other than the root has at least two children; i.e., it has degree at least 3.
  • The substring labels for the edges leading down from a node to its children must begin with different symbols.
  • By concatenating the substrings along edges, each path from the root to a leaf corresponds to a unique suffix of s$.
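
The properties above can be turned almost directly into a small recursive construction: group the suffixes by their first symbol (so sibling edges start with different symbols), label each edge with the longest common prefix of its group (so every internal node branches at least twice), and recurse on what remains. The sketch below is only an illustration under the assumption that the input already ends with a unique `$`; the function names `suffix_tree` and `count_leaves` are made up for this example and do not come from the problem statement.

```python
import os.path  # os.path.commonprefix compares strings character by character


def suffix_tree(suffixes):
    """Return a nested dict {edge_label: subtree} for a list of suffixes.

    Assumes the suffixes come from a string ending in a unique '$',
    so that no suffix is a prefix of another one.
    """
    # Sibling edges must begin with different symbols: group by first symbol.
    groups = {}
    for suffix in suffixes:
        if suffix:  # an empty remainder means we have reached a leaf
            groups.setdefault(suffix[0], []).append(suffix)

    tree = {}
    for group in groups.values():
        # The edge label is the longest prefix shared by the whole group.
        label = os.path.commonprefix(group)
        tree[label] = suffix_tree([suffix[len(label):] for suffix in group])
    return tree


def count_leaves(tree):
    """Count leaves; a leaf is an edge whose subtree is empty."""
    return sum(count_leaves(sub) if sub else 1 for sub in tree.values())


s = "ATAAATG$"
tree = suffix_tree([s[i:] for i in range(len(s))])
print(sorted(tree))        # labels of the edges leaving the root
print(count_leaves(tree))  # one leaf per suffix
```

This version copies substrings and recurses once per tree level, so it takes quadratic time and memory, and a highly repetitive 1 kbp input may need a higher recursion limit (`sys.setrecursionlimit`). It is meant to mirror the definition rather than to be efficient.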

Figure 1 contains an example of a suffix tree.

Given: A DNA string s of length at most 1 kbp.

Return: The substrings of s$ encoding the edges of the suffix tree for s. You may list these substrings in any order.

Sample Dataset

ATAAATG$

Sample Output

AAATG$
G$
T
ATG$
TG$
A
A
AAATG$
G$
T
G$
$
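
Because the edges may be listed in any order, a candidate answer is best compared with the sample output as a multiset rather than line by line. The following is a minimal sketch of one possible solution, not the only one: it inserts each suffix in turn and splits an edge wherever the new suffix diverges from an existing label. The function name `suffix_tree_edges` is hypothetical, and the code assumes the input already ends with a unique `$`, as the sample dataset does.

```python
from collections import Counter


def suffix_tree_edges(text):
    """Naive quadratic construction; returns the list of edge labels.

    Assumes `text` ends with a unique terminator such as '$'.
    """
    # Each node maps the first symbol of an outgoing edge to [label, child].
    root = {}
    for start in range(len(text)):
        node, rest = root, text[start:]
        while rest:
            entry = node.get(rest[0])
            if entry is None:
                node[rest[0]] = [rest, {}]  # new leaf edge
                break
            label, child = entry
            # Length of the common prefix of the edge label and the suffix.
            k = 0
            while k < len(label) and k < len(rest) and label[k] == rest[k]:
                k += 1
            if k < len(label):
                # Split the edge: label[:k] -> new middle node -> label[k:].
                middle = {label[k]: [label[k:], child]}
                node[rest[0]] = [label[:k], middle]
                child = middle
            node, rest = child, rest[k:]

    # Collect all labels with an explicit stack (no recursion needed).
    edges, stack = [], [root]
    while stack:
        for label, child in stack.pop().values():
            edges.append(label)
            stack.append(child)
    return edges


sample_output = "AAATG$ G$ T ATG$ TG$ A A AAATG$ G$ T G$ $".split()
edges = suffix_tree_edges("ATAAATG$")
print("\n".join(edges))
print(Counter(edges) == Counter(sample_output))  # order-independent comparison
```

For inputs up to 1 kbp the quadratic cost is small; a linear-time algorithm such as Ukkonen's would only matter for much longer strings.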