Problem
Figure 1. The suffix tree for s = GTCCGAAGCTCCGG. Note that the dollar sign has been appended to a substring of the tree to mark the end of s. Every path from the root to a leaf corresponds to a unique suffix of GTCCGAAGCTCCGG, and each leaf is labeled with the location in s of the suffix ending at that leaf.
Given a string s having length n, recall that its suffix tree T is defined by the following properties:
- T is a rooted tree having exactly n leaves.
- Every edge of T is labeled with a substring of s*, where s* is the string formed by adding a placeholder symbol $ to the end of s.
- Every internal node of T other than the root has at least two children; i.e., it has degree at least 3.
- The substring labels for the edges leading down from a node to its children must begin with different symbols.
- By concatenating the substrings along edges, each path from the root to a leaf corresponds to a unique suffix of s*.
Figure 1 contains an example of a suffix tree.
Given: A DNA string s of length at most 1 kbp.
Return: The substrings of s* (the string s with $ appended) encoding the edges of the suffix tree for s. You may list these substrings in any order.
Sample Dataset
ATAAATG$
Sample Output
AAATG$
G$
T
ATG$
TG$
A
A
AAATG$
G$
T
G$
$
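One way to produce such a list is sketched below. It is a minimal illustration, not a reference solution: it inserts each suffix into a compressed trie one at a time and splits an edge wherever a new suffix diverges from an existing label. This takes quadratic time in the worst case, which should be acceptable for inputs of about 1 kbp. The function name `suffix_tree_edges` and the dict-based node layout are assumptions made for this example, and the input is assumed to already end with `$`, as the sample dataset does.

```python
def suffix_tree_edges(text):
    """Return the edge labels of the suffix tree of `text`.

    Assumes `text` already ends with the placeholder symbol '$'.
    """
    # A node is a dict mapping the first symbol of an outgoing edge
    # to a [label, child_node] pair.
    root = {}
    for start in range(len(text)):
        node, suffix = root, text[start:]
        while suffix:
            first = suffix[0]
            if first not in node:
                # No edge starts with this symbol: hang a new leaf here.
                node[first] = [suffix, {}]
                break
            label, child = node[first]
            # Length of the common prefix of the edge label and the suffix.
            k = 0
            while k < len(label) and k < len(suffix) and label[k] == suffix[k]:
                k += 1
            if k < len(label):
                # The suffix leaves the edge part-way: split the edge at k.
                middle = {label[k]: [label[k:], child]}
                node[first] = [label[:k], middle]
                child = middle
            node, suffix = child, suffix[k:]

    # Walk the finished tree and collect every edge label.
    edges, stack = [], [root]
    while stack:
        node = stack.pop()
        for label, child in node.values():
            edges.append(label)
            stack.append(child)
    return edges


if __name__ == "__main__":
    print("\n".join(suffix_tree_edges("ATAAATG$")))
```

Because the labels may be listed in any order, a result is best compared with the sample output as a sorted list (or multiset) rather than line by line.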