Problem
Clustal, developed by Des Higgins in 1988, was one of the first programs for multiple sequence alignment (MSA) and is still commonly used. The current version using the same approach is called ClustalW2, and it is embedded in many software packages. A variant of ClustalW2 called ClustalX provides a graphical user interface for MSA.
See the link below for a convenient online interface that runs Clustal on the EBI website:
Select "Protein" or "DNA", then either paste your sequences in one of the listed formats or upload an entire file. To obtain a more accurate alignment, leave "Alignment type: slow" selected. If you choose to run Clustal on only two sequences, the parameter options correspond to those in Needle (see “Pairwise Global Alignment”).
Given: Set of nucleotide strings in FASTA format.
Return: ID of the string most different from the others.
Sample Dataset
>Rosalind_18
GACATGTTTGTTTGCCTTAAACTCGTGGCGGCCTAGCCGTAAGTTAAG
>Rosalind_23
ACTCATGTTTGTTTGCCTTAAACTCTTGGCGGCTTAGCCGTAACTTAAG
>Rosalind_51
TCCTATGTTTGTTTGCCTCAAACTCTTGGCGGCCTAGCCGTAAGGTAAG
>Rosalind_7
CACGTCTGTTCGCCTAAAACTTTGATTGCCGGCCTACGCTAGTTAGTTA
>Rosalind_28
GGGGTCATGGCTGTTTGCCTTAAACCCTTGGCGGCCTAGCCGTAATGTTT
Sample Output
Rosalind_7
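
The problem is meant to be solved by running Clustal and inspecting the resulting alignment or guide tree. As a rough cross-check that needs no external tool, the sketch below scores each string by its summed pairwise edit distance to all the others and reports the ID with the largest total. This is a swapped-in heuristic, not what Clustal computes. The function names (parse_fasta, edit_distance, most_different) are hypothetical, and the parser assumes each FASTA header sits on its own line.

```python
from itertools import combinations


def parse_fasta(text):
    """Parse FASTA text into a dict mapping record ID -> sequence."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            current = line[1:].split()[0]
            records[current] = ""
        elif current is not None:
            records[current] += line
    return records


def edit_distance(a, b):
    """Levenshtein distance with unit costs, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion from a
                curr[j - 1] + 1,           # insertion into a
                prev[j - 1] + (ca != cb),  # match or substitution
            ))
        prev = curr
    return prev[-1]


def most_different(records):
    """Return the ID whose summed distance to all other records is largest.

    Assumption: "most different" is approximated by total pairwise edit
    distance, which may disagree with a Clustal guide tree in close cases.
    """
    totals = {name: 0 for name in records}
    for x, y in combinations(records, 2):
        d = edit_distance(records[x], records[y])
        totals[x] += d
        totals[y] += d
    return max(totals, key=totals.get)
```

To try it, pass the contents of the dataset file to parse_fasta and print most_different(records). The result should be compared against the Sample Output above and against the Clustal alignment itself, since the two approaches are not guaranteed to agree.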