Feature conversion: .fasta -> .numerical
Convert data in .fasta format into numerical data.
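The .fasta format
In bioinformatics, the FASTA format (also known as the Pearson format) is a text-based format for representing nucleotide or amino acid sequences. Each nucleotide or amino acid is encoded as a single letter, and a sequence name and comments may be placed before the sequence. [Baidu Baike]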
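To make the layout concrete, here is a minimal sketch of a FASTA reader in plain Python. It assumes the common convention that a header line starts with ">" and that the sequence may wrap over several lines. The function name parse_fasta and the two sample records are hypothetical and are not taken from the dataset.

```python
from typing import Iterable, List, Tuple


def parse_fasta(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines are joined, so wrapped sequences are supported.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical example input, for illustration only.
example = """>seq1 made-up sample
ACGUACGU
ACGU
>seq2 made-up sample
GGCAUUCG
"""
records = parse_fasta(example.splitlines())
for name, seq in records:
    print(name, len(seq), seq)
```

For real projects, an established parser such as Biopython's SeqIO is usually a safer choice than a hand-written reader.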
The RNA_m5c dataset
Supporting Information S1. The benchmark dataset consists of a positive dataset and a negative dataset. The former contains 120 true m5C site containing sequences with the m5C site in the center, while the latter contains 120 false m5C site containing sequences. Each of these segments is 41-bp long.
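One possible way to turn such 41-nucleotide segments into numbers is one-hot encoding. The sketch below is an assumption, since the text above does not say which encoding is used. The alphabet order "ACGU", the mapping of T to U, the helper names one_hot and build_dataset, and the toy sequences are all hypothetical.

```python
from typing import List, Sequence, Tuple

ALPHABET = "ACGU"  # assumed RNA alphabet and column order
INDEX = {base: i for i, base in enumerate(ALPHABET)}


def one_hot(sequence: str) -> List[List[int]]:
    """Encode each nucleotide as a 4-element 0/1 row (all zeros if unknown)."""
    rows = []
    for base in sequence.upper().replace("T", "U"):  # assumption: treat T as U
        row = [0] * len(ALPHABET)
        if base in INDEX:
            row[INDEX[base]] = 1
        rows.append(row)
    return rows


def build_dataset(
    positives: Sequence[str], negatives: Sequence[str], length: int = 41
) -> Tuple[List[List[int]], List[int]]:
    """Flatten one-hot rows into feature vectors; label 1 = positive, 0 = negative."""
    features, labels = [], []
    for label, group in ((1, positives), (0, negatives)):
        for seq in group:
            if len(seq) != length:
                raise ValueError(f"expected {length} nucleotides, got {len(seq)}")
            features.append([bit for row in one_hot(seq) for bit in row])
            labels.append(label)
    return features, labels


# Toy sequences (not real data): 41 nt each, with a C at the center position.
toy_positive = "A" * 20 + "C" + "G" * 20
toy_negative = "U" * 20 + "C" + "U" * 20
features, labels = build_dataset([toy_positive], [toy_negative])
print(len(features), len(features[0]), labels)
```

Each sequence becomes a vector of 41 x 4 = 164 values. With the full benchmark described above, the same call would produce 240 such vectors, 120 labelled 1 and 120 labelled 0.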