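In the genome protein sequences downloaded from NCBI, the annotation information is mixed in with the sequences. I wrote the following script to clean them up, keeping only the sequence name in each header.

library(seqinr)

# Read in the genome's protein sequences.
# seqtype = "AA" marks the input as protein; the default ("DNA") would lowercase the residues.
all_pep <- read.fasta("GCA_014462685.1_ASM1446268v1_cds_from_genomic.faa", seqtype = "AA")

# Write each sequence back out under its short name only (the "name" attribute).
# open = "a" appends, so delete any existing output file before re-running the loop.
for (i in seq_along(all_pep)) {
  write.fasta(all_pep[[i]], attr(all_pep[[i]], "name"), "Rhynchophorus ferrugineus_cds.fas", open = "a")
}

# Re-read the file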
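As a possible alternative to the per-sequence loop, `write.fasta` also accepts a whole list of sequences together with a vector of names, so the cleaned file can be written in one call and is overwritten rather than appended to on each run. The sketch below is only an illustration: the two-record toy input, its header text, and the temporary file names are made up for the example and are not taken from the NCBI download.

```r
library(seqinr)

# Hypothetical toy input standing in for the NCBI file (headers carry extra annotation).
in_file <- tempfile(fileext = ".faa")
writeLines(c(
  ">lcl|TOY_prot_1 [gene=a] [protein=hypothetical protein]",
  "MKTAYIAKQR",
  ">lcl|TOY_prot_2 [gene=b] [protein=hypothetical protein]",
  "MSLLTEVETY"
), in_file)

# Read as protein; names(pep) holds the first word of each header.
pep <- read.fasta(in_file, seqtype = "AA")

# Write all sequences at once, using only the short names as headers.
out_file <- tempfile(fileext = ".fas")
write.fasta(sequences = pep, names = names(pep), file.out = out_file)

# Inspect the cleaned file.
cleaned <- readLines(out_file)
print(cleaned)
```

Because the default `open = "w"` is used, re-running this does not duplicate records in the output the way repeated appends would.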