调用命令:python cal_chrom_length.py hg19.fasta hg19.len
import sys
import os
fasta_path = sys.argv[1]
len_path = sys.argv[2]
temp_path = len_path + ".temp"
fw = open(len_path, "w+")
os.system("fastalength " + fasta_path + " > " + temp_path)
with open(temp_path) as f:
lines = f.readlines()
for line in lines:
cols = line.strip("\n").split()
fw.write("\t".join([cols[1], cols[0]]) + "\n")
fw.close()
os.remove(temp_path)
输出的hg19.len文件的结构是:染色体名称<TAB>长度,适用于下游的genomeCoverageBed、bedGraphToBigWig等分析。