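Sometimes the gene IDs or species IDs in a phylogenetic tree need to be modified in bulk. This can be done in Perl or Python by combining regular expressions with search-and-replace. Example code follows:
Python implementation: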
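import re

tree = "(chicken,((mouse,rat),(chimp,human)));"

def mark(match):
    # Append "#1" to a label unless it is purely numeric.
    name = match.group(0)
    return name if re.fullmatch(r"\d+", name) else name + "#1"

# A single re.sub pass matches whole labels, so a name that is contained in
# another name, or that occurs more than once, is not modified twice.
new_tree = re.sub(r"[a-zA-Z0-9_]+", mark, tree)
with open("newtree.nwk", "w") as f:
    f.write(new_tree + "\n")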
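When the goal is to replace IDs with other IDs rather than to append a suffix, the same regular-expression pass can look each label up in a dictionary. The following is a minimal sketch, not a definitive implementation: the function name `rename_labels` and the contents of `id_map` are hypothetical and chosen only for illustration, and the label pattern is assumed to be the same `[a-zA-Z0-9_]+` used above.

```python
import re

def rename_labels(tree, mapping):
    """Return the tree string with every label found in mapping replaced."""
    def swap(match):
        label = match.group(0)
        # Labels that are missing from the mapping are kept unchanged.
        return mapping.get(label, label)
    return re.sub(r"[a-zA-Z0-9_]+", swap, tree)

# Hypothetical old-ID to new-ID table, for illustration only.
id_map = {"chimp": "Pan_troglodytes", "human": "Homo_sapiens"}
tree = "(chicken,((mouse,rat),(chimp,human)));"
print(rename_labels(tree, id_map))
```

Because each label is matched as a whole, an ID that happens to be a substring of another ID is left alone. In practice the mapping could be read from a two-column table instead of being written inline.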
Perl implementation:
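use strict;
use warnings;

my $tree = "(chicken,((mouse,rat),(chimp,human)));";
my $out  = "newtree.nwk";
open(my $fh, ">", $out) or die "can't open $out: $!\n";
# Append "#1" to every label that is not purely numeric. The lookahead
# (?!\d+\b) skips all-digit tokens, and the single /g pass matches whole
# labels, so a name contained in another name is not modified twice.
$tree =~ s/\b(?!\d+\b)([a-zA-Z0-9_]+)/$1#1/g;
print $fh "$tree\n";
close($fh);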