Structure of the chromosome file:
>Chr1
nnnnnnnnnnnnnnnnnnnnnnnnnnnn
>Chr2
ATGCATGC
Here is the program:
use strict;
use warnings;
my $dna_filename;
my $DNA0='';
my $DNA1='';
my $DNA2='';
my $DNA3='';
my $DNA4='';
my $DNA5='';
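# Ask for the input path; a path typed at the prompt needs no doubled backslashes.
print "please input the path just like this f:\\perl\\data.txt\n";
chomp($dna_filename=<STDIN>);
open(DNAFILENAME,'<',$dna_filename)||die("can not open the file!");
# Read up to the next ">" on each read instead of one line at a time.
$/=">";
# The first read returns only the ">" that starts the file.
$DNA0=<DNAFILENAME>;
open (DNA0,'>',"d:\\19\\DNA0.txt")||die("can not open DNA0.txt");
print DNA0 $DNA0;
close (DNA0);
# Each later read ends with the ">" of the next header; chomp removes it.
chomp($DNA1=<DNAFILENAME>);
open (DNA1,'>',"d:\\19\\DNA1.txt")||die("can not open DNA1.txt");
print DNA1 ">".$DNA1;
close (DNA1);
chomp($DNA2=<DNAFILENAME>);
open (DNA2,'>',"d:\\19\\DNA2.txt")||die("can not open DNA2.txt");
print DNA2 ">".$DNA2;
close (DNA2);
chomp($DNA3=<DNAFILENAME>);
open (DNA3,'>',"d:\\19\\DNA3.txt")||die("can not open DNA3.txt");
print DNA3 ">".$DNA3;
close (DNA3);
chomp($DNA4=<DNAFILENAME>);
open (DNA4,'>',"d:\\19\\DNA4.txt")||die("can not open DNA4.txt");
print DNA4 ">".$DNA4;
close (DNA4);
chomp($DNA5=<DNAFILENAME>);
open (DNA5,'>',"d:\\19\\DNA5.txt")||die("can not open DNA5.txt");
print DNA5 ">".$DNA5;
close (DNA5);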
Update, February 23, 2013
The program above is clearly cumbersome. I needed it again, so I made a few simple changes. This version runs on Linux:
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;
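my $i;
my $DNA;
open(IN,'<',"IRGSP-1.0_genome.fasta")||die("can not open");
# Use ">" as the record separator so each read returns one FASTA record.
$/=">";
# 13 reads: the empty chunk before the first ">" plus one read per chromosome.
for($i=0;$i<13;$i++)
{
    $DNA=<IN>;
    chomp($DNA);    # drop the trailing ">" that belongs to the next record
    open(OUT,'>',"chr$i")||die("can not open");
    print OUT ">$DNA";
    close OUT;
}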
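The loop above runs one more time than there are chromosomes because the first read with ">" as the record separator returns only the text before the first header. The following is a minimal sketch of that behaviour, not part of the original program: it reads sample data modelled on the file at the top of this post from an in-memory string. The helper name read_chunks is hypothetical.

```perl
use strict;
use warnings;

# Sample data modelled on the two-chromosome file shown at the top of the post.
my $fasta = ">Chr1\nnnnnnnnnnnnnnnnnnnnnnnnnnnnn\n>Chr2\nATGCATGC\n";

# read_chunks is a hypothetical helper: it returns every chunk that
# readline produces when ">" is the input record separator.
sub read_chunks {
    my ($text) = @_;
    open(my $fh, '<', \$text) or die "can not open in-memory file: $!";
    local $/ = ">";
    my @chunks;
    while (defined(my $chunk = <$fh>)) {
        chomp $chunk;    # removes the trailing ">" when one is present
        push @chunks, $chunk;
    }
    close $fh;
    return @chunks;
}

my @chunks = read_chunks($fasta);
printf "%d chunks\n", scalar @chunks;
for my $i (0 .. $#chunks) {
    # An empty chunk has no lines, so fall back to an empty string.
    my ($first_line) = split /\n/, $chunks[$i];
    printf "chunk %d starts with '%s'\n", $i, defined $first_line ? $first_line : '';
}
```

For two chromosomes this yields three chunks: an empty one, then one starting with Chr1 and one starting with Chr2. That is why the loop bound has to be the number of chromosomes plus 1.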
I had some free time recently and revised it again (2013-07-16).
Here the upper bound of i is the number of chromosomes plus 1:
use strict;
use warnings;
use utf8;
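open(IN,'<',"a.txt")||die("can not open");
# Use ">" as the record separator so each read returns one FASTA record.
$/=">";
for(my $i=0;$i<5;$i++)
{
    my $DNA=<IN>;
    last unless defined $DNA;      # the file had fewer records than expected
    chomp($DNA);                   # drop the trailing ">" of the next record
    next unless $DNA=~m/(.+)/;     # skip the empty chunk before the first ">"
    my $name=$1;                   # header line, used as the output file name
    open(OUT,'>',"$name.txt")||die("can not open");
    print OUT ">$DNA";
    close OUT;
}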
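The hard-coded loop bound can be avoided by reading until the end of the file. Below is a rough sketch under a few assumptions of mine: the helper name split_fasta and its two parameters are hypothetical, the first word of each header is assumed to be safe to use as a file name, and the output directory is assumed to exist already.

```perl
use strict;
use warnings;
use File::Spec;

# split_fasta is a hypothetical helper, not part of the programs above.
# It writes each record of $in_file to "<first header word>.txt" inside
# $out_dir and returns the list of files it wrote.
sub split_fasta {
    my ($in_file, $out_dir) = @_;
    open(my $in, '<', $in_file) or die "can not open $in_file: $!";
    local $/ = ">";    # one FASTA record per read
    my @written;
    while (defined(my $record = <$in>)) {
        chomp $record;                       # strip the trailing ">"
        next unless $record =~ /\A(\S+)/;    # skips the empty first chunk
        my $name = $1;                       # first word of the header line
        my $out_file = File::Spec->catfile($out_dir, "$name.txt");
        open(my $out, '>', $out_file) or die "can not open $out_file: $!";
        print {$out} ">$record";
        close $out;
        push @written, $out_file;
    }
    close $in;
    return @written;
}

# Example call (commented out so the block has no side effects when loaded):
# my @files = split_fasta("a.txt", ".");
```

Because the loop stops at end of file, the same code works for any number of chromosomes without changing a counter.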