Extraction, split, and Perl scripting practice
1. Script target: extract gene names
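#!/usr/bin/perl
use strict;
use warnings;
# Usage: perl target <gene file> <gene ID prefix> <output name>
my $target  = shift;    # target gene file
my $pattern = shift;    # species-specific gene ID prefix, e.g. orange
my $name    = shift;    # output name (written as $name.txt)
my $out = "/data/00/user/user159/wm/OGgene/dataAnalysis/$name.txt";
open my $O,  '>', $out    or die "Cannot write $out: $!";
open my $F1, '<', $target or die "Cannot read $target: $!";
while (<$F1>) {
    chomp;
    if (/\Q$pattern\E/) {      # keep only lines containing the prefix
        my @n = split /\|/;    # fields are separated by '|'
        print $O "$n[1]\n";    # the gene name is the second field
    }
}
close $F1;
close $O;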
As shown above, this extracts gene names of the form orange1.1gxxxxxxm.
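For a quick check without the Perl script, the same filter-and-split idea can be sketched in shell with awk. This is only an illustration: the function name extract_genes and the two sample lines are made up, and it assumes (as the script does) that fields are separated by '|' and the gene name is the second field.

```shell
# Print the second '|'-separated field of every line that contains the prefix.
# $1 = prefix (matched as a literal string), input is read from stdin.
extract_genes() {
  awk -F'|' -v p="$1" 'index($0, p) { print $2 }'
}

# Made-up sample lines; the real *.OG.gene layout may differ.
printf '%s\n' 'sin|orange1.1g000001m' 'lyc|Solyc01g000001' | extract_genes orange
```

Using index() instead of a regular expression keeps characters such as "." in a prefix from being treated as wildcards.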
Batch extraction
#!/usr/bin/bash
perl target sin.CONSTRACT.OG.gene orange sin_CON
perl target lyc.CONSTRACT.OG.gene Solyc lyc_CON
perl target csa.CONSTRACT.OG.gene Cucsa csa_CON
perl target lan.CONSTRACT.OG.gene Cla lan_CON
perl target cha.CONSTRACT.OG.gene XP cha_CON
perl target sat.CONSTRACT.OG.gene LOC sat_CON
perl target mel.CONSTRACT.OG.gene MELO mel_CON
perl target ana.CONSTRACT.OG.gene Fvb ana_CON
perl target mos.CONSTRACT.OG.gene CmoC mos_CON
perl target sin.EXPANSION.OG.gene orange sin_EXP
perl target lyc.EXPANSION.OG.gene Solyc lyc_EXP
perl target csa.EXPANSION.OG.gene Cucsa csa_EXP
perl target lan.EXPANSION.OG.gene Cla lan_EXP
perl target cha.EXPANSION.OG.gene XP cha_EXP
perl target sat.EXPANSION.OG.gene LOC sat_EXP
perl target mel.EXPANSION.OG.gene MELO mel_EXP
perl target ana.EXPANSION.OG.gene Fvb ana_EXP
perl target mos.EXPANSION.OG.gene CmoC mos_EXP
perl target sic.EXPANSION.OG.gene Lsi sic_EXP
perl target vin.EXPANSION.OG.gene VIT vin_EXP
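The repeated commands above could also be generated by a loop, which makes typos in file names less likely. The following is a minimal sketch under some assumptions: the helper names prefix_for and list_commands are hypothetical, the species/prefix pairs are copied from the commands above, and input files that do not exist are simply skipped. It only prints the commands (a dry run) rather than running them.

```shell
# Gene ID prefix for each species code (pairs copied from the commands above).
prefix_for() {
  case $1 in
    sin) echo orange ;; lyc) echo Solyc ;; csa) echo Cucsa ;;
    lan) echo Cla ;;    cha) echo XP ;;    sat) echo LOC ;;
    mel) echo MELO ;;   ana) echo Fvb ;;   mos) echo CmoC ;;
    sic) echo Lsi ;;    vin) echo VIT ;;
  esac
}

# Print one "perl target ..." command per input file that exists in directory $1.
list_commands() {
  local dir=$1 pair kind tag sp file
  for pair in CONSTRACT:CON EXPANSION:EXP; do
    kind=${pair%%:*}    # file name part, e.g. CONSTRACT
    tag=${pair##*:}     # output suffix, e.g. CON
    for sp in sin lyc csa lan cha sat mel ana mos sic vin; do
      file="$dir/$sp.$kind.OG.gene"
      [ -f "$file" ] || continue    # skip species without this file
      echo "perl target $file $(prefix_for "$sp") ${sp}_$tag"
    done
  done
}

# Dry run in the current directory; review the output before executing it.
list_commands .
```

Once the printed commands look right, they can be executed with `list_commands . | bash`.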
2. Get GO annotations from the annotation files
awk -F"\t" '{print $3,$10}' /data/00/user/user159/wm/OGgene/phytozome/Fxananassa/v1.0.a1/annotation/Fxananassa_675_v1.0.a1.annotation_info.txt >Fxananassa_annotation.txt
awk -F"\t" '{print $3,$10}' /data/00/user/user159/wm/OGgene/phytozome/Osativa/v7.0/annotation/Osativa_323_v7.0.annotation_info.txt >Osativa_annotation.txt
awk -F"\t" '{print $3,$10}' /data/00/user/user159/wm/OGgene/phytozome/phyto_mirror/Csinensis_154_v1.1/annotation/Csinensis_154_annotation_info.txt >Csinensis.txt
awk -F"\t" '{print $3,$10}' /data/00/user/user159/wm/OGgene/phytozome/Slycopersicum/ITAG3.2/annotation/Slycopersicum_514_ITAG3.2.annotation_info.txt >Slycopersicum.txt
awk -F"\t" '{print $3,$10}' /data/00/user/user159/wm/OGgene/phytozome/Vvinifera/v2.1/annotation/Vvinifera_457_v2.1.annotation_info.txt >Vvinifera.txt
awk -F"\t" '{print $1,$2}' /data/00/user/user159/wm/OGgene/dataAnalysis/CM3.6.1_GO_anno.txt >CMel3.6.1_anno.txt
awk -F"\t" '{print $1,$2}' /data/00/user/user159/wm/OGgene/dataAnalysis/Lsiceraria_GO_anno.txt >Lsiceraria_anno.txt
awk -F"\t" '{print $1,$4}' /data/00/user/user159/wm/OGgene/dataAnalysis/97103_gene_anno_v2.txt >97103_anno.txt
awk -F"\t" '{print $1,$2}' /data/00/user/user159/wm/OGgene/dataAnalysis/Cmoschata_GOanno_v1.txt >Cmoschata_anno.txt
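The awk commands above print the two selected columns separated by a space, which is awk's default output separator. If the GO column itself may contain spaces, writing a tab instead keeps the two fields unambiguous. A small sketch, with a hypothetical helper pick_cols and a made-up input line:

```shell
# Print two chosen columns of tab-separated input, joined by a tab.
# $1 = gene ID column number, $2 = GO column number; input is read from stdin.
pick_cols() {
  awk -F'\t' -v OFS='\t' -v a="$1" -v b="$2" '{ print $a, $b }'
}

# Made-up line with four tab-separated columns; pick columns 3 and 4.
printf 'x\ty\tgeneA\tGO:0008150\n' | pick_cols 3 4
```

Whether tab-separated output is appropriate depends on how target_Go.perl splits its input, which is not shown here.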
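Log: 2021.11.19
Extract the GO annotations for the target genes obtained above.
Problem: the target gene names differ slightly from the gene names in the existing GO annotation files, so they need to be adjusted before matching.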
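One way to handle such a mismatch is to normalize both sets of names before comparing them. The sketch below is only an assumption about what the difference might look like: it treats a trailing ".<digits>" suffix as the part to ignore, and the helper name match_go and the example IDs in the comments are hypothetical. The real adjustment depends on the actual naming in each species.

```shell
# Print annotation lines whose gene ID matches a target ID after removing
# a trailing ".<digits>" suffix from both (an assumed kind of difference).
# $1 = target gene list (one ID per line)
# $2 = annotation file (gene ID in the first whitespace-separated column)
match_go() {
  awk '
    function norm(id) { sub(/\.[0-9]+$/, "", id); return id }
    NR == FNR { want[norm($1)] = 1; next }      # first file: remember target IDs
    { id = norm($1); if (id in want) print }    # second file: print matches
  ' "$1" "$2"
}

# Usage (file names taken from the steps in this post):
# match_go sin_CON.txt Csinensis_annotation.txt
```

Note that with an empty target list the NR == FNR test no longer distinguishes the two files, so it is worth checking that the list is non-empty first.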
3. Extract GO annotations with target_Go.perl
#!/usr/bin/bash
perl target_Go.perl Csinensis_annotation.txt sin_CON.txt sin_CON
perl target_Go.perl Csinensis_annotation.txt sin_EXP.txt sin_EXP
perl target_Go.perl Cmoschata_anno.txt mos_CON.txt mos_CON
perl target_Go.perl Cmoschata_anno.txt mos_EXP.txt mos_EXP
perl target_Go.perl 97103_anno.txt lan_CON.txt lan_CON
perl target_Go.perl 97103_anno.txt lan_EXP.txt lan_EXP
perl target_Go.perl Fxananassa_annotation.txt ana_CON.txt ana_CON
perl target_Go.perl Fxananassa_annotation.txt ana_EXP.txt ana_EXP
perl target_Go.perl Osativa_annotation.txt sat_CON.txt sat_CON
perl target_Go.perl Osativa_annotation.txt sat_EXP.txt sat_EXP
perl target_Go.perl Slycopersicum_annotation.txt lyc_CON.txt lyc_CON
perl target_Go.perl Slycopersicum_annotation.txt lyc.EXP.txt lyc_EXP
perl target_Go.perl Vvinifera_annotation.txt vin.EXP.txt vin_EXP
perl target_Go.perl Lsiceraria_anno.txt sic_EXP.txt sic_EXP
perl target_Go.perl CMel361_anno.txt mel_CON.txt mel_CON
perl target_Go.perl CMel361_anno.txt mel_EXP.txt mel_EXP
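Log: a few file names in the commands above were wrong, so the corresponding output files came out empty. For example, lyc.EXP.txt and vin.EXP.txt are passed where step 1 uses the names lyc_EXP and vin_EXP, CMel361_anno.txt does not match the CMel3.6.1_anno.txt written in step 2, and Csinensis.txt, Slycopersicum.txt and Vvinifera.txt from step 2 are referenced here with an _annotation.txt suffix.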
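A wrong file name is easy to miss when a script does not check whether its input could be opened. One possible guard, sketched here with a hypothetical helper check_inputs, is to verify that every input file exists and is non-empty before calling the Perl script:

```shell
# Report every argument that is not an existing, non-empty file.
# Returns non-zero if at least one file is missing or empty.
check_inputs() {
  local f status=0
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      status=1
    fi
  done
  return $status
}

# Example guard for one call from the batch script:
# check_inputs Slycopersicum_annotation.txt lyc_EXP.txt &&
#   perl target_Go.perl Slycopersicum_annotation.txt lyc_EXP.txt lyc_EXP
```

With this guard, a mistyped name produces an error message on stderr instead of a silently empty result file.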