Identification of genes related to fruiting body morphology

Functional annotation of carbohydrate-active enzymes (CAZymes)

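Hidden Markov model (HMM) profiles for each carbohydrate-active enzyme (CAZyme) family were downloaded from the dbCAN database (Yin et al., 2012). CAZymes were annotated with hmmscan (Eddy, 2009): the HMM profiles served as the search target, and the file containing each fungal proteome served as the query. The raw output was then processed with the hmmscan-parser script provided by dbCAN.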
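The search described above can be written as a command list in the same style used later in these notes. This is only a sketch under assumptions, not the exact commands that were run: the HMM file name, the proteome folder and the species list are placeholders, the function name is hypothetical, and the HMM file has to be indexed once with `hmmpress` before `hmmscan` can use it.

```shell
# Sketch: print one "hmmscan + hmmscan-parser" command per species.
# Assumed inputs (not from the original notes): $1 = dbCAN HMM file,
# $2 = folder holding <species>.fasta proteomes, $3 = species list.
make_cazyme_commands() {
    local hmm_db="$1" proteome_dir="$2" species_list="$3"
    local sp
    while IFS= read -r sp; do
        [ -n "$sp" ] || continue   # skip empty lines in the list
        # --domtblout writes the per-domain table that hmmscan-parser.sh reads
        echo "hmmscan --domtblout $sp.out.dm $hmm_db $proteome_dir/$sp.fasta > $sp.out && sh hmmscan-parser.sh $sp.out.dm > $sp.out.dm.ps"
    done < "$species_list"
}

# Usage (after running hmmpress once on the HMM file):
# make_cazyme_commands dbCAN-fam-HMMs.txt proteomes species.txt > command.CAZy.list
# ParaFly -c command.CAZy.list -CPU 48
```

Printing the commands first, rather than running them directly, keeps the step compatible with ParaFly and makes the list easy to inspect before launching.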

Functional annotation of protein kinases

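Proteins were aligned against the KinBase database (http://kinase.com/kinbase/) with BLASTP, and hits were filtered at an e-value < 1e-10. A Perl package was then used to tally the number of genes in each protein family, giving the count matrix for the FUNCAT database.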
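The e-value filter can also be applied to a tabular result with awk. The sketch below assumes the standard 12-column BLAST tabular layout with the e-value in column 11, which may differ from the table written by the Perl parser used in these notes; the function name is illustrative.

```shell
# Sketch: keep rows whose e-value (column 11, an assumption) is below a cutoff.
filter_by_evalue() {
    local cutoff="$1" file="$2"
    # "+ 0" forces a numeric comparison, so values such as 3e-25 compare correctly
    awk -F'\t' -v c="$cutoff" '($11 + 0) < (c + 0)' "$file"
}

# Usage: filter_by_evalue 1e-10 species.KinBase.tsv > species.KinBase.filtered.tsv
```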

KinBase: Kinase Database at Manning's Group

The Mushroom Kinome

# Location of the downloaded KinBase database files
/media/aa/DATA/SZQ2/bj_software/KinBase/C.cinerea_kin_dom.fasta
/media/aa/DATA/SZQ2/bj_software/KinBase/C.cinerea_AAprotein.fasta
# Work in the pfam_scan conda environment
conda activate pfam_scan
# Build the DIAMOND database
diamond makedb --in /media/aa/DATA/SZQ2/bj_software/KinBase/Ccinerea_kin_dom.fasta -d Ccinerea_kin_dom
# Create a working folder
mkdir 18.KinBase && cd 18.KinBase
# 1) diamond blastp with an e-value cutoff of 1e-10
for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "diamond blastp --db /media/aa/DATA/SZQ2/bj_software/KinBase/Ccinerea_kin_dom.fasta --query /media/aa/Expansion/szq2/bj/b.OrthoFinder/compliantFasta4/compliantFasta/$i.fasta --out $i.KinBase10.xml --outfmt 5 --sensitive --max-target-seqs 20 --evalue 1e-10 --id 20 --tmpdir /dev/shm --index-chunks 1"
done > command.KinBase.list
ParaFly -c command.KinBase.list -CPU 48
# Create a folder for the parsed tables
mkdir KinBase10.tab && cd KinBase10.tab
# 2) Parse the XML output with parsing_blast_result.pl
for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "/media/aa/DATA2/bin/parsing_blast_result.pl --evalue 1e-10 --HSP-num 1 --out-hit-confidence --suject-annotation ../$i.KinBase10.xml > $i.KinBase10.tab"
done > command.KinBase.list
ParaFly -c command.KinBase.list -CPU 48
# Work in the jcvi conda environment
conda activate jcvi
# 3) Keep the best subject for each query in the alignment results
for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "python -m jcvi.formats.blast best -n 1 $i.KinBase10.tab"
done > command.jcvi.list
ParaFly -c command.jcvi.list -CPU 48
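# 4) Copy the best-hit files into a separate folder
mkdir best && cd best
cp ../*.KinBase10.tab.best ./
# Count the lines in each file; the line count minus 1 is the total number of annotated hits
wc -l *.KinBase10.tab.best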
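To collect these counts into one table instead of reading them off the `wc -l` output, a small loop can subtract the header line for each file. This is a sketch that relies on the one-header-line-per-file assumption stated above; the function name and the output file name are hypothetical.

```shell
# Sketch: print "<file name><TAB><hit count>" for each best-hit file,
# assuming exactly one header line per file.
count_hits() {
    local f n
    for f in "$@"; do
        n=$(( $(wc -l < "$f") ))
        # an empty file has no header to subtract, so report 0 rather than -1
        if [ "$n" -gt 0 ]; then
            n=$((n - 1))
        fi
        printf '%s\t%d\n' "$(basename "$f")" "$n"
    done
}

# Usage: count_hits *.KinBase10.tab.best > KinBase.counts.tsv
```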

Functional annotation of tubulins

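HMM profiles built from the seed sequences of the tubulin family (PF00091.20) and the Misato family (PF10644_misato) were downloaded from the Pfam database. Sequences were identified with hmmscan, and the candidate tubulin sequences were then assigned to families by phylogenetic analysis.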

In practice, the hits were filtered directly from the earlier Pfam results.

Pfam: Family: Tubulin (PF00091)

Family: Misat_Tub_SegII (PF10644)

Family: NAD_binding_10 (PF13460)

Family: Tubulin_2 (PF13809)

Family: Tubulin_3 (PF14881)

Family: Tubulin_C (PF03953)

Name             Accession    Description
GCP_C_terminal   PF04130.16   Gamma tubulin complex component C-terminal
GCP_N_terminal   PF17681.4    Gamma tubulin complex component N-terminal
MOZART1          PF12554.11   Mitotic-spindle organizing gamma-tubulin ring associated
TBCA             PF02970.19   Tubulin binding cofactor A
TBCC             PF07986.15   Tubulin binding cofactor C
TBCC_N           PF16752.8    Tubulin-specific chaperone C N-terminal domain
TFCD_C           PF12612.11   Tubulin folding cofactor D C terminal
TTL              PF03133.18   Tubulin-tyrosine ligase family
Tubulin          PF00091.28   Tubulin/FtsZ family, GTPase domain
Tubulin_3        PF14881.9    Tubulin domain
Tubulin_C        PF03953.20   Tubulin C-terminal domain
Misat_Tub_SegII  PF10644.12   Misato Segment II tubulin-like domain
NAD_binding_10   PF13460.9    NAD_binding_10
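Pulling these families out of an existing Pfam result table can be done with a short awk lookup. The sketch below assumes a whitespace-delimited table with the Pfam accession in column 6, which is the layout written by `pfam_scan.pl`; the column index is a parameter because the format of the earlier results is not shown here. The accession list file and the function name are hypothetical, and the list must not be empty.

```shell
# Sketch: keep Pfam hits whose accession (version suffix ignored) appears in
# an accession file with one entry per line, e.g. PF00091.
# $1 = accession list, $2 = Pfam result table, $3 = accession column (default 6)
filter_pfam_hits() {
    local acc_file="$1" pfam_result="$2" col="${3:-6}"
    awk -v col="$col" '
        # first file: remember each wanted accession without its version
        NR == FNR { sub(/\..*/, "", $1); if ($1 != "") wanted[$1] = 1; next }
        # second file: skip comment lines and blank lines
        /^#/ || NF == 0 { next }
        # strip the version (PF00091.28 becomes PF00091) and test membership
        { acc = $col; sub(/\..*/, "", acc); if (acc in wanted) print }
    ' "$acc_file" "$pfam_result"
}

# Usage: filter_pfam_hits tubulin_accessions.txt species.pfam.txt > species.tubulin.txt
```

Matching on the accession without its version number avoids missing hits when the Pfam release used for the scan differs from the one listed above.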

Genomic structure analysis of the mating-type (MAT) loci

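Homeodomain transcription factor genes (HD) and pheromone/receptor genes were identified using the Non-Redundant Protein Database (NR) (https://www.ncbi.nlm.nih.gov/protein/), Swiss-Prot (https://www.uniprot.org/) and Pfam (http://pfam.xfam.org/) databases, and the genes flanking the MAT loci were further identified by BLAST searches. The following sequences, downloaded from NCBI, were used as queries.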

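Queries used to identify the mitochondrial intermediate peptidase gene (mip):

XP_007265184.1 from Fomitiporia mediterranea

XP_003038723.1 from Schizophyllum commune

XP_036634433.1 from the oyster mushroom Pleurotus ostreatus

XP_007325204.1 from the button mushroom Agaricus bisporus

XP_008032819.1 from the turkey tail Trametes versicolor

Queries used to identify the beta-flanking gene (β-fg):

XP_009540982.1 from Heterobasidion irregulare

XP_006454075.1 from the button mushroom Agaricus bisporus

XP_001829147.2 from Coprinopsis cinerea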

# Location of the downloaded MAT query sequences
/media/aa/DATA/SZQ2/bj_software/MAT/mip/
XP_003038723_1.fasta  XP_007265184_1.fasta  XP_007325204_1.fasta  XP_008032819_1.fasta  XP_036634433_1.fasta
/media/aa/DATA/SZQ2/bj_software/MAT/βfg/
XP_001829147_2.fasta  XP_006454075_1.fasta  XP_009540982_1.fasta
# Work in the pfam_scan conda environment
conda activate pfam_scan
# Build one DIAMOND database per query sequence
diamond makedb --in /media/aa/DATA/SZQ2/bj_software/MAT/mip/XP_003038723_1.fasta -d XP_003038723_1
diamond makedb --in /media/aa/DATA/SZQ2/bj_software/MAT/mip/XP_007265184_1.fasta -d XP_007265184_1
diamond makedb --in /media/aa/DATA/SZQ2/bj_software/MAT/mip/XP_007325204_1.fasta -d XP_007325204_1
diamond makedb --in /media/aa/DATA/SZQ2/bj_software/MAT/mip/XP_008032819_1.fasta -d XP_008032819_1
diamond makedb --in /media/aa/DATA/SZQ2/bj_software/MAT/mip/XP_036634433_1.fasta -d XP_036634433_1
diamond makedb --in /media/aa/DATA/SZQ2/bj_software/MAT/βfg/XP_001829147_2.fasta -d XP_001829147_2
diamond makedb --in /media/aa/DATA/SZQ2/bj_software/MAT/βfg/XP_006454075_1.fasta -d XP_006454075_1
diamond makedb --in /media/aa/DATA/SZQ2/bj_software/MAT/βfg/XP_009540982_1.fasta -d XP_009540982_1
# Create a working folder
mkdir 20.MAT && cd 20.MAT
# 1) diamond blastp with an e-value cutoff of 1e-10
for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "diamond blastp --db /media/aa/DATA/SZQ2/bj_software/MAT/mip/XP_003038723_1.fasta --query /media/aa/Expansion/szq2/bj/b.OrthoFinder/compliantFasta4/compliantFasta/$i.fasta --out $i.XP_003038723_110.xml --outfmt 5 --sensitive --max-target-seqs 20 --evalue 1e-10 --id 20 --tmpdir /dev/shm --index-chunks 1"
done > command.XP_003038723_1.list
ParaFly -c command.XP_003038723_1.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "diamond blastp --db /media/aa/DATA/SZQ2/bj_software/MAT/mip/XP_007265184_1.fasta --query /media/aa/Expansion/szq2/bj/b.OrthoFinder/compliantFasta4/compliantFasta/$i.fasta --out $i.XP_007265184_110.xml --outfmt 5 --sensitive --max-target-seqs 20 --evalue 1e-10 --id 20 --tmpdir /dev/shm --index-chunks 1"
done > command.XP_007265184_1.list
ParaFly -c command.XP_007265184_1.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "diamond blastp --db /media/aa/DATA/SZQ2/bj_software/MAT/mip/XP_007325204_1.fasta --query /media/aa/Expansion/szq2/bj/b.OrthoFinder/compliantFasta4/compliantFasta/$i.fasta --out $i.XP_007325204_110.xml --outfmt 5 --sensitive --max-target-seqs 20 --evalue 1e-10 --id 20 --tmpdir /dev/shm --index-chunks 1"
done > command.XP_007325204_1.list
ParaFly -c command.XP_007325204_1.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "diamond blastp --db /media/aa/DATA/SZQ2/bj_software/MAT/mip/XP_008032819_1.fasta --query /media/aa/Expansion/szq2/bj/b.OrthoFinder/compliantFasta4/compliantFasta/$i.fasta --out $i.XP_008032819_110.xml --outfmt 5 --sensitive --max-target-seqs 20 --evalue 1e-10 --id 20 --tmpdir /dev/shm --index-chunks 1"
done > command.XP_008032819_1.list
ParaFly -c command.XP_008032819_1.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "diamond blastp --db /media/aa/DATA/SZQ2/bj_software/MAT/mip/XP_036634433_1.fasta --query /media/aa/Expansion/szq2/bj/b.OrthoFinder/compliantFasta4/compliantFasta/$i.fasta --out $i.XP_036634433_110.xml --outfmt 5 --sensitive --max-target-seqs 20 --evalue 1e-10 --id 20 --tmpdir /dev/shm --index-chunks 1"
done > command.XP_036634433_1.list
ParaFly -c command.XP_036634433_1.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "diamond blastp --db /media/aa/DATA/SZQ2/bj_software/MAT/βfg/XP_001829147_2.fasta --query /media/aa/Expansion/szq2/bj/b.OrthoFinder/compliantFasta4/compliantFasta/$i.fasta --out $i.XP_001829147_210.xml --outfmt 5 --sensitive --max-target-seqs 20 --evalue 1e-10 --id 20 --tmpdir /dev/shm --index-chunks 1"
done > command.XP_001829147_2.list
ParaFly -c command.XP_001829147_2.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "diamond blastp --db /media/aa/DATA/SZQ2/bj_software/MAT/βfg/XP_006454075_1.fasta --query /media/aa/Expansion/szq2/bj/b.OrthoFinder/compliantFasta4/compliantFasta/$i.fasta --out $i.XP_006454075_110.xml --outfmt 5 --sensitive --max-target-seqs 20 --evalue 1e-10 --id 20 --tmpdir /dev/shm --index-chunks 1"
done > command.XP_006454075_1.list
ParaFly -c command.XP_006454075_1.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "diamond blastp --db /media/aa/DATA/SZQ2/bj_software/MAT/βfg/XP_009540982_1.fasta --query /media/aa/Expansion/szq2/bj/b.OrthoFinder/compliantFasta4/compliantFasta/$i.fasta --out $i.XP_009540982_110.xml --outfmt 5 --sensitive --max-target-seqs 20 --evalue 1e-10 --id 20 --tmpdir /dev/shm --index-chunks 1"
done > command.XP_009540982_1.list
ParaFly -c command.XP_009540982_1.list -CPU 48

# Create a folder for the parsed tables
mkdir MAT10.tab && cd MAT10.tab
# 2) Parse the XML output with parsing_blast_result.pl
for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "/media/aa/DATA2/bin/parsing_blast_result.pl --evalue 1e-10 --HSP-num 1 --out-hit-confidence --suject-annotation ../$i.XP_003038723_110.xml > $i.XP_003038723_110.tab"
done > command.XP_003038723_1.list
ParaFly -c command.XP_003038723_1.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "/media/aa/DATA2/bin/parsing_blast_result.pl --evalue 1e-10 --HSP-num 1 --out-hit-confidence --suject-annotation ../$i.XP_007265184_110.xml > $i.XP_007265184_110.tab"
done > command.XP_007265184_1.list
ParaFly -c command.XP_007265184_1.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "/media/aa/DATA2/bin/parsing_blast_result.pl --evalue 1e-10 --HSP-num 1 --out-hit-confidence --suject-annotation ../$i.XP_007325204_110.xml > $i.XP_007325204_110.tab"
done > command.XP_007325204_1.list
ParaFly -c command.XP_007325204_1.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "/media/aa/DATA2/bin/parsing_blast_result.pl --evalue 1e-10 --HSP-num 1 --out-hit-confidence --suject-annotation ../$i.XP_008032819_110.xml > $i.XP_008032819_110.tab"
done > command.XP_008032819_1.list
ParaFly -c command.XP_008032819_1.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "/media/aa/DATA2/bin/parsing_blast_result.pl --evalue 1e-10 --HSP-num 1 --out-hit-confidence --suject-annotation ../$i.XP_036634433_110.xml > $i.XP_036634433_110.tab"
done > command.XP_036634433_1.list
ParaFly -c command.XP_036634433_1.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "/media/aa/DATA2/bin/parsing_blast_result.pl --evalue 1e-10 --HSP-num 1 --out-hit-confidence --suject-annotation ../$i.XP_001829147_210.xml > $i.XP_001829147_210.tab"
done > command.XP_001829147_2.list
ParaFly -c command.XP_001829147_2.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "/media/aa/DATA2/bin/parsing_blast_result.pl --evalue 1e-10 --HSP-num 1 --out-hit-confidence --suject-annotation ../$i.XP_006454075_110.xml > $i.XP_006454075_110.tab"
done > command.XP_006454075_1.list
ParaFly -c command.XP_006454075_1.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "/media/aa/DATA2/bin/parsing_blast_result.pl --evalue 1e-10 --HSP-num 1 --out-hit-confidence --suject-annotation ../$i.XP_009540982_110.xml > $i.XP_009540982_110.tab"
done > command.XP_009540982_1.list
ParaFly -c command.XP_009540982_1.list -CPU 48

# Work in the jcvi conda environment
conda activate jcvi
# 3) Keep the best subject for each query in the alignment results
for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "python -m jcvi.formats.blast best -n 1 $i.XP_003038723_110.tab"
done > command.jcviXP_003038723_1.list
ParaFly -c command.jcviXP_003038723_1.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "python -m jcvi.formats.blast best -n 1 $i.XP_007265184_110.tab"
done > command.jcviXP_007265184_1.list
ParaFly -c command.jcviXP_007265184_1.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "python -m jcvi.formats.blast best -n 1 $i.XP_007325204_110.tab"
done > command.jcviXP_007325204_1.list
ParaFly -c command.jcviXP_007325204_1.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "python -m jcvi.formats.blast best -n 1 $i.XP_008032819_110.tab"
done > command.jcviXP_008032819_1.list
ParaFly -c command.jcviXP_008032819_1.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "python -m jcvi.formats.blast best -n 1 $i.XP_036634433_110.tab"
done > command.jcviXP_036634433_1.list
ParaFly -c command.jcviXP_036634433_1.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "python -m jcvi.formats.blast best -n 1 $i.XP_001829147_210.tab"
done > command.jcviXP_001829147_2.list
ParaFly -c command.jcviXP_001829147_2.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "python -m jcvi.formats.blast best -n 1 $i.XP_006454075_110.tab"
done > command.jcviXP_006454075_1.list
ParaFly -c command.jcviXP_006454075_1.list -CPU 48

for i in `cat /media/aa/DATA/SZQ2/bj/functional_annotation/94listssp.txt`
do
    echo "python -m jcvi.formats.blast best -n 1 $i.XP_009540982_110.tab"
done > command.jcviXP_009540982_1.list
ParaFly -c command.jcviXP_009540982_1.list -CPU 48
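# 4) Copy the best-hit files into a separate folder
mkdir best && cd best
cp ../*.tab.best ./
# Count the lines in each file; the line count minus 1 is the total number of annotated hits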
wc -l *.XP_003038723_110.tab.best
wc -l *.XP_007265184_110.tab.best
wc -l *.XP_007325204_110.tab.best
wc -l *.XP_008032819_110.tab.best
wc -l *.XP_036634433_110.tab.best
wc -l *.XP_001829147_210.tab.best
wc -l *.XP_006454075_110.tab.best
wc -l *.XP_009540982_110.tab.best
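The eight near-identical loops in step 1 differ only in the query accession, so the command list can be generated with two nested loops. This is a sketch of that refactoring, not the commands as originally run: the function name and its arguments are hypothetical, while the DIAMOND options and the output naming are copied from the commands above.

```shell
# Sketch: print one diamond blastp command per (database, species) pair.
# $1 = species list, $2 = folder holding <species>.fasta proteomes,
# remaining arguments = FASTA files used as databases (e.g. mip/*.fasta).
make_mat_commands() {
    local species_list="$1" proteome_dir="$2"
    shift 2
    local db tag sp
    for db in "$@"; do
        tag=$(basename "$db" .fasta)   # e.g. XP_003038723_1
        while IFS= read -r sp; do
            [ -n "$sp" ] || continue   # skip empty lines in the list
            echo "diamond blastp --db $db --query $proteome_dir/$sp.fasta --out $sp.${tag}10.xml --outfmt 5 --sensitive --max-target-seqs 20 --evalue 1e-10 --id 20 --tmpdir /dev/shm --index-chunks 1"
        done < "$species_list"
    done
}

# Usage:
# make_mat_commands 94listssp.txt compliantFasta mip/*.fasta βfg/*.fasta > command.MAT.list
# ParaFly -c command.MAT.list -CPU 48
```

The same pattern would apply to the parsing and best-hit steps, since they also vary only by accession and species.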
