Bioinformatics Talk (生信漫谈): How to Build a Phylogenetic Tree with MEGA7

Preface

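Biotechnology has advanced rapidly in recent years, and being comfortable with a bioinformatics programming language or software package makes research and study much easier. This post walks through building a phylogenetic tree with MEGA7.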

1. We use MEGA7 as the example throughout. Download it from the address below, choosing the build that matches your operating system.

http://www.megasoftware.net/

 

 

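2. Prepare the sequences. They must be in FASTA format and saved with a FASTA extension such as .fasta or .fas; MEGA7 does not recognize the file if it is saved with another extension such as .txt. As the example, we take the protein sequence of the Arabidopsis thaliana SPL15 gene and use it to search for homologous sequences.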

>NP_191351.1 SPL15 [organism=Arabidopsis thaliana] [GeneID=824961]
MELLMCSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSN
VKAYYSRHKVCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALF
TSHYSRIAPSLYGNPNAAMIKSVLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPE
MINNNSTDSSCALSLLSNSYPIHQQQLQTPTNTWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQ
YLSQTWEVIAGEKSNSHYMSPVSQISEPADFQISNGTTMGGFELYLHQQVLKQYMEPENTRAYDSSPQHF
NWSL
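Before loading a file into MEGA7 it can help to confirm that it really is well-formed FASTA. The following is a minimal sketch in plain Python, not part of MEGA7; the function name `read_fasta` is made up for illustration, and the example record is a shortened copy of the SPL15 query above.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Shortened stand-in for the SPL15 query (first residues only, for illustration).
example = """>NP_191351.1 SPL15 [organism=Arabidopsis thaliana] [GeneID=824961]
MELLMCSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSN
VKAYYSRHKVCCIHSKSS
"""

records = read_fasta(example)
for header, seq in records:
    # Print the accession (first token of the header) and the sequence length.
    print(header.split()[0], len(seq))
```

The parser joins wrapped sequence lines into one string per record, so the line width used in the file does not matter.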

For this demonstration, download the 19 protein sequences with the highest similarity to the query.

 

The downloaded sequences look like this:

>AST51816.1 Venus [Cloning vector pSTB205]
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKLICTTGKLPVPWPTLVTTLGYGLQCFARYPDHMK
QHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYITADKQKN
GIKANFKIRHNIEDGGVQLADHYQQNTPIGDGPVLLPDNHYLSYQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYKE
LLMCSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKVCC
IHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMIKS
VLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTPTN
TWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPADFQISNGTTMGGF
ELYLHQQVLKQYMEPENTRAYDSSPQHFNWSL
>NP_191351.1 squamosa promoter binding protein-like 15 [Arabidopsis thaliana]
MELLMCSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMI
KSVLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTP
TNTWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPADFQISNGTTMG
GFELYLHQQVLKQYMEPENTRAYDSSPQHFNWSL
>KAG7634825.1 SBP domain superfamily [Arabidopsis suecica]
MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMI
KSVLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTP
TNTWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPADFQISNGTTMG
GFELYLHQQVLKQYMEPENTRAYDSSPQHFNWSL
>CAA0387110.1 unnamed protein product [Arabidopsis thaliana]
MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMI
KSVLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTP
TNTWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPVDFQISNGTTMG
GFELYLHQQVLKQYMEPENTRAYDSSPQHFNWSL
>CAD5326126.1 unnamed protein product [Arabidopsis thaliana]
MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMIKSVLGDP
TAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTPTNTWRPS
SGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPVDFQISNGTTMGGFELYLH
QQVLKQYMEPENTRAYDSSPQHFNWSL
>KAG7561265.1 SBP domain superfamily [Arabidopsis thaliana x Arabidopsis arenosa]
MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVNTGRKSTMTARCQVEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSRYSRIAPSLYGNPNAAMI
KSVLGDPMAWSTAKSVMRRSGPWQINPERESHQLLNVLSHGSSSFTTCPEIINNNSTDSSCALSLLSNSNPIQQQQLQTP
TNLWRPSSGFDSLISFSDRVTMAQPPPISTHHQYLSQTWEVMAGEKSNSHYISPVSQISEPADFQISNGTTMGGFELSLH
QQVLRQYMEPENTRAYDSSPQHFNWSL
>XP_002878178.1 squamosa promoter-binding-like protein 15 [Arabidopsis lyrata subsp. lyrata]
MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVNTGRKSTMTARCQVEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSRYTRIAPSLYGNPNAAMI
KSVLGDPTAWSTARSVMRRSGPWQINPERESHQIMNVLSHGSSSFTTCPEITNNNSTDSSCALSLLSNSNPIQQQQLQTP
TNLWRPSSGFDSMISFSDRVTMAQPPPISTHHQYLSQTWDVMAGGKSNSHYMSPVSQISEPAEFQISNGTTMGGFELSLH
QQVLRQYMEPENTRAYDSSPQHFNWSL
>KAG7566101.1 SBP domain [Arabidopsis suecica]
MELLMGSGHAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVNTGRKSTMTARCQLEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTSSLFTSRYSRIAPSLYGNPNAAMI
KSVLGDPMAWSTAKSVMRRSGPWQINPERESHQLLNVLSHGSSSFTTCPEIINNNSTDSSCALSLLSNSNPIQQQQLQTP
TNLWRPSSGFDSLISFSDRVTMAQPPPISTHHQYLSQTWEVMAGEKSNSHYISPVSQISEPAGFQISNGTTMGGFELSLH
QQVLRQYMEPENTRAYDSSPQHFNWSL
>CAE6076605.1 unnamed protein product [Arabidopsis arenosa]
MRRGRGKGKRQNATAREDRGSGEEEKIPAFRRRGRPQKPVKDEIEEEEVELVKKTEEEEDKDDDTNGSVTSKEDVTENGR
KRKKPVESKESNITEEENGVGSKSSTEDSMKSSSSIGFRQNGSRRKNKPRRAAEAVVECNGAESGGSSSTESSSLSGGLR
FGQKIYFEDGSGSGSKNRVNTGRKSTMTARCQVEGCRMDLSNVKAYYSRHKVCCIHSKSSKVIVSGLHQRFCQQCSRFHQ
LSEFDLEKRSCRRRLACHNERRRKPQSTTSLFTSRYSRIAPSLYGNPNAAMIKSVLGDPMAWSTAKSVMRRSGPWQINPE
RESHQLLNVLSHGSSSFTTCPEIINNNSTDSSCALSLLSNSNPIQQQQLQTPTNLWRPSSGFDSLISFSDRVTMAQPPPI
STHHQYLSQTWEVMAGEKSNSHYISPVSQISEPADFQISNGSTMGGFELSLHQQVLRQYMEPENTRAYDSSPQHFNWSL
>XP_006291402.1 squamosa promoter-binding-like protein 15 [Capsella rubella]
MELLMGSGQAESGGSSSTESSLLSGGLRFGQKIYFEDGSGSGSKNRVSTGHKSSMTTVARCQVEGCKMDLSNAKAYYSRH
KVCCIHSKSSKVIVSGLHQRFCQQCSRFHHLSEFDLEKRSCRRRLACHNERRRKPQPATLFTSHYTRIAPSLYGNANAAM
IKSVLGDPTAWSTSRSVMRSSGPWQINPVKESNQLMNVYSQESSSFTITCPEMMNNNSTDSGCALSLLSNSNPIQQQQQQ
PQTQTNIWRSSSGFDSMILDRVTMAQPPPISGHHQYLNQTLAFMAGEKSNSHYMSPVLGPSQISEPDEFQISNGTTMDGF
ELSLHQQVLRQYMEPENTRAYDSSPHYFNWSL
>CAH2063751.1 unnamed protein product [Thlaspi arvense]
MELLMGSGQNRTESYGSSSTESSSLSGGLRFGQKIYFEDGSGSGGGSNKNRVNTGRKSRTARCQVEGCRMDLSNVKTYYS
RHKVCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQATTSLLTSRYSRIAPSLYGNAN
TAMIRSVLGDPTAWSTARSVMRRSAPWQINPERESHQLMNVFSHDSSSFTTTCPEMMNSNGTDSSCALSLLSNSNTNQQQ
QLLQTSTNIWRPSSGFDSANADRATMAQPPPVSNQHQYLNQTWEFMAGEKSNSHYLSPVLGLSQISEPVDFQISNGTTMG
GFELSIHQQVLRHYMEPENTRAYDSSAQHFNWSL
>XP_010516431.1 PREDICTED: squamosa promoter-binding-like protein 15 [Camelina sativa]
MELLIGGSGQTESGGASSTKSSSLSGGLRFGQKIYFEDGSGSGSKNRVGTGHKSSTTTTTARCQVEGCKMDLSNAKAYYS
RHKVCCIHSKSSKVIVSGLRQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTLYTSQYTRIAPSLYGDANA
AMMKSVLGDPTVWSTARSVMRRSGPWQISPVKESHHQLMNVFSQESSSFTITCPEMMNNNSTDSSCALSLLSNSNSNSNP
IQQQQQQLQTQTHIWRPSLGFDSMTVDRVTMAQPPPISSHHQYLNQTLEFMAGEKSSSHYMSPVLGPSQISEPDEFQISN
GTTMDGFELSLHQQVLRQYMEPENTRAYDSSPHHFNWSL
>AKC05620.1 squamosa promoter-binding-like protein 15 [Cardamine hirsuta]
MELLMGSGQSESGASSSNESSSLSGGLRFGQKIYFEDGSGSGSKNRVSSTGRKSSTTTARCQVEGCRMDLSNAKTYYSRH
KVCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPATTLFTSRFTRTAPSHYGNANAA
MIKSVLGDPTAWTAERSVMRRSAPWQSNPSHQVMIDFSHGSSSLTTTCPEMMNNTSTDSSCALSLLSNSNQTQQLQQQLQ
TPANIWRASSGFDSMIADRVTMAQPPPISTHHQYLNQSWEFMPGEKNDSHYMSPMSQISEPADLHMRNRTTMGGFEVSLH
QQVMRQYMAPENTRAYDSSPQHFNWSL
>XP_010504729.1 PREDICTED: squamosa promoter-binding-like protein 15 [Camelina sativa]
MELLMGGSGQTESGGASSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVGAGHKSSTTARCQVEGCKMDLSNAKAYYSRHK
VCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTLYTRIAASLYGNANAAMIKSVL
GDPTVWSTARSVMRRSGPWQINPVKESHHQHMNVFSQESSSFTITCPEMMNNNSTDSSCALSLLSNSNSNPIQQQQQQLQ
TQTNIWRPSSGFDYMTVDRVTLAQPPPIPSHHQYLNQTLEFMTGEKNSSHYMSPALGPSQISAPDEFQISNGTTMDGFEL
SLHQQVLRQYMAPENTRAYDSSPHHFNWSL
>CAA7060637.1 unnamed protein product [Microthlaspi erraticum]
MELLMDSSQTESGGSSSIESSSLTGGLRFGQKIYFEDGSGSGAKSSKNRVNTARKSSTSTARCQVEGCRMDLSNAKTYYS
RHKVCCIHSKSSNVIVSGLHQRFHLLSEFDLEKRSCRRRLACHNERRRKPHATTNLLTSRYSRIAPSLYENANTAIFRSV
LGDTTAWSAARPVMRRSGPWQINPERESNLNVFSHGSSSFTTCPAMMNNNSTDSSCALSLLSNSNTNTNQQQQQPLQTST
DTWRPSSGFDSMIADRVTMAQPPPVSIHNQYLNQSWDFMEGEKSNSHHMSPVLGLSQISEPADFQLSNGMGGGFELSLHQ
QVLKQYMEPENTRAYDSSPQHFNWSL
>KAG2324838.1 hypothetical protein Bca52824_007566 [Brassica carinata]
MELLMGSGQDHPQSAGSSSTLSGGLRFGQKIYFEDGSGAGLSRNRVNNTGRKSMTARCQVEGCRMDLSNAKTYYSRHKVC
CVHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQTTTTLLTSHYSSIAPSLYGNAIRSVLG
DPTLWSTARGSSAPWQINPERESHHQLMNIISFGSSSFTNSTDSSCALSLLSNSNRNQQEQQPLQTPTNAWRPSLDFDSI
VADRVTMAQPPPVSIQNQYLNQTWEFMSGEKSNAHCISPVLGLSQISEPVDFQTSNGATMSGVELSLHQQVLRQYLEPEN
TRAYDSSHQHFNWSL
>CAH8384605.1 unnamed protein product [Eruca vesicaria subsp. sativa]
MELEMGSGQKKPESAGSSSTLSGGLRFGQKIYFEDGSGAGLSKNRVSSTGRKSMTARCQVEGCRTDLSNAKTYYSRHKVC
CVHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQATTTLLTSRYSSLYGNAIRSVLGDPTT
WSTARGSAPWKINQESDRHQLMNVISFGSSSFTTCPEMMNNNSTDSSCALSLLSNSNPNQQEQQPLQTSNTIWRPSLDFD
STVADRVTMAQPPPVSMQNQYLNQTWEFMSGEKSNAQCISPVLGQSQISEPVDFQIGTTMGGGFELSLHQQVLRQYMEPE
NTRAYDTSPQYFNWSL
>KAF8114775.1 hypothetical protein N665_0034s0114 [Sinapis alba]
MELLMGSGQNQPESAGSSSSTLSGGLRFGQKIYFEDGSGAGLSKNRVNTGRKSTTARCQVEGCRMDLSSAKTYYSRHKVC
CIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQATTTFLTSHYSSIAPSLYGNAIRGVLG
DSTTWSTARGSAPLQINPERESHRLMNVFSFGSSSFTNNSTDSSCALSLLSNSNPNQQEQQPLQTPTNTWRPSLDFDSIV
ADRVTMAQPPPVSVQNQYLNQTWEFMSGEKSNGQHYISPVLGLSQISEPVDFQISNGATMSGVELSLHQQVLRQYLEPEN
TRAYDSSPQHFNWSL
>XP_010427684.1 PREDICTED: squamosa promoter-binding-like protein 15 [Camelina sativa]
MELLMGGTESGGASSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVVTGHKSSTTTTTARCQVEGCKMDLSNAKAYYSRHK
VCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTLFTSHYTRIAPSLYGNANAAMI
KSVLGDPTVWSTARSVMRRSGPWQINPVKESHHQLMNVFSQESSSFTITCPEMMNNNNSTDSSCALSLLSNSNSNPIQQQ
QQQLQTQTNIWRPSLGFDSMTVDRVTLAQPPPILSHHQYMSPVLGPSQISAPDEFQISNVTTMDGFELSLHQQVLRQYME
PQNTRAYDSSPHHFNWSL
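The headers above are long, and they end up as tip labels in the tree. One option is to shorten them before import. The sketch below is an optional step that is not part of the original workflow. It assumes each header ends with an organism name in square brackets, as the downloaded records above do, and the helper name `short_label` is hypothetical.

```python
import re


def short_label(header):
    """Build 'Genus_species_ACCESSION' from an NCBI-style FASTA header."""
    accession = header.lstrip(">").split()[0]
    # Assumption: the organism is the last [bracketed] field on the line.
    match = re.search(r"\[([^\[\]]+)\]\s*$", header)
    if match is None:
        return accession  # no trailing [organism] tag; fall back to the accession
    organism = re.sub(r"\s+", "_", match.group(1).strip())
    return f"{organism}_{accession}"


headers = [
    ">NP_191351.1 squamosa promoter binding protein-like 15 [Arabidopsis thaliana]",
    ">XP_006291402.1 squamosa promoter-binding-like protein 15 [Capsella rubella]",
    ">KAG2324838.1 hypothetical protein Bca52824_007566 [Brassica carinata]",
]
labels = [short_label(h) for h in headers]
for label in labels:
    print(label)
```

Short labels also make unusual entries easier to spot. In the list above, for example, the first record is tagged `[Cloning vector pSTB205]` rather than a species name.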
 
 

3. Import the protein sequences, either through the File menu or by dragging the file directly into the MEGA7 window.

 

 

Either method works.

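4. Multiple sequence alignment. Once the import succeeds, the sequences are displayed in MEGA7 and are ready to be aligned.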

 

Choose Alignment > Align by ClustalW, keep the default parameters, and click OK.

 

 

When the run finishes, the aligned sequences are displayed, padded with gaps to a common length.
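Once the sequences are aligned, tree-building methods such as NJ work from pairwise distances between them. MEGA7 offers several distance models and the post does not say which one was used, so the sketch below uses the p-distance, the proportion of differing sites, purely as the simplest illustration. The three aligned fragments are invented toy data, not MEGA7 output.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


# Toy alignment (hypothetical fragments, for illustration only).
alignment = {
    "seqA": "MELLMGSGQAES",
    "seqB": "MELLMGSGQTES",
    "seqC": "MELLIGGSGQ-E",
}
names = list(alignment)
matrix = [[p_distance(alignment[i], alignment[j]) for j in names] for i in names]
for name, row in zip(names, matrix):
    print(name, [round(d, 3) for d in row])
```

The length check is a quick way to confirm that the input really is an alignment, because unaligned sequences of different lengths are rejected.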

 

5. Build the phylogenetic tree, choosing the neighbor-joining (NJ) method.
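To make the NJ step less of a black box, here is a compact sketch of the textbook neighbor-joining procedure in plain Python. It is not MEGA7's implementation, which also handles substitution models, bootstrapping, and much larger data sets. The function name `neighbor_joining` and the five-taxon distance matrix are assumptions chosen for illustration.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Return an unrooted NJ tree as a Newick string.

    names: list of taxon labels; dist: square matrix (list of lists) of
    pairwise distances in the same order. Requires at least three taxa.
    """
    if len(names) < 3:
        raise ValueError("neighbor joining needs at least three taxa")
    # Active clusters: id -> Newick fragment; d[(i, j)] stores distances, i < j.
    label = dict(enumerate(names))
    d = {(i, j): dist[i][j] for i, j in combinations(range(len(names)), 2)}

    def get(i, j):
        return d[(i, j)] if i < j else d[(j, i)]

    next_id = len(names)
    while len(label) > 3:
        ids = sorted(label)
        n = len(ids)
        # r[i] is the sum of distances from cluster i to all other clusters.
        r = {i: sum(get(i, k) for k in ids if k != i) for i in ids}
        # Join the pair minimising Q(i, j) = (n - 2) * d(i, j) - r[i] - r[j].
        i, j = min(combinations(ids, 2),
                   key=lambda p: (n - 2) * get(*p) - r[p[0]] - r[p[1]])
        dij = get(i, j)
        li = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))  # branch from i to new node
        lj = dij - li                                 # branch from j to new node
        u = next_id
        next_id += 1
        # Distance from the new node to every remaining cluster.
        for k in ids:
            if k not in (i, j):
                d[(k, u)] = (get(i, k) + get(j, k) - dij) / 2
        label[u] = f"({label[i]}:{li:g},{label[j]}:{lj:g})"
        del label[i], label[j]
    # Three clusters remain: connect them at a single central node.
    a, b, c = sorted(label)
    la = (get(a, b) + get(a, c) - get(b, c)) / 2
    lb = (get(a, b) + get(b, c) - get(a, c)) / 2
    lc = (get(a, c) + get(b, c) - get(a, b)) / 2
    return f"({label[a]}:{la:g},{label[b]}:{lb:g},{label[c]}:{lc:g});"


# Toy distance matrix for five hypothetical taxa.
taxa = ["a", "b", "c", "d", "e"]
distances = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
tree = neighbor_joining(taxa, distances)
print(tree)
```

In this toy matrix, `a` and `b` are joined first with branch lengths 2 and 3, which add up to their input distance of 5. The result is an unrooted tree written in Newick format, the same format used when saving the tree in a later step.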

 

 

 

6. View the results

The first style is a tree drawn with branch lengths.

 

The second style is a branch-length tree with aligned tip labels.

 

To display other tree shapes, click the corresponding toolbar buttons in the tree window.

 

7. Save the tree as a Newick (.nwk) file, then import it into a tree-drawing tool to polish the figure.
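A .nwk file is plain text, so it is easy to check before passing it to another tool. The sketch below lists the tip labels of a Newick string with a regular expression. It assumes simple unquoted labels, and the example tree and the helper name `newick_leaves` are made up for illustration rather than taken from MEGA7 output.

```python
import re


def newick_leaves(newick):
    """Return leaf labels from a simple Newick string.

    Assumes unquoted labels without spaces, commas, colons, or parentheses.
    A leaf label is any name that directly follows '(' or ','; labels after
    ')' (internal nodes or support values) and branch lengths are ignored.
    """
    return re.findall(r"[(,]\s*([^\s(),:;]+)", newick)


# Hypothetical four-tip tree with branch lengths.
tree = ("((Arabidopsis_thaliana:0.01,Arabidopsis_lyrata:0.03):0.05,"
        "Capsella_rubella:0.08,Camelina_sativa:0.07);")
leaves = newick_leaves(tree)
print(len(leaves), "tips:", leaves)
```

Comparing the number of tips with the number of sequences that were imported is a quick check that nothing was dropped along the way.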

 

 

 

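Bioinformatics Talk (生信漫谈)

Bioinformatics Talk: get to know bioinformatics, learn it, and get past the hurdles of getting started, so that you can use bioinformatics techniques to clear the stumbling blocks in your research.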
