Extracting the header lines of a FASTA file with Python

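# Copy every header line (a line starting with '>') into a separate file.
# The with statement closes both files, even if an error occurs.
with open('SwissProt.fasta', 'r') as fasta_file, open('SwissProt.header', 'w') as out_file:
    for line in fasta_file:
        if line[0:1] == '>':     # line[0:1] is the first character of the line; line[1:2] would be the second
            out_file.write(line)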
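
The same filter can also be wrapped in small reusable functions. This is only a sketch: the names `iter_headers` and `write_headers` are made up for illustration and do not come from any library.

```python
from typing import Iterable, Iterator


def iter_headers(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the FASTA header lines (those starting with '>')."""
    for line in lines:
        if line.startswith(">"):
            yield line


def write_headers(fasta_path: str, header_path: str) -> int:
    """Copy header lines from fasta_path to header_path and return how many were written."""
    count = 0
    # Both files are closed automatically when the with block ends.
    with open(fasta_path, "r") as fasta_file, open(header_path, "w") as out_file:
        for header in iter_headers(fasta_file):
            out_file.write(header)
            count += 1
    return count
```

Calling `write_headers('SwissProt.fasta', 'SwissProt.header')` would then do the same job as the script above. Because the file is read line by line, a large FASTA file is never loaded into memory all at once.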

Input FASTA file (SwissProt.fasta):

>sp|Q664P8|TAUB_YERPS Taurine import ATP-binding protein TauB OS=Yersinia pseudotuberculosis serotype I (strain IP32953) OX=273123 GN=tauB PE=3 SV=1
MLNVSGLWAEYQGKPALQDVSLQIASGQLVVVLGPSGCGKTTLLNLIAGFMTPSAGVITL
DNIPVSGPSAERGVVFQNEGLLPWRDVVSNVEFGLQLAGMSKEQRRVTALKMLNRVGLAG
FEHHFIWQLSGGMRQRVGIARALAVDPRLLLLDEPFGALDAFTREQMQELLLTIWRDTGK
QILLITHDIEEAVFLASELLLLSPGPGQVVERLSLNFGQRYAEGEPCRAIKSDPEFIARR
EYVLGKVFQQREVLI
>sp|Q66K14|TBC9B_HUMAN TBC1 domain family member 9B OS=Homo sapiens OX=9606 GN=TBC1D9B PE=1 SV=3
MWLSPEEVLVANALWVTERANPFFVLQRRRGHGRGGGLTGLLVGTLDVVLDSSARVAPYR
ILHQTQDSQVYWTVACGSSRKEITKHWEWLENNLLQTLSIFDSEEDITTFVKGKIHGIIA
EENKNLQPQGDEDPGKFKEAELKMRKQFGMPEGEKLVNYYSCSYWKGRVPRQGWLYLTVN
HLCFYSFLLGKEVSLVVQWVDITRLEKNATLLFPESIRVDTRDQELFFSMFLNIGETFKL
MEQLANLAMRQLLDSEGFLEDKALPRPIRPHRNISALKRDLDARAKNECYRATFRLPRDE
RLDGHTSCTLWTPFNKLHIPGQMFISNNYICFASKEEDACHLIIPLREVTIVEKADSSSV
LPSPLSISTKSKMTFLFANLKDRDFLVQRISDFLQKTPSKQPGSIGSRKASVVDPSTESS
PAPQEGSEQPASPASPLSSRQSFCAQEAPTASQGLLKLFQKNSPMEDLGAKGAKEKMKEE
SWHIHFFEYGRGVCMYRTAKTRALVLKGIPESLRGELWLLFSGAWNEMVTHPGYYAELVE
KSTGKYSLATEEIERDLHRSMPEHPAFQNELGIAALRRVLTAYAFRNPTIGYCQAMNIVT
SVLLLYGSEEEAFWLLVALCERMLPDYYNTRVVGALVDQGIFEELTRDFLPQLSEKMQDL
GVISSISLSWFLTLFLSVMPFESAVVIVDCFFYEGIKVILQVALAVLDANMEQLLGCSDE
GEAMTMLGRYLDNVVNKQSVSPPIPHLRALLSSSDDPPAEVDIFELLKVSYEKFSSLRAE
DIEQMRFKQRLKVIQSLEDTAKRSVVRAIPVDIGFSIEELEDLYMVFKAKHLASQYWGCS
RTMAGRRDPSLPYLEQYRIDASQFRELFASLTPWACGSHTPLLAGRMFRLLDENKDSLIN
FKEFVTGMSGMYHGDLTEKLKVLYKLHLPPALSPEEAESALEAAHYFTEDSSSEASPLAS
DLDLFLPWEAQEALPQEEQEGSGSEERGEEKGTSSPDYRHYLRMWAKEKEAQKETIKDLP
KMNQEQFIELCKTLYNMFSEDPMEQDLYHAIATVASLLLRIGEVGKKFSARTGRKPRDCA
TEEDEPPAPELHQDAARELQPPAAGDPQAKAGGDTHLGKAPQESQVVVEGGSGEGQGSPS
QLLSDDETKDDMSMSSYSVVSTGSLQCEDLADDTVLVGGEACSPTARIGGTVDTDWCISF
EQILASILTESVLVNFFEKRVDIGLKIKDQKKVERQFSTASDHEQPGVSG
>sp|Q8K9I1|SYV_BUCAP Valine--tRNA ligase OS=Buchnera aphidicola subsp. Schizaphis graminum (strain Sg) OX=198804 GN=valS PE=3 SV=1
MKKNYNPKDIEEHLYNFWEKNGFFKPNNNLNKPAFCIMMPPPNITGNLHMGHAFQQTIMD
ILIRYNRMQGKNTLWQVGTDHAGIATQILIERQIFSEERKTKKDYSRNDFIKKIWKWKKK
SNFSVKKQMKRLGNSVDWDREKFTLDPDISNSVKEAFIILYKNNLIYQKKRLVHWDSKLE
TVISDLEVEHRLIKSKKWFIRYPIIKNIKNINIEYLLVATTRPETLLGDTALAINPKDDK
YNHLIGQSVICPIVNRIIPIIADHYADMNKDTGCVKITPGHDFNDYEVGQRHKLPMINIF
TFNGKIKSNFSIYDYQGSKSNFYDSSIPTEFQNLDILSARKKIIYEIEKLGLLEKIEECN
FFTPYSERSGVIIQPMLTNQWYLKTSHLSQSAIDVVREKKIKFIPNQYKSMYLSWMNNIE
DWCISRQLWWGHQIPVWYDDKKNIYVGHSEKKIREEYNISDDMILNQDNDVLDTWFSSGL
WTFSTLGWPEKTEFLKIFHSTDVLVSGFDIIFFWIARMIMLTMYLVKDSYGNPQIPFKDV
YITGLIRDEEGKKMSKSKGNVIDPIDMIDGISLNELIEKRTSNLLQPHLSQKIRYHTIKQ
FPNGISATGTDALRFTFSALASNTRDIQWDMNRLKGYRNFCNKLWNASRFVLKNTKDHDY
FNFSVNDNMLLINKWILIKFNNTVKSYRNSLDSYRFDIAANILYDFIWNVFCDWYLEFVK
SVIKSGSYQDIYFTKNVLIHVLELLLRLSHPIMPFITEAIWQRVKIIKHIKDRTIMLQSF
PEYNDQLFDKSTLSNINWIKKIIIFIRNTRSKMNISSTKLLSLFLKNINSEKKKVIQENK
FILKNIASLEKISILSKQDDEPCLSLKEIIDGVDILVPVLKAIDKEIELKRLNKEIEKIK
SKMLISEKKMSNQDFLSYAPKNIIDKEIKKLKSLNEIYLTLSQQLESLHDAFCKKNKIFN
>sp|Q664P8|TAUB_YERPS Taurine import ATP-binding protein TauB OS=Yersinia pseudotuberculosis serotype I (strain IP32953) OX=273123 GN=tauB PE=3 SV=1
MLNVSGLWAEYQGKPALQDVSLQIASGQLVVVLGPSGCGKTTLLNLIAGFMTPSAGVITL
DNIPVSGPSAERGVVFQNEGLLPWRDVVSNVEFGLQLAGMSKEQRRVTALKMLNRVGLAG
FEHHFIWQLSGGMRQRVGIARALAVDPRLLLLDEPFGALDAFTREQMQELLLTIWRDTGK
QILLITHDIEEAVFLASELLLLSPGPGQVVERLSLNFGQRYAEGEPCRAIKSDPEFIARR
EYVLGKVFQQREVLI
>sp|Q66K14|TBC9B_HUMAN TBC1 domain family member 9B OS=Homo sapiens OX=9606 GN=TBC1D9B PE=1 SV=3
MWLSPEEVLVANALWVTERANPFFVLQRRRGHGRGGGLTGLLVGTLDVVLDSSARVAPYR
ILHQTQDSQVYWTVACGSSRKEITKHWEWLENNLLQTLSIFDSEEDITTFVKGKIHGIIA
EENKNLQPQGDEDPGKFKEAELKMRKQFGMPEGEKLVNYYSCSYWKGRVPRQGWLYLTVN
HLCFYSFLLGKEVSLVVQWVDITRLEKNATLLFPESIRVDTRDQELFFSMFLNIGETFKL
MEQLANLAMRQLLDSEGFLEDKALPRPIRPHRNISALKRDLDARAKNECYRATFRLPRDE
RLDGHTSCTLWTPFNKLHIPGQMFISNNYICFASKEEDACHLIIPLREVTIVEKADSSSV
LPSPLSISTKSKMTFLFANLKDRDFLVQRISDFLQKTPSKQPGSIGSRKASVVDPSTESS
PAPQEGSEQPASPASPLSSRQSFCAQEAPTASQGLLKLFQKNSPMEDLGAKGAKEKMKEE
SWHIHFFEYGRGVCMYRTAKTRALVLKGIPESLRGELWLLFSGAWNEMVTHPGYYAELVE
KSTGKYSLATEEIERDLHRSMPEHPAFQNELGIAALRRVLTAYAFRNPTIGYCQAMNIVT
SVLLLYGSEEEAFWLLVALCERMLPDYYNTRVVGALVDQGIFEELTRDFLPQLSEKMQDL
GVISSISLSWFLTLFLSVMPFESAVVIVDCFFYEGIKVILQVALAVLDANMEQLLGCSDE
GEAMTMLGRYLDNVVNKQSVSPPIPHLRALLSSSDDPPAEVDIFELLKVSYEKFSSLRAE
DIEQMRFKQRLKVIQSLEDTAKRSVVRAIPVDIGFSIEELEDLYMVFKAKHLASQYWGCS
RTMAGRRDPSLPYLEQYRIDASQFRELFASLTPWACGSHTPLLAGRMFRLLDENKDSLIN
FKEFVTGMSGMYHGDLTEKLKVLYKLHLPPALSPEEAESALEAAHYFTEDSSSEASPLAS
DLDLFLPWEAQEALPQEEQEGSGSEERGEEKGTSSPDYRHYLRMWAKEKEAQKETIKDLP
KMNQEQFIELCKTLYNMFSEDPMEQDLYHAIATVASLLLRIGEVGKKFSARTGRKPRDCA
TEEDEPPAPELHQDAARELQPPAAGDPQAKAGGDTHLGKAPQESQVVVEGGSGEGQGSPS
QLLSDDETKDDMSMSSYSVVSTGSLQCEDLADDTVLVGGEACSPTARIGGTVDTDWCISF
EQILASILTESVLVNFFEKRVDIGLKIKDQKKVERQFSTASDHEQPGVSG
>sp|Q8E4B4|TARI_STRA3 Ribitol-5-phosphate cytidylyltransferase OS=Streptococcus agalactiae serotype III (strain NEM316) OX=211110 GN=tarI PE=3 SV=1
MNIGVIFAGGVGRRMNTKGKPKQFLEVHGKPIIVHTIDIFQNTEAIDAVVVVCVSDWLDY
MNNLVERFNLTKVKAVVAGGETGQMSIFKGLEAAEQLATDDAVVLIHDGVRPLINEEVIN
ANIKSVKETGSAVTSVRAKETVVLVNDSSKISEVVDRTRSFIAKAPQSFYLSDILSVERD
AISKGITDAIDSSTLMGMYNRELTIVEGPYENIKITTPDDFYMFKALYDARENEQIYGM
>sp|B3CQ06|SYS_WOLPP Serine--tRNA ligase OS=Wolbachia pipientis subsp. Culex pipiens (strain wPip) OX=570417 GN=serS PE=3 SV=1
MHDIEHIRKNPKGFEKAIKSRGVKEFTAKEILEIDHKKRSLTTKLQALNKQRNEVTEEIK
RLKMNKSPCEEQVKLSKSITSEIETISLKEQTEKNKLVDILSNLPNISAQNVPIGEDESS
NVEIRKYGKKRKFDFTPKFHYELGERLGLMDFEQAAKISGSRFTILKGQLAKLGRALINF
MLETHVNEFAYTEVYHPALVKNEAMYNVGQLPKFSDDSYLTTDKLRLIPTSEVVLTNLVA
DKIIEEKELPIRFTAYSECFRKEAGSAGRDTRGMIRQHQFGKVELVSITTEDQSKDELER
MTNAAEEILKKLELPYRIMLLCSGDMGFAAQKTYDIEVWLPEQNKYREISSCSNCGTFQA
RRMNTKYFLETDRKTKKYVHTLNGSALAIGRTIVAIMENYQNSDGSVTIPNVLQRYMSND
TVISKQ
>sp|Q9ATB4|TAD2B_ARATH Transcriptional adapter ADA2b OS=Arabidopsis thaliana OX=3702 GN=ADA2B PE=1 SV=1
MGRSRGNFQNFEDPTQRTRKKKNAANVENFESTSLVPGAEGGGKYNCDYCQKDITGKIRI
KCAVCPDFDLCIECMSVGAEITPHKCDHPYRVMGNLTFPLICPDWSADDEMLLLEGLEIY
GLGNWAEVAEHVGTKSKEQCLEHYRNIYLNSPFFPLPDMSHVAGKNRKELQAMAKGRIDD
KKAEQNMKEEYPFSPPKVKVEDTQKESFVDRSFGGKKPVSTSVNNSLVELSNYNQKREEF
DPEYDNDAEQLLAEMEFKENDTPEEHELKLRVLRIYSKRLDERKRRKEFIIERNLLYPNP
FEKDLSQEEKVQCRRLDVFMRFHSKEEHDELLRNVVSEYRMVKRLKDLKEAQVAGCRSTA
EAERYLGRKRKRENEEGMNRGKESGQFGQIAGEMGSRPPVQASSSYVNDLDLIGFTESQL
LSESEKRLCSEVKLVPPVYLQMQQVMSHEIFKGNVTKKSDAYSLFKIDPTKVDRVYDMLV
KKGIAQL
>sp|Q83JA5|SYW_SHIFL Tryptophan--tRNA ligase OS=Shigella flexneri OX=623 GN=trpS PE=3 SV=1
MTKPIVFSGAQPSGELTIGNYMGALRQWVNMQDDYHCIYCIVDQHAITVRQDAQKLRKAT
LDTLALYLACGIDPEKSTIFVQSHVPEHAQLGWALNCYTYFGELSRMTQFKDKSARYAEN
INAGLFGYPVLMAADILLYQTNLVPVGEDQKQHLELSRDIAQRFNALYGEIFKVPEPFIP
KSGARVMSLLEPTKKMSKSDDNRNNVIGLLEDPKSVVKKIKRAVTDSDEPPVVRYDVQNK
AGVSNLLDILSAVTGQSIPELEKQFEGKMYGHLKGEVADAVSGMLTELQERYHRFRNDEA
FLQQVMKDGAEKASVHASRTLKAVYEAIGFVAKP
>sp|P14213|TAC1_TACTR Tachyplesin-1 OS=Tachypleus tridentatus OX=6853 PE=1 SV=2
MKKLVIALCLMMVLAVMVEEAEAKWCFRVCYRGICYRRCRGKRNEVRQYRDRGYDVRAIP
EETFFTRQDEDEDDDEE
>sp|Q7SZM9|TB1RA_XENLA F-box-like/WD repeat-containing protein TBL1XR1-A OS=Xenopus laevis OX=8355 GN=tbl1xr1-a PE=1 SV=1
MSISSDEVNFLVYRYLQESGFSHSAFTFGIESHISQSNINGALAPPAALISIIQKGLQYV
EAEVSINEDGTLFDGRPIESLSLIDAVMPDVVQTRQQAYRDKLAQQQTAAAAAAAAAAAA
TPNNQQPPAKNGENTANGEENGGHALANNHTDMMEVDGDVEIPSSKAVVLRGHESEVFIC
AWNPVSDLLASGSGDSTARIWNLSENSTSGSTQLVLRHCIREGGQDVPSNKDVTSLDWNS
EGTLLATGSYDGFARIWTKDGNLASTLGQHKGPIFALKWNKKGNFILSAGVDKTTIIWDA
HTGEAKQQFPFHSAPALDVDWQSNNTFASCSTDMCIHVCKLGQDRPIKTFQGHTNEVNAI
KWDPTGNLLASCSDDMTLKIWSMKHDTCVHDLQAHNKEIYTIKWSPTGPGTNNPNANLML
ASASFDSTVRLWDVDRGICIHTLTKHQEPVYSVAFSPDGRYLASGSFDKCVHIWNTQTGA
LVHSYRGTGGIFEVCWNAAGDKVGASASDGSVCVLDLRK
>sp|Q9FGE9|TBL12_ARATH Protein trichome birefringence-like 12 OS=Arabidopsis thaliana OX=3702 GN=TBL12 PE=2 SV=1
MELGSRRIYTTMPSKLRSSSSLLPRILLLSLLLLLFYSLILRRPITSNIASPPPCDLFSG
RWVFNPETPKPLYDETCPFHRNAWNCLRNKRDNMDVINSWRWEPNGCGLSRIDPTRFLGM
MRNKNVGFVGDSLNENFLVSFLCILRVADPSAIKWKKKKAWRGAYFPKFNVTVAYHRAVL
LAKYQWQARSSAEANQDGVKGTYRVDVDVPANEWINVTSFYDVLIFNSGHWWGYDKFPKE
TPLVFYRKGKPINPPLDILPGFELVLQNMVSYIQREVPAKTLKFWRLQSPRHFYGGDWNQ
NGSCLLDKPLEENQLDLWFDPRNNGVNKEARKINQIIKNELQTTKIKLLDLTHLSEFRAD
AHPAIWLGKQDAVAIWGQDCMHWCLPGVPDTWVDILAELILTNLKTE

Result (SwissProt.header):

>sp|Q664P8|TAUB_YERPS Taurine import ATP-binding protein TauB OS=Yersinia pseudotuberculosis serotype I (strain IP32953) OX=273123 GN=tauB PE=3 SV=1
>sp|Q66K14|TBC9B_HUMAN TBC1 domain family member 9B OS=Homo sapiens OX=9606 GN=TBC1D9B PE=1 SV=3
>sp|Q8K9I1|SYV_BUCAP Valine--tRNA ligase OS=Buchnera aphidicola subsp. Schizaphis graminum (strain Sg) OX=198804 GN=valS PE=3 SV=1
>sp|Q664P8|TAUB_YERPS Taurine import ATP-binding protein TauB OS=Yersinia pseudotuberculosis serotype I (strain IP32953) OX=273123 GN=tauB PE=3 SV=1
>sp|Q66K14|TBC9B_HUMAN TBC1 domain family member 9B OS=Homo sapiens OX=9606 GN=TBC1D9B PE=1 SV=3
>sp|Q8E4B4|TARI_STRA3 Ribitol-5-phosphate cytidylyltransferase OS=Streptococcus agalactiae serotype III (strain NEM316) OX=211110 GN=tarI PE=3 SV=1
>sp|B3CQ06|SYS_WOLPP Serine--tRNA ligase OS=Wolbachia pipientis subsp. Culex pipiens (strain wPip) OX=570417 GN=serS PE=3 SV=1
>sp|Q9ATB4|TAD2B_ARATH Transcriptional adapter ADA2b OS=Arabidopsis thaliana OX=3702 GN=ADA2B PE=1 SV=1
>sp|Q83JA5|SYW_SHIFL Tryptophan--tRNA ligase OS=Shigella flexneri OX=623 GN=trpS PE=3 SV=1
>sp|P14213|TAC1_TACTR Tachyplesin-1 OS=Tachypleus tridentatus OX=6853 PE=1 SV=2
>sp|Q7SZM9|TB1RA_XENLA F-box-like/WD repeat-containing protein TBL1XR1-A OS=Xenopus laevis OX=8355 GN=tbl1xr1-a PE=1 SV=1
>sp|Q9FGE9|TBL12_ARATH Protein trichome birefringence-like 12 OS=Arabidopsis thaliana OX=3702 GN=TBL12 PE=2 SV=1
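
Each extracted header has the shape `db|accession|entry_name description`, so it can be split into fields if needed. The sketch below assumes exactly that layout (two `|` separators in the identifier, then a space before the description). `HeaderFields` and `parse_header` are hypothetical names used only for illustration, and a leading `>` is accepted but not required.

```python
from typing import NamedTuple


class HeaderFields(NamedTuple):
    database: str      # e.g. "sp"
    accession: str     # e.g. "P14213"
    entry_name: str    # e.g. "TAC1_TACTR"
    description: str   # everything after the first space


def parse_header(line: str) -> HeaderFields:
    """Split one header line of the form db|accession|entry_name description."""
    text = line.strip()
    if text.startswith(">"):
        text = text[1:]  # drop the FASTA header marker if present
    identifier, _, description = text.partition(" ")
    # Assumption: the identifier always has exactly three '|'-separated parts.
    database, accession, entry_name = identifier.split("|")
    return HeaderFields(database, accession, entry_name, description)


fields = parse_header(
    ">sp|P14213|TAC1_TACTR Tachyplesin-1 OS=Tachypleus tridentatus OX=6853 PE=1 SV=2"
)
print(fields.accession, fields.entry_name)  # prints: P14213 TAC1_TACTR
```

A header that does not follow this layout makes the tuple unpacking raise a `ValueError`, which is a simple way to notice unexpected lines.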
