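With the InterPro database, protein sequences can be classified into families, and their domains and important sites can be predicted. InterPro integrates several member databases into a single combined resource. These databases are: PROSITE, HAMAP, Pfam, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, CATH-Gene3D and PANTHER.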
Method 1: Web version
http://www.ebi.ac.uk/interpro/
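Paste the sequences into the input box to run the InterPro annotation.
Advantages: convenient.
Disadvantages: the input must be protein sequence; InterProScan accepts at most 25 protein sequences per query.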
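Because of the 25-sequence limit, a larger FASTA file has to be split before it can be pasted into the web form. The following is a minimal sketch of one way to do that in Python; the function names and the default chunk size parameter are illustrative assumptions, not part of any InterPro tool.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, seq_lines = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(seq_lines)))
            header, seq_lines = line[1:], []
        elif header is not None:
            seq_lines.append(line)
    if header is not None:
        records.append((header, "".join(seq_lines)))
    return records


def chunk_records(records, size=25):
    """Split records into consecutive batches of at most `size` sequences."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def format_fasta(records):
    """Turn (header, sequence) tuples back into FASTA text."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)
```

Each batch returned by chunk_records can be written out with format_fasta and submitted as a separate query.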
Method 2: Remote annotation with the scripts provided by EBI
EBI:The European Bioinformatics Institute
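The Perl program provided by EBI is recommended for InterPro annotation. It sends the sequences to the official server for InterPro annotation and then returns the results to the local machine.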
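The submit, wait, and fetch cycle that such a client performs can be sketched in Python roughly as follows. This is not the EBI script itself. The base URL, the endpoint names (run, status, result), the form fields (email, sequence) and the failure status strings are assumptions about the REST service and should be checked against the EBI documentation; the "tsv" result type is taken from the result file names the Perl client produces.

```python
import time
import urllib.parse
import urllib.request

# Assumed base URL of the InterProScan 5 REST service.
BASE_URL = "https://www.ebi.ac.uk/Tools/services/rest/iprscan5"


def http_request(url, data=None):
    """GET the URL, or POST form data when `data` is given; return the body as text."""
    if data is not None:
        data = urllib.parse.urlencode(data).encode("utf-8")
    with urllib.request.urlopen(url, data=data) as response:
        return response.read().decode("utf-8")


def run_job(email, sequence, request=http_request, sleep=time.sleep, interval=5):
    """Submit one protein sequence, poll until the job finishes, return (job_id, tsv)."""
    # Submit the job; the service is assumed to answer with the job ID as plain text.
    job_id = request(BASE_URL + "/run", {"email": email, "sequence": sequence}).strip()
    while True:
        status = request(BASE_URL + "/status/" + job_id).strip()
        if status == "FINISHED":
            break
        # Assumed failure states; any other status (e.g. RUNNING) means keep waiting.
        if status in ("ERROR", "FAILURE", "NOT_FOUND"):
            raise RuntimeError(f"job {job_id} ended with status {status}")
        sleep(interval)
    return job_id, request(BASE_URL + "/result/" + job_id + "/tsv")
```

The request and sleep parameters exist only so the polling logic can be exercised without network access; in normal use they keep their defaults.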
Script download page: http://www.ebi.ac.uk/Tools/Webservices/services/pfa/iprscan5_rest
Three client programs are available there, one each in Perl, Python and Ruby: iprscan5_lwp.pl, iprscan5_urllib2.py and iprscan5_net_http.rb
[Required]
  seqFile        : file : query sequence ("-" for STDIN, @filename for
                          identifier list file)
[Optional]
  --appl         : str  : Comma separated list of signature methods to run,
                          see --paramDetail appl.
  --goterms      :      : retrieve GO terms
  --nogoterms    :      : do not retrieve GO terms
  --pathways     :      : retrieve pathway terms
  --nopathways   :      : do not retrieve pathway terms
  --multifasta   :      : treat input as a set of fasta formatted sequences
[General]
  -h, --help     :      : prints this help text
  --async        :      : forces to make an asynchronous query
  --email        : str  : e-mail address
  --title        : str  : title for job
  --status       :      : get job status
  --resultTypes  :      : get available result types for job
  --polljob      :      : poll for the status of a job
  --jobid        : str  : jobid that was returned when an asynchronous job
                          was submitted.
  --outfile      : str  : file name for results (default is jobid;
                          "-" for STDOUT)
  --useSeqId     :      : use sequence identifiers for output filenames.
                          Only available in multifasta or list file modes.
  --maxJobs      : int  : maximum number of concurrent jobs. Only
                          available in multifasta or list file modes.
  --outformat    : str  : result format to retrieve
  --params       :      : list input parameters
  --paramDetail  : str  : display details for input parameter
  --quiet        :      : decrease output
  --verbose      :      : increase output
Synchronous job:
  The results/errors are returned as soon as the job is finished.
  Usage: iprscan5_lwp.pl --email <your@email> [options...] seqFile
  Returns: results as an attachment
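When the client is driven from another program, the command lines can be assembled from the options in the help text. Below is a small sketch in Python that uses only flags listed there (--email, --multifasta, --async, --outformat, --polljob, --jobid, --outfile). The helper names are hypothetical, and whether a particular combination of flags is accepted should be confirmed with the script's own --help.

```python
def build_submit_command(email, seq_file, script="iprscan5_lwp.pl",
                         multifasta=False, asynchronous=False, outformat=None):
    """Build the argument list for submitting a job with the Perl client."""
    cmd = ["perl", script, "--email", email]
    if multifasta:
        cmd.append("--multifasta")   # input holds several FASTA sequences
    if asynchronous:
        cmd.append("--async")        # return the job ID without waiting
    if outformat:
        cmd += ["--outformat", outformat]
    cmd.append(seq_file)             # the sequence file goes last
    return cmd


def build_poll_command(job_id, script="iprscan5_lwp.pl", outfile=None):
    """Build the argument list for polling a previously submitted job."""
    cmd = ["perl", script, "--polljob", "--jobid", job_id]
    if outfile:
        cmd += ["--outfile", outfile]
    return cmd
```

The returned lists can be passed to subprocess.run(cmd, check=True); using a list rather than a single string avoids shell quoting problems with file names.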
Advantages:
Disadvantages: nucleotide sequences cannot be annotated.
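One possible workaround for nucleotide input is to translate it into protein locally before submission. The sketch below translates a DNA sequence in all six reading frames using the standard genetic code. It is only an illustration: the function names are made up, choosing the correct frame or open reading frame is left to the reader, and stop codons appear as "*" in the output.

```python
BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna, frame=0):
    """Translate DNA starting at offset `frame` (0, 1 or 2); unknown codons become X."""
    dna = dna.upper().replace("U", "T")
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frames(dna):
    """Translate the three forward frames, then the three reverse-complement frames."""
    rc = reverse_complement(dna)
    return [translate(strand, frame) for strand in (dna, rc) for frame in range(3)]
```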
$ perl iprscan5_lwp.pl --email fsczhenjiang@foxmail.com test.fa
Output:
JobId: iprscan5-R20160605-043400-0109-32295822-es
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
FINISHED
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.out.txt
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.log.txt
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.tsv.txt
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.xml.xml
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.htmltarball.html.tar.gz
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.gff.txt
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.svg.svg
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.sequence.txt
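Of these files, the .tsv.txt file is usually the easiest to process further. The following Python sketch reads it into a list of dictionaries. The column names are an assumption based on the commonly documented InterProScan 5 tab-separated layout (the last four columns are optional), so they should be checked against the actual file.

```python
# Assumed column order of the InterProScan 5 TSV output.
COLUMNS = [
    "protein_accession", "md5", "length", "analysis",
    "signature_accession", "signature_description",
    "start", "stop", "score", "status", "date",
    "interpro_accession", "interpro_description",
    "go_terms", "pathways",
]


def parse_tsv(text):
    """Parse InterProScan TSV text into a list of dicts keyed by COLUMNS."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        # Pad missing optional columns with empty strings.
        fields += [""] * (len(COLUMNS) - len(fields))
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


def load_tsv(path):
    """Read a .tsv.txt result file from disk and parse it."""
    with open(path, encoding="utf-8") as handle:
        return parse_tsv(handle.read())
```

For example, load_tsv("iprscan5-R20160605-043400-0109-32295822-es.tsv.txt") would return one dictionary per match, from which the rows of a single member database can be selected by the "analysis" key.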