Bioinformatic
xxxxy314
Local BLAST
Current local BLAST version:
Download: wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/LATEST/
Extract: tar zxvpf ncbi-blast-2.2.31+-x64-linux.tar.gz
This creates a folder: ncbi-blast-2.2.31+
Under bash, the followi…
Original · 2015-10-16 15:34:01 · 1022 views · 0 comments
Command-line notes
Extract a subset of the sequences in a FASTQ file for testing: zcat ERR022075pe.fasta.gz | head -n 1500000 > subset.fasta
Original · 2015-10-07 12:48:49 · 344 views · 0 comments
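The same subsetting can be sketched in Python. This is a minimal sketch that assumes standard four-line FASTQ records; the function name `head_fastq` and the file names in the usage note are hypothetical, not taken from the original post.

```python
import gzip
from itertools import islice


def head_fastq(src_path, dst_path, n_records):
    """Copy the first n_records FASTQ records from a gzipped file.

    Assumes each record is exactly four lines (header, sequence, '+', qualities).
    """
    with gzip.open(src_path, "rt") as src, open(dst_path, "w") as dst:
        # Counting in whole records avoids cutting a read in half.
        dst.writelines(islice(src, 4 * n_records))
```

For example, `head_fastq("reads.fastq.gz", "subset.fastq", 375000)` would keep 1,500,000 lines, the same number as the shell command above.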
Read alignment后的质量控制
Drop low-quality alignments: Each alignment is given a quality score (MAPQ) which is equivalent to Phred scores (with respect to determining relative quality of different alignments). You can screen f…
Original · 2015-10-07 13:46:42 · 847 views · 0 comments
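As a rough illustration of MAPQ screening, here is a minimal sketch that filters SAM text lines by the MAPQ column (the fifth field in the SAM format). The threshold of 20 and the function names are assumptions for illustration, not values from the original post.

```python
def mapq_error_probability(mapq):
    """Phred-style conversion: probability that the mapping position is wrong."""
    return 10 ** (-mapq / 10)


def filter_sam_by_mapq(lines, min_mapq=20):
    """Yield header lines and alignments whose MAPQ is at least min_mapq.

    min_mapq=20 is an illustrative default, not a recommendation from the post.
    """
    for line in lines:
        if line.startswith("@"):
            # Header lines carry no MAPQ and are passed through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[4]) >= min_mapq:  # column 5 of a SAM record is MAPQ
            yield line
```

A MAPQ of 20 corresponds to an estimated 1 in 100 chance that the read is placed at the wrong position.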
Genome denovo assembly using velvet
Velvet is a de novo assembler suitable for small genomes. It is based on a de Bruijn graph, and we must define the k-mer length when using it. In practice you should try several assemblies with different values…
Reposted · 2015-10-07 12:43:35 · 351 views · 0 comments
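To see why the k-mer length matters, a textbook de Bruijn graph can be sketched in a few lines. This is a simplified illustration, not Velvet's internal data structure; the function name `de_bruijn_graph` is hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer contributes one edge: its prefix -> its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph
```

Building the graph for the same reads with several values of `k` shows how the number of nodes and branches changes with the k-mer length, which is one reason to compare several assemblies.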
NGS library construction (to be continued)
Basic principle: Fundamental to NGS library construction is the preparation of the nucleic acid target, RNA or DNA, into a form that is compatible with the sequencing system to be used (Figure 1).
Original · 2015-10-07 08:54:28 · 1055 views · 0 comments
Questions (to be continued)
Why call SNPs? How can we tell which mismatches represent real mutations and which are just noise?
Original · 2015-10-06 14:25:45 · 290 views · 0 comments
VCFtools quality filtering
VCFtools provides a wide range of functionality for the filtering, analysis and transformation of VCF files. Typically, SNPs of quality < 20 and read depth < 20 are filtered out as they are considere…
Reposted · 2015-10-07 14:15:06 · 807 views · 0 comments
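The thresholds mentioned above can be illustrated without VCFtools itself. The sketch below is an assumption-laden stand-in: it reads QUAL from column 6 and the site depth from the `DP` key of the INFO column, and it treats a missing value as a failure. The function name `passes_filters` is hypothetical.

```python
def passes_filters(vcf_line, min_qual=20.0, min_depth=20):
    """Return True if a VCF data line meets both the QUAL and INFO/DP thresholds."""
    fields = vcf_line.rstrip("\n").split("\t")
    qual = fields[5]  # column 6: QUAL
    # Column 8: INFO, a ';'-separated list of KEY=VALUE pairs and bare flags.
    info = dict(item.split("=", 1) for item in fields[7].split(";") if "=" in item)
    if qual == "." or float(qual) < min_qual:
        return False
    depth = info.get("DP")
    # Assumption: a record without a DP value is dropped rather than kept.
    return depth is not None and int(depth) >= min_depth
```

A record is kept only when both values reach 20; a site that fails either threshold is removed.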
Example collection (to be continued)
An example of GO annotation: In an example of GO annotation, the gene product "cytochrome c" can be described by the Molecular Function term "oxidoreductase activity", the Biological Process terms "oxidative phospho…
Original · 2015-10-06 14:06:20 · 290 views · 0 comments
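The origin of the Phred33 and Phred64 quality score systems and their practical impact on quality control
While studying quality control recently, I ran into some questions about quality score systems and the conversion between them, and tried a few things; with the cluster down, I am summarizing them here. Quality score systems: compared with the QC material covered in earlier training, the pipeline (that I received) has one extra step, phred33to64, which converts .fastq data from the Phred33 quality score system to the Phred64 quality score system, so I first read up on quality score systems. It all starts with the quality value: the data coming off the sequencing instrument…
Reposted · 2015-10-07 14:36:54 · 8396 views · 0 comments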
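What a phred33to64 step does can be sketched directly: the two encodings store the same quality value Q as the ASCII character with code Q + 33 or Q + 64, so converting shifts every character by 31. The function names below are hypothetical and are not the pipeline's actual script.

```python
def decode_qualities(quality, offset=33):
    """Turn a FASTQ quality string into a list of integer Phred scores."""
    return [ord(char) - offset for char in quality]


def phred33_to_phred64(quality):
    """Re-encode a Phred33 quality string as Phred64 (shift each character by 31)."""
    return "".join(chr(ord(char) + 31) for char in quality)
```

Decoding the converted string with `offset=64` gives the same scores as decoding the original with `offset=33`; reading a Phred64 file as Phred33 would instead inflate every score by 31.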
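Strand-specific transcriptome sequencing
Conventional transcriptome sequencing first fragments the mRNA and then synthesizes double-stranded cDNA with random primers. As a result, the reads from a conventional transcriptome carry no strand-orientation information, which makes antisense transcripts hard to identify and means the data do not faithfully reflect transcription. Strand-specific transcriptome sequencing (ssRNA-SEQ) means that, during library construction, a high-fidelity Taq enzyme is used to preserve the orientation of the mRNA strand in the sequencing library. Analysis of the sequencing data can then determine whether a transcript comes from the sense or the antisense DNA strand. Compared with ordinary transcriptome sequencing, it can more accurately count transcript…
Reposted · 2015-10-06 21:50:56 · 3313 views · 0 comments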
About Bowtie
First, building an index from a reference genome. Below is an introduction from the official site: bowtie-build builds a Bowtie index from a set of DNA sequences. bowtie-build outputs a set of 6 files with suffixes .1.ebwt, .2.ebwt, .3.ebwt, .4.ebwt, .rev.1.ebwt, a…
Original · 2015-10-06 14:32:26 · 1724 views · 0 comments
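A quick way to check that an index build finished is to look for all six files. This is a minimal sketch: the sixth suffix, .rev.2.ebwt, comes from the Bowtie manual rather than from the truncated excerpt above, the helper name `missing_index_files` is hypothetical, and large indexes (which use a different extension) are not covered.

```python
from pathlib import Path

# Suffixes of a small Bowtie 1 index; the last one is taken from the Bowtie manual.
EBWT_SUFFIXES = (
    ".1.ebwt", ".2.ebwt", ".3.ebwt", ".4.ebwt", ".rev.1.ebwt", ".rev.2.ebwt",
)


def missing_index_files(basename):
    """Return the expected index files for basename that do not exist on disk."""
    return [
        basename + suffix
        for suffix in EBWT_SUFFIXES
        if not Path(basename + suffix).exists()
    ]
```

An empty list means all six files are present for that index basename.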
About Trimmomatic
Title: Trimmomatic: A flexible trimmer for Illumina Sequence Data
Trimmomatic is a more flexible and efficient pre-processing tool that correctly handles paired-end data.
Download: http://www.usad…
Original · 2015-10-05 19:21:44 · 1344 views · 0 comments
Using DAVID for GO and pathway enrichment analysis
URL: https://david.ncifcrf.gov/
Steps: Upload or paste a gene list. To start DAVID, first click on "Functional Annotation" under "Shortcut to David tools" at the left of the home page. This will take…
Reposted · 2015-10-06 09:52:28 · 864 views · 0 comments
Gene ID conversion
Retrieve/ID mapping
Enter identifiers, separated by a space or a new line, into the form field, for example:
P31946 P62258
ALBU_HUMAN
EFTU_ECOLI
If you need to convert to another identifi…
Original · 2015-10-05 21:04:30 · 1750 views · 0 comments
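Why run a sequence alignment (BLAST) before GO and KEGG annotation?
Before functional annotation and pathway annotation, we first align the differentially expressed proteins against the protein sequences in a suitable database. Purpose 1: many species have so far been studied only to a limited extent, and their protein annotation is still far from complete. By the similarity principle, proteins with similar sequences are likely to have similar functions, so we can transfer the annotation of the homologous proteins found by BLAST to the differential proteins we care about, which completes their annotation, especially for proteins from understudied species. Purpose 2: during the database search, in order to obtain more…
Reposted · 2015-10-05 19:03:42 · 10775 views · 0 comments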
Extract lowercase masked FASTA from a BLAST database with masking information
If a BLAST database contains masking information, this can be extracted using the blastdbcmd options -db_mask and -mask_sequence as follows:
$ blastdbcmd -info -db mask-data-db
Database: Mask data te…
Reposted · 2015-10-17 09:46:40 · 626 views · 0 comments
To be organized
Seed-and-extend aligners: An alignment strategy that first builds a hash table containing the location of each k-mer (seed) within the reference genome. These algorithms then extend these seeds…
Original · 2015-10-15 13:54:18 · 310 views · 0 comments
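The seed-and-extend idea can be sketched in miniature. This toy version hashes every k-mer of the reference, seeds with only the first k-mer of the read, and "extends" by counting mismatches without gaps; real aligners are far more elaborate. The names `build_seed_index` and `seed_and_extend` and the `max_mismatches` parameter are hypothetical.

```python
from collections import defaultdict


def build_seed_index(reference, k):
    """Hash table: k-mer -> list of start positions in the reference."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def seed_and_extend(read, reference, index, k, max_mismatches=1):
    """Return (reference_start, mismatches) for each acceptable ungapped hit."""
    hits = []
    for pos in index.get(read[:k], []):  # seed: exact match of the first k-mer
        window = reference[pos:pos + len(read)]
        if len(window) < len(read):
            continue  # the read would run off the end of the reference
        # Extend: compare the whole read against the reference window.
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches:
            hits.append((pos, mismatches))
    return hits
```

The hash lookup narrows the search to a few candidate positions, so the more expensive comparison runs only where a seed already matches exactly.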