vcftools on Linux: installing with conda and basic usage

Install from the shell with conda:

conda install -c bioconda vcftools

Excerpts from the vcftools documentation:

OUTPUT FILE OPTIONS

--out <output_prefix>

This option defines the output filename prefix for all files generated by vcftools. For example, if <output_prefix> is set to output_filename, then all output files will be of the form output_filename.***. If this option is omitted, all output files will have the prefix "out." in the current working directory.

--stdout
-c

These options direct the vcftools output to standard out so it can be piped into another program or written directly to a filename of choice. However, a select few output functions cannot be written to standard out.
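
As a sketch of the piping described above, the helper below recodes a VCF and compresses the result with gzip in one step. The function name recode_to_gzip and the file names in the example call are hypothetical and do not come from the vcftools documentation; only the vcftools flags do.

```shell
# recode_to_gzip INPUT OUTPUT
# Recode INPUT with vcftools and write the result to OUTPUT as gzip.
# --stdout sends the VCF to standard output instead of <prefix>.recode.vcf.
recode_to_gzip() {
    vcftools --vcf "$1" --recode --stdout | gzip -c > "$2"
}

# Example call (both file names are placeholders):
# recode_to_gzip input.vcf filtered.vcf.gz
```

Any filtering options can be added before --recode in the same way as when writing to a file with --out.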

--temp <temporary_directory>

This option can be used to redirect any temporary files that vcftools creates into a specified directory.



ALLELE FILTERING

--maf <float>
--max-maf <float>

Include only sites with a Minor Allele Frequency greater than or equal to the "--maf" value and less than or equal to the "--max-maf" value. One of these options may be used without the other. Allele frequency is defined as the number of times an allele appears over all individuals at that site, divided by the total number of non-missing alleles at that site.
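
To make the frequency definition concrete, here is a small worked sketch for a single bi-allelic site. It is not how vcftools computes the value internally; the function name maf_from_genotypes is hypothetical, and it assumes the genotypes are passed as plain GT strings such as 0|1 or ./.

```shell
# maf_from_genotypes GT...
# Print the minor allele frequency of one bi-allelic site, given its GT strings.
maf_from_genotypes() {
    printf '%s\n' "$@" | awk '
        {
            n = split($0, a, "[|/]")        # phased (|) or unphased (/) separator
            for (i = 1; i <= n; i++) {
                if (a[i] == ".") continue   # missing alleles are not counted
                total++
                if (a[i] != "0") alt++      # anything other than REF counts as ALT
            }
        }
        END {
            if (total == 0) { print "NA"; exit }
            af = alt / total
            printf "%.4f\n", (af < 0.5 ? af : 1 - af)
        }'
}

# Four samples, one of them missing: 1 ALT allele out of 6 non-missing alleles.
maf_from_genotypes '0|0' '0|1' '0|0' './.'   # prints 0.1667
```

A site with this MAF lies between 0.001 and 0.5, so it would pass "--maf 0.001 --max-maf 0.5".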




OUTPUT VCF FORMAT

--recode
--recode-bcf

These options are used to generate a new file in either VCF or BCF from the input VCF or BCF file after applying the filtering options specified by the user. The output file has the suffix ".recode.vcf" or ".recode.bcf". By default, the INFO fields are removed from the output file, as the INFO values may be invalidated by the recoding (e.g. the total depth may need to be recalculated if individuals are removed). This behavior may be overridden by the following options. By default, BCF files are written out as BGZF compressed files.

--recode-INFO <string>
--recode-INFO-all

These options can be used with the above recode options to define an INFO key name to keep in the output file. This option can be used multiple times to keep more of the INFO fields. The second option is used to keep all INFO values in the original file.
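
The following sketch shows --recode-INFO given twice to keep two INFO keys and drop the rest. The keys AF and DP are an assumption about what the input header declares, and the function name keep_af_dp is hypothetical; check the ##INFO lines of your own file before choosing keys.

```shell
# keep_af_dp INPUT PREFIX
# Recode INPUT to PREFIX.recode.vcf, keeping only the AF and DP INFO keys.
# AF and DP are assumed to be present in the input; adjust as needed.
keep_af_dp() {
    vcftools --vcf "$1" --recode --recode-INFO AF --recode-INFO DP --out "$2"
}

# Example call (file names are placeholders):
# keep_af_dp input.vcf kept_info
```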

Filter by allele frequency, keeping sites with 0.001 <= MAF <= 0.5 and preserving the INFO fields:

$ vcftools --vcf ALL.chr22.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes.vcf --maf 0.001 --max-maf 0.5 --recode --recode-INFO-all --out chr22_af_filter
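
Once the command has finished, one quick sanity check is to compare the number of variant records before and after filtering. This is a minimal sketch for uncompressed VCF files; the function name count_records is hypothetical.

```shell
# count_records FILE
# Print the number of variant records in an uncompressed VCF,
# i.e. the lines that do not start with '#'.
count_records() {
    grep -vc '^#' "$1"
}

# Example calls (file names taken from the command above):
# count_records ALL.chr22.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes.vcf
# count_records chr22_af_filter.recode.vcf
```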


Output file: chr22_af_filter.recode.vcf (the --out prefix followed by the ".recode.vcf" suffix).
Notes:

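--maf, --max-maf: filter on minor allele frequency (MAF). A common setting is --maf 0.05, which keeps sites with a MAF of at least 0.05.
--non-ref-af, --non-ref-ac, …: keep only sites whose non-reference (ALT) allele frequencies (af) or counts (ac) all fall within the specified range.
--mac <integer>, --max-mac <integer>: keep sites whose minor allele count is at least the --mac value and at most the --max-mac value.
--min-alleles 2 --max-alleles 2: keep only bi-allelic sites, i.e. sites with exactly two alleles (REF plus one ALT). Commonly used.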
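
As an illustration of combining the filters in these notes, the sketch below keeps bi-allelic sites with a MAF of at least 0.05 and preserves all INFO fields. The function name common_biallelic and the file names in the example call are hypothetical; the 0.05 threshold is simply the common value mentioned above, not a recommendation for every data set.

```shell
# common_biallelic INPUT PREFIX
# Keep bi-allelic sites with MAF >= 0.05 and write PREFIX.recode.vcf
# with all INFO fields preserved.
common_biallelic() {
    vcftools --vcf "$1" --min-alleles 2 --max-alleles 2 --maf 0.05 \
        --recode --recode-INFO-all --out "$2"
}

# Example call (file names are placeholders):
# common_biallelic input.vcf common_biallelic
```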
