CNV
Gusnanto A, Taylor C C, Nafisah I, et al. Estimating optimal window size for analysis of low-coverage next-generation sequence data[J]. Bioinformatics, 2014, 30(13): 1823-1829.
hgvs
http://varnomen.hgvs.org/recommendations/general/
“c.” for a coding DNA reference sequence
“g.” for a linear genomic reference sequence
“m.” for a mitochondrial DNA reference sequence
“n.” for a non-coding DNA reference sequence
“o.” for a circular genomic reference sequence
“p.” for a protein reference sequence
“r.” for an RNA reference sequence (transcript)
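The prefix table above can be turned into a small lookup. This is only a sketch: `hgvs_ref_type` is a hypothetical helper name (not part of any HGVS tool), and the variant descriptions passed to it are purely illustrative.

```shell
#!/usr/bin/env bash
# Map the one-letter HGVS prefix of a variant description to its reference sequence type.
# hgvs_ref_type is a hypothetical helper, written here for illustration only.
hgvs_ref_type() {
    local desc="${1##*:}"   # drop an optional "accession:" part, e.g. "NM_004006.2:"
    case "$desc" in
        c.*) echo "coding DNA" ;;
        g.*) echo "linear genomic" ;;
        m.*) echo "mitochondrial DNA" ;;
        n.*) echo "non-coding DNA" ;;
        o.*) echo "circular genomic" ;;
        p.*) echo "protein" ;;
        r.*) echo "RNA" ;;
        *)   echo "unknown"; return 1 ;;
    esac
}

hgvs_ref_type "NM_004006.2:c.4375C>T"
hgvs_ref_type "p.Trp24Cys"
```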
cosmic_hotspot
Variants listed in COSMIC were considered hotspot point mutations if they presented with >=5 mentions
Alternatively, mutations found in at least 50 samples according to the COSMIC database are treated as “hotspots”
Analysis of Tumor Mutational Burden with TruSight® Tumor 170
The sample count is available in CosmicCodingMuts.vcf (downloadable from COSMIC), in the CNT INFO field:
##INFO=<ID=CNT,Number=1,Type=Integer,Description="How many samples have this mutation">
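A minimal sketch of filtering on the CNT field with awk. `cosmic_hotspots` and `sample_vcf` are hypothetical helper names, and the records are made up for illustration; real input would be the CosmicCodingMuts.vcf text piped in on stdin. The threshold (5 or 50 in the definitions above) is passed as the first argument.

```shell
#!/usr/bin/env bash
# Print VCF records whose INFO/CNT value is >= a minimum sample count.
# cosmic_hotspots is a hypothetical helper; it reads VCF text on stdin.
cosmic_hotspots() {
    local min_cnt="${1:-5}"
    awk -F'\t' -v min="$min_cnt" '
        /^#/ { next }                          # skip header lines
        {
            n = split($8, info, ";")           # column 8 is INFO
            for (i = 1; i <= n; i++)
                if (info[i] ~ /^CNT=/) {
                    cnt = substr(info[i], 5) + 0
                    if (cnt >= min) print
                    break
                }
        }'
}

# Made-up records, for illustration only.
sample_vcf() {
    printf '##fileformat=VCFv4.1\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    printf '1\t100\tID_A\tG\tA\t.\t.\tGENE=GENE1;CNT=3\n'
    printf '1\t200\tID_B\tC\tT\t.\t.\tGENE=GENE1;CNT=7\n'
    printf '2\t300\tID_C\tT\tG\t.\t.\tGENE=GENE2;CNT=120\n'
}

sample_vcf | cosmic_hotspots 5     # keeps ID_B and ID_C
sample_vcf | cosmic_hotspots 50    # keeps ID_C only
```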
Sukhai M A, Misyura M, Thomas M, et al. Somatic Tumor Variant Filtration Strategies to Optimize Tumor-Only Molecular Profiling Using Targeted Next-Generation Sequencing Panels[J]. The Journal of Molecular Diagnostics, 2019, 21(2): 261-273.
Distinguish somatic and germline
-
VAF thresholds (20% for small insertions/deletions, 30% for SNVs)
Mandelker D, Donoghue M T A, Talukdar S, et al. Germline-Focused Analysis of Tumour-Only Sequencing: Recommendations from the ESMO Precision Medicine Working Group[J]. Annals of Oncology, 2019.
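A rough sketch of applying the VAF thresholds noted above. The input layout (id, type, ref read count, alt read count), the helper name `flag_by_vaf`, the output labels, and the choice to flag variants at or above the threshold for germline review are all assumptions made for illustration; check the cited recommendations for the exact rule.

```shell
#!/usr/bin/env bash
# Flag variants by VAF using the thresholds from the note: 30% for SNVs, 20% for small indels.
# Assumed tab-separated input columns: variant id, type (SNV or INDEL), ref reads, alt reads.
flag_by_vaf() {
    awk -F'\t' '
        {
            depth = $3 + $4
            if (depth == 0) next                     # no coverage, VAF undefined
            vaf = $4 / depth
            threshold = ($2 == "SNV") ? 0.30 : 0.20  # any non-SNV type is treated as an indel
            label = (vaf >= threshold) ? "review_germline" : "below_threshold"
            printf "%s\t%s\t%.2f\t%s\n", $1, $2, vaf, label
        }'
}

# Made-up read counts, for illustration only.
printf 'v1\tSNV\t60\t40\nv2\tSNV\t80\t20\nv3\tINDEL\t75\t25\nv4\tINDEL\t90\t10\n' | flag_by_vaf
```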
-
machine learning approach
Wood D E, White J R, Georgiadis A, et al. A machine learning approach for somatic mutation discovery[J]. Science translational medicine, 2018, 10(457): eaar7939.
Sun J X, He Y, Sanford E, et al. A computational approach to distinguish somatic vs. germline origin of genomic alterations from deep sequencing of cancer specimens without a matched normal[J]. PLoS computational biology, 2018, 14(2): e1005965.
-
Also discusses how to distinguish somatic mutations from germline variants.
Montgomery N D, Selitsky S R, Patel N M, et al. Identification of Germline Variants in Tumor Genomic Sequencing Analysis[J]. The Journal of molecular diagnostics: JMD, 2018, 20(1): 123-125.
Indel_realigner
-
ABRA (https://github.com/mozack/abra2)
Mose L E, Wilkerson M D, Hayes D N, et al. ABRA: improved coding indel detection via assembly-based realignment[J]. Bioinformatics, 2014, 30(19): 2813-2815.
-
SRMA (https://github.com/nh13/SRMA)
Homer N, Nelson S F. Improved variant discovery through local re-alignment of short-read next-generation sequencing data using SRMA[J]. Genome biology, 2010, 11(10): R99.
MSK
Dataset: http://www.cbioportal.org/study?id=msk_impact_2017
Pipeline: https://impact-pipeline.readthedocs.io/en/latest/index.html#
Zehir A, Benayed R, Shah R H, et al. Mutational landscape of metastatic cancer revealed from prospective clinical sequencing of 10,000 patients[J]. Nature medicine, 2017, 23(6): 703.
FoundationOne
Chalmers Z R, Connelly C F, Fabrizio D, et al. Analysis of 100,000 human cancer genomes reveals the landscape of tumor mutational burden[J]. Genome medicine, 2017, 9(1): 34.
FDA. FoundationOne CDx: Summary of Safety and Effectiveness Data (SSED) (https://www.accessdata.fda.gov/cdrh_docs/pdf17/P170019B.pdf)
common snp
A common SNP is one that has at least one 1000 Genomes population with a minor allele frequency (MAF) >= 1% and for which 2 or more founders contribute to that minor allele frequency
ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh37p13/VCF/
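A minimal sketch for pulling common SNPs out of a dbSNP VCF, assuming the file carries the COMMON INFO tag (COMMON=1 marking records that meet the definition above). `common_snps` and `sample_dbsnp` are hypothetical helper names, and the records are made up; a real run would pipe in the decompressed dbSNP VCF instead.

```shell
#!/usr/bin/env bash
# Keep header lines and records whose INFO column (8) contains COMMON=1.
# common_snps is a hypothetical helper; it reads VCF text on stdin.
common_snps() {
    awk -F'\t' '
        /^#/ { print; next }                   # pass header lines through
        $8 ~ /(^|;)COMMON=1(;|$)/ { print }    # exact tag match, not a substring of another key
    '
}

# Made-up records, for illustration only.
sample_dbsnp() {
    printf '##fileformat=VCFv4.0\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    printf '1\t1000\trs_a\tA\tG\t.\t.\tRS=1;VC=SNV;COMMON=1\n'
    printf '1\t2000\trs_b\tC\tT\t.\t.\tRS=2;VC=SNV;COMMON=0\n'
    printf '1\t3000\trs_c\tG\tA\t.\t.\tRS=3;VC=SNV\n'
}

sample_dbsnp | common_snps
```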
hg19—gtf
1: Download refGene.txt.gz:
ftp://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/refGene.txt.gz
2: Download genePredToGtf and make it executable (chmod +x genePredToGtf):
http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/genePredToGtf
3: gzip -d refGene.txt.gz
4: cut -f 2- refGene.txt > refGene.input    # drop the leading bin column
5: genePredToGtf file refGene.input hg19refGene.gtf
6: sort -k1,1 -k4,4n hg19refGene.gtf > hg19refGene.gtf.sorted
Reference:
http://genomewiki.ucsc.edu/index.php/Genes_in_gtf_or_gff_format
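Once the GTF has been sorted by chromosome and start position as in the last step, the ordering can be verified with `sort -c`, which exits non-zero at the first out-of-order line. A small sketch on made-up records; `is_gtf_sorted` is a hypothetical helper name, and the check should run under the same locale as the original sort.

```shell
#!/usr/bin/env bash
# Return success if a GTF file is ordered by chromosome (column 1), then start (column 4).
# is_gtf_sorted is a hypothetical helper, written here for illustration only.
is_gtf_sorted() {
    sort -c -k1,1 -k4,4n "$1" 2>/dev/null
}

# Made-up GTF records, deliberately out of order.
unsorted=$(mktemp)
sorted=$(mktemp)
{
    printf 'chr2\thg19refGene\texon\t300\t400\t.\t+\t.\tgene_id "B";\n'
    printf 'chr1\thg19refGene\texon\t500\t600\t.\t+\t.\tgene_id "A";\n'
    printf 'chr1\thg19refGene\texon\t100\t200\t.\t+\t.\tgene_id "A";\n'
} > "$unsorted"

# Same sort keys as the last step of the notes.
sort -k1,1 -k4,4n "$unsorted" > "$sorted"

is_gtf_sorted "$unsorted" && echo "input: sorted" || echo "input: NOT sorted"
is_gtf_sorted "$sorted" && echo "output: sorted" || echo "output: NOT sorted"
```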
dbSNP
Based on hg19
ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh37p13/VCF/00-All.vcf.gz
Based on hg38
ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh38p7/VCF/00-All.vcf.gz
gnomAD
Based on hg19
http://hgdownload.cse.ucsc.edu/gbdb/hg19/gnomAD/vcf/
clinvar
ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/