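The joint CNV calling algorithm in the PennCNV package is designed to call CNVs from parent-offspring trios, and it performs better than the trio-calling algorithm described in the preceding Trio calling section of this tutorial. For details on the joint-calling algorithm, see the paper by Wang et al. in NAR. (The current joint-calling algorithm has been rewritten and incorporated directly into the main detect_cnv.pl program.) Unlike the trio-calling algorithm, which applies posterior validation to individual-based CNV calls, the joint-calling algorithm generates CNV calls for all three individuals in a family in a single step.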
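The joint-calling algorithm performs better than the current family-based CNV calling, especially in resolving correct CNV boundaries and in reducing the false negative rate for very small (< 10 SNPs) CNV calls. However, it is very slow and may take several hours for a single trio genotyped on 550K markers. To use this algorithm, specify the --joint argument on the command line instead of the --trio argument. For example: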
[kaiwang@cc penncnv]$ detect_cnv.pl -joint -hmm lib/hh550.hmm -pfb lib/hh550.hg18.pfb sample1.txt sample2.txt sample3.txt -out sampleall.jointcnv
NOTICE: Reading marker coordinates and population frequency of B allele (PFB) from lib/hh550.hg18.pfb ... Done with 566108 records (178 records in chr M,XY were discarded)
NOTICE: Reading LRR and BAF values for from sample1.txt ... Done with 561288 records in 24 chromosomes (178 records are discarded due to lack of PFB information for the markers)
NOTICE: Data from chromosome X,Y will not be used in analysis
NOTICE: Median-adjusting LRR values for all markers by -0.0184
NOTICE: Reading LRR and BAF values for from sample2.txt ... Done with 561288 records in 24 chromosomes (178 records are discarded due to lack of PFB information for the markers)
NOTICE: Data from chromosome X,Y will not be used in analysis
NOTICE: Median-adjusting LRR values for all markers by 0.0233
NOTICE: Reading LRR and BAF values for from sample3.txt ... Done with 561288 records in 24 chromosomes (178 records are discarded due to lack of PFB information for the markers)
NOTICE: Data from chromosome X,Y will not be used in analysis
NOTICE: Median-adjusting LRR values for all markers by -0.0084
NOTICE: Calling CNVs in chromosome 1 with 42075 markers
NOTICE: Finished recursion cycle 1000 in Viterbi algorithm
NOTICE: Finished recursion cycle 2000 in Viterbi algorithm
NOTICE: Finished recursion cycle 3000 in Viterbi algorithm
NOTICE: Finished recursion cycle 4000 in Viterbi algorithm
NOTICE: Finished recursion cycle 5000 in Viterbi algorithm
NOTICE: Finished recursion cycle 6000 in Viterbi algorithm
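As the command line above shows, unlike the -trio argument, the joint-calling algorithm does not require the CNV file generated by the individual-based calling algorithm as an input file.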
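Because a single trio can take hours, it can be convenient to generate one command per trio and launch the runs separately. The following is a minimal sketch, assuming Python 3.8 or later; the function name build_joint_command and its default file paths are illustrative assumptions, not part of PennCNV. It uses only the flags that appear in the example above (-joint, -hmm, -pfb, -out) and passes the three signal files in the same order as that example.

```python
import shlex


def build_joint_command(signal_files, out_file,
                        hmm="lib/hh550.hmm", pfb="lib/hh550.hg18.pfb"):
    """Return the argument list for one joint-calling run on a single trio.

    Hypothetical helper: the defaults simply mirror the example command.
    """
    if len(signal_files) != 3:
        raise ValueError("joint calling expects exactly three signal files")
    return ["detect_cnv.pl", "-joint", "-hmm", hmm, "-pfb", pfb,
            *signal_files, "-out", out_file]


cmd = build_joint_command(
    ["sample1.txt", "sample2.txt", "sample3.txt"], "sampleall.jointcnv")
# Print the command as a single shell-quoted line for inspection.
print(shlex.join(cmd))
```

The returned list can be handed to subprocess.run(cmd, check=True), for instance from a cluster job script, so that several trios are processed concurrently on separate cores.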
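The joint-calling algorithm supports trios only. More complex nuclear families are best processed with the -trio and -quartet operations described in the Trio calling section of this tutorial.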
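The trio-only restriction can be checked before any jobs are launched. The sketch below is a hypothetical helper, not part of PennCNV: it assumes each family is described by a dictionary with father, mother, and offspring keys (names chosen purely for illustration) and separates families with exactly one offspring, which are suitable for joint calling, from the rest, which would be left to the -trio and -quartet operations.

```python
def split_families(families):
    """Separate plain trios from larger or incomplete nuclear families."""
    joint, other = [], []
    for fam in families:
        # A trio has both parents and exactly one offspring signal file.
        is_trio = (bool(fam.get("father")) and bool(fam.get("mother"))
                   and len(fam.get("offspring", [])) == 1)
        (joint if is_trio else other).append(fam)
    return joint, other


families = [
    {"father": "f1.txt", "mother": "m1.txt", "offspring": ["c1.txt"]},
    {"father": "f2.txt", "mother": "m2.txt",
     "offspring": ["c2a.txt", "c2b.txt"]},
]
joint, other = split_families(families)
print(len(joint), "trio(s) for joint calling;", len(other), "other family(ies)")
```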