Official site: PLINK 1.9
plink --bfile plink --flip flip.txt --make-bed --out test
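When datasets are merged, the merge fails with an error if one dataset reports alleles on the forward strand and the other reports them on the reverse strand.
There are two ways to handle this. Option one: discard every variant that cannot be merged. Option two: flip the genotypes so that both datasets use the same strand, then merge.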
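Example: if SNP1 is A C in dataset one and T G in dataset two, flipping the strand of one dataset makes the alleles match, and the SNP can then be merged. SNPs whose alleles are A T or C G are a special case: their complement is the same pair, so a strand mismatch at such a SNP does not trigger the merge error and is not fixed by this procedure.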
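To make concrete what a strand flip does to the allele codes, here is a minimal sketch that complements columns 5 and 6 of a .bim file (A<->T, C<->G). The function name flip_bim is hypothetical and the sketch is for illustration only; on real data the flip should be done with plink --flip.

```shell
# Illustration only: complement the two allele columns (5 and 6) of a .bim file.
# flip_bim is a hypothetical helper, not a PLINK command.
flip_bim() {
    awk 'BEGIN {
             OFS = "\t"
             comp["A"] = "T"; comp["T"] = "A"; comp["C"] = "G"; comp["G"] = "C"
         }
         {
             # Codes other than A/C/G/T (for example 0 for missing) are left as-is.
             if ($5 in comp) $5 = comp[$5]
             if ($6 in comp) $6 = comp[$6]
             $1 = $1   # rebuild the record so every line is tab-separated
             print
         }' "$1"
}
```

Calling flip_bim temp.bim prints the flipped table to standard output and leaves the input file untouched.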
When the merge fails for this reason, PLINK reports variants with 3+ alleles:
plink --allow-extra-chr --chr-set 95 --bfile temp --bmerge temp2.bed temp2.bim temp2.fam --make-bed --out all
Error: 7327 variants with 3+ alleles present.
* If you believe this is due to strand inconsistency, try --flip with
all-merge.missnp.
(Warning: if this seems to work, strand errors involving SNPs with A/T or C/G
alleles probably remain in your data. If LD between nearby SNPs is high,
--flip-scan should detect them.)
* If you are dealing with genuine multiallelic variants, we recommend exporting
that subset of the data to VCF (via e.g. '--recode vcf'), merging with
another tool/script, and then importing the result; PLINK is not yet suited
to handling them.
See https://www.cog-genomics.org/plink/1.9/data#merge3 for more discussion.
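The warning above says that strand errors at A/T and C/G SNPs can survive a flip that seems to work. As a rough sketch, such strand-ambiguous SNPs can be listed from a .bim file with awk. The function name ambiguous_snps is hypothetical; the column layout (ID in column 2, alleles in columns 5 and 6) is the standard .bim layout.

```shell
# Print the IDs of strand-ambiguous SNPs (A/T or C/G) found in a PLINK .bim file.
# ambiguous_snps is a hypothetical helper name.
ambiguous_snps() {
    awk '{
        pair = toupper($5) toupper($6)
        if (pair == "AT" || pair == "TA" || pair == "CG" || pair == "GC")
            print $2
    }' "$1"
}
```

The resulting list (for example ambiguous_snps temp.bim > ambiguous.txt) could then be inspected, or passed to plink --exclude if dropping those SNPs is acceptable for the analysis.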
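Solution
1. Merging binary filesets automatically writes the variants that cannot be merged to all-merge.missnp. The file name is the --out prefix plus "-merge.missnp", and the file lists the IDs of the unmergeable variants.
2. Flip the variants that cannot be merged
--flip expects a file with a single column of SNP IDs, so the *merge.missnp file written in the previous step can be passed to it directly.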
plink --allow-extra-chr --chr-set 95 --bfile temp --flip all-merge.missnp --make-bed --out flip
3. Merge again
plink --allow-extra-chr --chr-set 95 --bfile flip --bmerge temp2.bed temp2.bim temp2.fam --make-bed --out all
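The three steps can be chained into one helper. This is a sketch under assumptions, not a definitive script: the function name merge_with_flip and the _flip output suffix are hypothetical, and it decides whether a flip is needed by checking whether the first merge left a non-empty -merge.missnp file.

```shell
# Sketch: merge two binary filesets, flipping the first one if the merge
# reports unmergeable variants.
# Usage: merge_with_flip <prefix1> <prefix2> <out_prefix>
merge_with_flip() {
    # Remove any stale list so its presence reflects this run only.
    rm -f "$3-merge.missnp"

    # Step 1: first merge attempt.
    status=0
    plink --allow-extra-chr --chr-set 95 --bfile "$1" \
        --bmerge "$2.bed" "$2.bim" "$2.fam" --make-bed --out "$3" || status=$?

    # No missnp list: nothing to flip, so report the result of the first attempt.
    if [ ! -s "$3-merge.missnp" ]; then
        return "$status"
    fi

    # Step 2: flip the listed variants in the first fileset.
    plink --allow-extra-chr --chr-set 95 --bfile "$1" \
        --flip "$3-merge.missnp" --make-bed --out "$1_flip" || return 1

    # Step 3: merge again using the flipped fileset.
    plink --allow-extra-chr --chr-set 95 --bfile "$1_flip" \
        --bmerge "$2.bed" "$2.bim" "$2.fam" --make-bed --out "$3"
}
```

With the file names used above, the call would be merge_with_flip temp temp2 all. If the second merge still fails, the remaining variants are likely genuine multiallelic sites or other mismatches, and option one (discarding them) is the fallback.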