Using BreakSeq2 for SV calling with the current 1000 Genomes reference (Phase4 reference)

Q:
We have started a cloud-based cancer SV calling project and would like to use BreakSeq2 to perform SV calling against the current 1000 Genomes reference (Phase4 reference). Because BreakSeq2 relies on the coordinates in the breakpoint library GFF, we were hoping to obtain either an updated breakpoint library or some advice on the feasibility of using coordinate liftover (via the available hg19-to-hg38 UCSC chain files) to update the coordinates in the GFF inside the latest library hosted on your lab website at:

http://sv.gersteinlab.org/phase1bkpts/breakseq2_bplib_20150129.zip

We are under a time constraint with regard to the cloud compute funding, so we would be very grateful if you could reply soon.

A:
I think the best option right now would be to lift over the coordinates to hg38. Both the GFF and the INS files need to be lifted over (you can use CrossMap, which supports GFF). After the liftover, check that the SV lengths were lifted correctly; it might be good to ignore SVs whose lengths changed after the liftover. Note that for the INS file, you will need to write a script to lift over the coordinates in the read name. You can check out the example on the BreakSeq2 page for how to run from GFF (you will need both the GFF and the INS file). Hope that helps.
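The length check suggested above could be sketched roughly as follows. This is only an illustration, not part of BreakSeq2 or CrossMap: the function names `parse_gff` and `keep_length_preserved` are hypothetical, and the sketch assumes that the ninth (attributes) column of each GFF record is unique and comes through the liftover unchanged, so it can be used to pair each lifted record with its original. If that does not hold for your library, a different key would be needed.

```python
def parse_gff(lines):
    """Index GFF records by their attributes column (column 9).

    Assumption: the attributes string is unique per record and is not
    modified by the liftover, so it can serve as a matching key.
    """
    records = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # not a complete 9-column GFF record
        start, end = int(fields[3]), int(fields[4])
        records[fields[8]] = (start, end, line)
    return records


def keep_length_preserved(original_lines, lifted_lines):
    """Return lifted GFF lines whose SV length equals the original length."""
    original = parse_gff(original_lines)
    kept = []
    for key, (start, end, line) in parse_gff(lifted_lines).items():
        if key not in original:
            continue  # cannot be paired with an original record
        orig_start, orig_end, _ = original[key]
        if end - start == orig_end - orig_start:
            kept.append(line)
    return kept
```

In use, you would read the hg19 GFF and the lifted hg38 GFF into lists of lines, pass both to `keep_length_preserved`, and write the returned lines out as the filtered library. Records that failed to lift over are absent from the lifted file and are therefore dropped as well.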

Using BreakSeq2 for SV calling with the current 1000 Genomes reference (Phase4 reference) | Gerstein Lab FAQs
