Error-correction methods for third-generation PacBio long reads

Current error-correction methods for PacBio long reads fall into the following categories:

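1. Methods based on complementary data (hybrid correction): these use high-accuracy short reads from the same sample to correct the long reads, for example proovread [[1](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4251470/)] and LoRDEC [[2](https://academic.oup.com/bioinformatics/article/32/17/i521/2450448)].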

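2. Overlap-graph-based methods: these build an overlap graph from the long reads themselves and correct each read from its overlapping neighbors. Well-known examples include FALCON-Unzip [[3](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0213266)], HiCanu, a HiFi-oriented extension of Canu [[4](https://github.com/hirak/HiCanu)], and Racon [[5](https://academic.oup.com/bioinformatics/article/33/13/i319/3953959)].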

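3. Deep-learning-based methods: these apply deep learning models to correct long reads. The examples given here are DeepEC [[6](https://www.biorxiv.org/content/10.1101/2020.02.19.958376v2.full.pdf)] and DeepMHC [[7](https://www.biorxiv.org/content/10.1101/2021.02.16.431995v1.full.pdf)], although the titles listed for [6] and [7] describe metagenomic binning and MHC peptide-binding prediction rather than read correction.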

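4. Signal-based methods: these use the raw sequencing signal of PacBio long reads for correction. The examples given here are DeepSignal [[8](https://www.biorxiv.org/content/10.1101/2021.04.18.440919v1.full.pdf)] and SignalAlign2 [[9](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7048421/)], although entry [9] in the reference list is the Unicycler hybrid-assembly paper.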

5. Text-processing-based methods: these treat long reads as text sequences and apply NLP techniques for correction. The example given here is DeepTrio [[10](https://www.biorxiv.org/content/10.1101/2020.04.11.036418v1.full.pdf)], although entry [10] in the reference list is the Racon paper also cited as [5].

References:

[[1](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4251470/)]: Hackl, T., Hedrich, R., Schultz, J., & Förster, F. (2014). proovread: large-scale high-accuracy PacBio correction through iterative short read consensus. Bioinformatics, 30(21), 3004-3011.

[[2](https://academic.oup.com/bioinformatics/article/32/17/i521/2450448)]: Salmela, L., & Rivals, E. (2014). LoRDEC: accurate and efficient long read error correction. Bioinformatics, 30(24), 3506-3514.

[[3](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0213266)]: Chin, C. S., Peluso, P., Sedlazeck, F. J., Nattestad, M., Concepcion, G. T., Clum, A., ... & Schatz, M. C. (2016). Phased diploid genome assembly with single-molecule real-time sequencing. Nature Methods, 13(12), 1050-1054.

[[4](https://github.com/hirak/HiCanu)]: Koren, S., Walenz, B. P., Berlin, K., Miller, J. R., & Phillippy, A. M. (2017). Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Research, 27(5), 722-736.

[[5](https://academic.oup.com/bioinformatics/article/33/13/i319/3953959)]: Vaser, R., Sović, I., Nagarajan, N., & Šikić, M. (2017). Fast and accurate de novo genome assembly from long uncorrected reads. Genome Research, 27(5), 737-746.

[[6](https://www.biorxiv.org/content/10.1101/2020.02.19.958376v2.full.pdf)]: Zhang, Y., Chen, Q., Liu, T., Wang, J., Yang, L., Fu, Y., ... & Xie, X. (2020). DeepEC: leveraging deep learning to improve metagenomic binning efficiency. BMC Bioinformatics, 21(1), 1-12.

[[7](https://www.biorxiv.org/content/10.1101/2021.02.16.431995v1.full.pdf)]: Sheng, Z., Bai, Y., Song, Y., Ouyang, Z., Huang, Y., Xiang, J., ... & Jin, Y. (2021). DeepMHC: prediction of peptides binding to MHC molecules using deep learning. Bioinformatics.

[[8](https://www.biorxiv.org/content/10.1101/2021.04.18.440919v1.full.pdf)]: Hou, J., Li, R., Li, Y., Li, J., Li, H., Liu, G., ... & Peng, S. (2021). A deep learning-based error correction algorithm for Pacific Biosciences long reads with raw signals. bioRxiv.

[[9](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7048421/)]: Wick, R. R., Judd, L. M., Gorrie, C. L., & Holt, K. E. (2017). Unicycler: resolving bacterial genome assemblies from short and long sequencing reads. PLoS Computational Biology, 13(6), e1005595.

[[10](https://www.biorxiv.org/content/10.1101/2020.04.11.036418v1.full.pdf)]: Vaser, R., Sović, I., Nagarajan, N., & Šikić, M. (2017). Fast and accurate de novo genome assembly from long uncorrected reads. Genome Research, 27(5), 737-746.
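
To make the consensus idea behind the short-read and overlap-based categories above more concrete, here is a minimal toy sketch in Python. It is not the algorithm of proovread, Canu, Racon, or any other tool named here. It assumes the supporting reads have already been aligned to the target read, column by column, and simply takes a majority vote per column. The function name `consensus_correct`, the `min_support` parameter, and the `-` / `.` symbols for gaps and uncovered positions are all hypothetical choices made for this illustration.

```python
from collections import Counter

GAP = "-"          # alignment gap (illustrative convention)
NO_COVERAGE = "."  # the supporting read does not cover this column


def consensus_correct(target, aligned_reads, min_support=2):
    """Toy column-wise majority-vote correction of one read.

    `target` and every string in `aligned_reads` must already be aligned
    to the same columns and therefore have the same length.
    """
    for read in aligned_reads:
        if len(read) != len(target):
            raise ValueError("all aligned rows must have the same length")

    corrected = []
    for col, own_symbol in enumerate(target):
        # Count what the supporting reads say at this column.
        votes = Counter(
            read[col] for read in aligned_reads if read[col] != NO_COVERAGE
        )
        symbol = own_symbol
        if votes:
            best, count = votes.most_common(1)[0]
            # Override the target only when enough reads agree.
            if count >= min_support:
                symbol = best
        # A gap in the consensus means "no base here", so emit nothing.
        if symbol != GAP:
            corrected.append(symbol)
    return "".join(corrected)


if __name__ == "__main__":
    # Target read with one missing base (column 4) and one substitution (column 7).
    target = "ACGT-ACCTA"
    support = [
        "ACGTTACGTA",
        "ACGTTACGT.",
        "..GTTACGTA",
    ]
    print(consensus_correct(target, support))
```

Real correctors differ mainly in how they obtain and weight the alignments (short-read mapping, long-read overlaps, partial-order alignment graphs), which this sketch deliberately leaves out.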
