Long Read Sequencing Technology to Solve Complex Genomic Regions Assembly in Plants

Arnaud Bellec1*, Audrey Courtial1, Stephane Cauet1, Nathalie Rodde1, Sonia Vautrin1, Genseric Beydon1, Nadege Arnal1, Nadine Gautier1, Joelle Fourment1, Elisa Prat1, William Marande1, Yves Barriere2 and Helene Berges1

1 French Plant Genomic Center, Centre National des Ressources Génomiques Végétales, INRA-CNRGV, Castanet-Tolosan, France
2 INRA, UR889, Unité de Génétique et d'Amélioration des Plantes Fourragères, 86600 Lusignan, France

Abstract

Background:

Numerous completed and ongoing whole genome sequencing projects have shown that a high quality genome sequence is necessary to address comparative genomics questions, such as structural variation among genotypes and the gain or loss of specific functions. Despite spectacular progress in sequencing technologies, obtaining accurate and reliable data remains a challenge, both at the whole genome scale and when targeting specific genomic regions. These problems are even more pronounced in plants, whose genomes are particularly challenging because of their size, their high density of repetitive elements and their varying levels of ploidy. To overcome these problems, we developed a strategy that reduces genome complexity by combining large insert BAC libraries with next generation sequencing technologies.

Results:

We compared two technologies (Roche-454 and Pacific Biosciences PacBio RS II) for sequencing pools of BAC clones, with the aim of obtaining the best quality sequence. We targeted nine BAC clones from different species (maize, wheat, strawberry, barley, sugarcane and sunflower) known to be complex in terms of sequence assembly. We sequenced pools of the nine BAC clones with both technologies, compared the assembly results and highlighted the differences attributable to the sequencing technology used.

Conclusions:

We demonstrated that the long reads obtained with the PacBio RS II technology yield a better and more reliable assembly, notably by preventing errors caused by duplicated or repetitive sequences within the same region.
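The abstract does not say which metrics were used to compare the two assemblies. As an illustration only, the sketch below computes two commonly used contiguity measures, contig count and N50, for a single BAC assembled two ways. The function names (`n50`, `summarize`) and the contig lengths are hypothetical and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length of the contig at which the running total, taken over
    contigs sorted from longest to shortest, first reaches half of the
    total assembly length. Returns 0 for an empty list.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def summarize(lengths):
    """Collect basic contiguity statistics for one assembly."""
    return {
        "contigs": len(lengths),
        "total_bp": sum(lengths),
        "longest_bp": max(lengths, default=0),
        "n50_bp": n50(lengths),
    }


# Hypothetical contig lengths (bp) for one BAC insert, NOT data from the paper:
# a fragmented assembly versus a single-contig assembly of the same clone.
fragmented_assembly = [40_000, 30_000, 20_000, 15_000, 10_000]
single_contig_assembly = [115_000]

for name, contigs in [("fragmented", fragmented_assembly),
                      ("single contig", single_contig_assembly)]:
    print(name, summarize(contigs))
```

For a BAC clone, an assembly that resolves the insert into one contig spanning its full length is the ideal outcome, so a lower contig count and a higher N50 usually indicate fewer breaks at repeats. Contiguity alone does not prove correctness, since a collapsed duplication can also produce a single contig.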

