The Third Revolution in Sequencing Technology

Abstract

Forty years ago the advent of Sanger sequencing was revolutionary as it allowed complete genome sequences to be deciphered for the first time.

A second revolution came when next-generation sequencing (NGS) technologies appeared, which made genome sequencing much cheaper and faster.

However, NGS methods have several drawbacks and pitfalls, most notably their short reads. Recently, third-generation/long-read methods appeared, which can produce genome assemblies of unprecedented quality.

Moreover, these technologies can directly detect epigenetic modifications on native DNA and allow whole-transcript sequencing without the need for assembly. This marks the third revolution in sequencing technology.

Here we review and compare the various long-read methods.

We discuss their applications and their respective strengths and weaknesses and provide future perspectives.

Keywords

next-generation sequencing

third-generation sequencing

long-read sequencing

single-molecule real-time sequencing

nanopore sequencing

synthetic long-read sequencing

Highlights

Long-read/third-generation sequencing technologies are causing a new revolution in genomics as they provide a way to study genomes, transcriptomes, and metagenomes at an unprecedented resolution.

SMRT and nanopore sequencing allow for the first time the direct study of different types of DNA base modifications.

Moreover, nanopore technology can sequence RNA directly and identify RNA base modifications.

Owing to the portability of the MinION and the availability of extremely simple library preparation methods, nanopore technology makes high-throughput sequencing possible for the first time in the field and in remote locations. This is of tremendous importance for outbreak surveillance in developing countries.


The Advent of Third-Generation Sequencing (TGS)/Long-Read Sequencing

Shortly after the appearance of NGS, TGS technologies emerged. Distinguishing features of TGS are single-molecule sequencing (SMS) and sequencing in real time (as opposed to NGS, where sequencing is paused after each base incorporation) [11]. The first SMS technology, commercialized by Helicos Biosciences, resembled Illumina sequencing but without any bridge amplification [12]. As the method was relatively slow, expensive, and produced short reads (32 bp), it did not prove viable. The first ‘true’ TGS technology was released on the market in 2011 by Pacific Biosciences (PacBio) and is termed ‘single-molecule real-time’ (SMRT) sequencing [13]. More recently (2014), Oxford Nanopore Technologies (ONT) introduced nanopore sequencing [14]. Besides the absence of PCR amplification and the real-time sequencing process, an important feature of SMRT and nanopore sequencing is the production of long reads.

As an alternative approach, Illumina introduced a library preparation kit for ‘synthetic long reads’ (SLRs) in 2014 (formerly Moleculo [15]). One year later, 10X Genomics introduced a microfluidics variant of SLR with much higher partitioning capacity [16]. Note that SLR technologies are not TGS methods, as they are based on classical Illumina sequencing.

These long-read technologies are now revolutionizing genomics research as they enable researchers to explore genomes at an unprecedented resolution. In the subsequent sections we examine these new methodologies in more detail. Due to length limitations we do not discuss the analysis of long-read sequence data in detail. Excellent recent reviews focusing on long-read bioinformatics tools can be found elsewhere [17,18].

Long-Read Technologies

SMRT Sequencing: PacBio

In early 2011, PacBio released their PacBio RS sequencer, which uses SMRT technology (Box 1). While initially average read lengths were relatively short (1.5 kb) and average error rates were high (13%) [19], the technology has improved strongly over recent years. Average read lengths have increased more than tenfold and the throughput per run has increased by about 100-fold, owing to the development of improved sequencing chemistries and the release of a new sequencer, the Sequel. This machine generates about tenfold more sequence data than the upgraded PacBio RS (RSII) and costs about half as much (Table 1). The ‘single-pass’ error rate has remained roughly the same since the beginning (13%), but molecules of up to 1–2 kb can now be sequenced many times owing to the circular templates [20] and increased polymerase processivity, strongly improving overall accuracy (see Figure ID in Box 1). Moreover, increased throughput has led to a sharp reduction in cost per base ([19]; http://allseq.com/knowledge-bank/sequencing-platforms/pacific-biosciences/).

For genomic DNA library preparation, PacBio commercialized a ‘SMRTbell template prep kit’ and an ‘express’ variant thereof for rapid library preparation with an approximately 3-h workflow. For transcriptome analysis an ‘isoform sequencing’ protocol is available (https://www.pacb.com/wp-content/uploads/Procedure-Checklist-20-kb-Template-PreparationUsing-BluePippin-Size-Selection-System-15-20-kb-Cutoff-Sequel-Systems.pdf).
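
The accuracy gain from sequencing a circular template many times can be illustrated with a deliberately simplified model. The sketch below assumes that each pass miscalls a given base independently with the 13% single-pass error rate quoted above, and that the consensus is a plain majority vote over an odd number of passes. Both are assumptions made for illustration, not PacBio's actual consensus algorithm, and the function name is hypothetical.

```python
from math import comb


def majority_vote_error(per_pass_error: float, passes: int) -> float:
    """Probability that a majority vote over independent passes is wrong.

    Simplified model: a base is either right or wrong in each pass, errors
    are independent between passes, and the consensus is wrong whenever
    more than half of the passes are wrong.
    """
    if passes < 1 or passes % 2 == 0:
        raise ValueError("use a positive, odd number of passes to avoid ties")
    wrong_needed = passes // 2 + 1
    return sum(
        comb(passes, k)
        * per_pass_error ** k
        * (1 - per_pass_error) ** (passes - k)
        for k in range(wrong_needed, passes + 1)
    )


if __name__ == "__main__":
    # 0.13 is the single-pass error rate quoted in the text.
    for n in (1, 3, 5, 9, 15):
        print(f"{n:2d} passes: consensus error ~ {majority_vote_error(0.13, n):.2e}")
```

Under these assumptions the consensus error falls steeply with the number of passes, which is why short inserts that the polymerase can traverse many times end up far more accurate than a single pass. Real errors are not perfectly independent, so the numbers should be read as a qualitative trend only.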


Concluding Remarks and Future Perspectives
Over recent years, long-read sequencing methods have strongly improved.

These technologies now enable the study of genomes and transcriptomes at an unprecedented resolution.
Metagenomic analyses also benefit from long-read sequencing, which allows for the first time the resolution of microbial communities at the species level [68–70].
Long-read sequencing is likely to become a standard medical diagnostic tool in the near future, as exemplified by a recent SMRT sequencing study of a patient’s genome that revealed a structural variant (SV) which had escaped detection despite extensive genetic testing with other methods [71].
In particular, nanopore sequencing has improved rapidly.
A theoretical 1× coverage of the Escherichia coli genome was obtained with just seven ultralong reads (http://lab.loman.net/2017/03/09/ultrareads-for-nanopore/), and a human genome has been assembled using nanopore reads alone [24].
Ultralong nanopore reads may allow complete, gapless assembly of human genomes in the near future, which will further boost human genetics research and personalized medicine.
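
The ‘theoretical 1× coverage’ figure is simple arithmetic: the summed read length divided by the genome size. A minimal sketch follows; the seven read lengths are hypothetical values chosen for illustration (they are not the lengths reported in the cited post), and the genome size is an approximate value of about 4.6 Mb for E. coli.

```python
def theoretical_coverage(read_lengths_bp, genome_size_bp):
    """Summed read length divided by genome size.

    'Theoretical' because it ignores where the reads map: overlapping
    reads can leave parts of the genome uncovered even at 1x or more.
    """
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return sum(read_lengths_bp) / genome_size_bp


# Approximate E. coli genome size (assumption: about 4.6 Mb).
ECOLI_GENOME_BP = 4_600_000

# Hypothetical ultralong read lengths in bp, for illustration only.
hypothetical_reads_bp = [
    720_000, 700_000, 680_000, 660_000, 650_000, 640_000, 630_000,
]

coverage = theoretical_coverage(hypothetical_reads_bp, ECOLI_GENOME_BP)
print(f"{len(hypothetical_reads_bp)} reads, theoretical coverage: {coverage:.2f}x")
```

Seven reads averaging roughly 660 kb each are enough to reach 1× on a genome of this size, which conveys the scale of the ultralong reads involved.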

The portability of the MinION allows for the first time sequencing in the field, which is of great importance for the survey of outbreaks in developing countries [72,73].
However, there remains room for improvement. A weakness of nanopore sequencing is the high error rate. In 2010, Stoddart et al. proposed the development of nanopores with multiple recognition points for DNA sequence determination [74]. This would provide a proofreading mechanism that improves the overall quality of sequencing. As an alternative solution to reduce error rates, a method resembling PacBio CCS has been proposed [26].

On the other hand, to keep up with nanopore technology it will be important for PacBio to increase overall read length and throughput.

Current loading methods depend on passive diffusion and are biased towards shorter fragments.
A novel, voltage-induced loading technique increases the efficiency of loading long DNA molecules [75].

However, it seems unlikely that SMRT sequencing will approach the ultralong reads currently obtained with nanopores, due to the limitation of polymerase processivity.

Thus, SMRT, nanopore, and SLR sequencing methods each have their particular strengths and weaknesses (Table 2), and depending on the specific application one technology or another may be preferred.
It is worth mentioning here that various other companies are also investing in novel methods for rapid, cost-effective, and portable sequencing, and it will be interesting to see whether any of these technologies will see the light of day in the near future (see Outstanding Questions).

Last, an exciting possibility of nanopore technology is the sequencing of denatured peptide chains, and recent results confirm its feasibility [76].
It will be interesting to see whether further progress will make single-molecule protein sequencing a reality.
In any case, we are only at the beginning of the third revolution in sequencing technology, and the coming years promise to bring exciting new developments and discoveries.

