Benchmarking Hybrid Correction and Assembly Using Short Illumina Reads and Long PacBio Reads
Ma Dongna;Zhang Xingtan;Wei Liufeng;Li Yiying;Zhong Weimin;Zhao Qian;You Minsheng;State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fujian Agriculture and Forestry University;Institute of Applied Ecology, Fujian Agriculture and Forestry University;Key Laboratory of Integrated Pest Management for Fujian-Taiwan Crops, Ministry of Agriculture;Key Laboratory of Green Control of Insect Pests (Fujian Agriculture and Forestry University), Fujian Province University;Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University;
The Pacific Biosciences (PacBio) sequencing platform, a single-molecule sequencing technology (Iso-Seq), offers a great improvement over current sequencing technologies because of its high throughput, fast speed, and longer reads. However, compared with second-generation sequencing (SGS) technologies, the error rate of PacBio reads is considerably higher, which lowers assembly accuracy. To reduce errors and improve assembly, we generated simulated data from Arabidopsis thaliana to compare and evaluate error correction and hybrid assembly results across different software and read depths, with the aim of optimizing hybrid assembly by improving the assembly strategy, including sequencing depth and choice of software. Comparing the correction software PBcR, LoRDEC, Jabba and Proovread, we found that LoRDEC increased nucleotide accuracy from 85% to 99% at the fastest speed and produced a better correction. We then simulated read depths of 20× to 100× for SGS and 20× to 50× for Iso-Seq to further test assembly quality. We evaluated the assemblies produced by three hybrid assemblers and two third-generation assemblers. We found that the hybrid assembler DBG2OLC and the third-generation assembler Canu produced better assemblies and better gene completeness.
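The read depths and accuracy figures in the abstract can be put into concrete numbers with a short calculation. The following is a minimal sketch, not the authors' simulation procedure: the genome size (a round 120 Mb for A. thaliana), the read lengths (150 bp short reads, 10 kb long reads), and the function names are all assumptions made for illustration. It uses the standard relation depth = reads × read length / genome size, and estimates expected errors per read as read length × (1 − accuracy).

```python
import math

# Assumed values for illustration only; none of these come from the paper.
GENOME_SIZE_BP = 120_000_000   # assumed round figure for A. thaliana
SHORT_READ_LEN_BP = 150        # assumed Illumina read length
LONG_READ_LEN_BP = 10_000      # assumed mean PacBio read length


def reads_for_depth(genome_size_bp: int, depth: float, read_len_bp: int) -> int:
    """Smallest read count so that reads * read_len / genome_size >= depth."""
    return math.ceil(genome_size_bp * depth / read_len_bp)


def expected_errors(read_len_bp: int, accuracy: float) -> float:
    """Expected number of erroneous bases in one read at a given per-base accuracy."""
    return read_len_bp * (1.0 - accuracy)


if __name__ == "__main__":
    # Depth ranges quoted in the abstract: 20x to 100x (SGS), 20x to 50x (long reads).
    for depth in (20, 100):
        n = reads_for_depth(GENOME_SIZE_BP, depth, SHORT_READ_LEN_BP)
        print(f"short reads for {depth}x: {n:,}")
    for depth in (20, 50):
        n = reads_for_depth(GENOME_SIZE_BP, depth, LONG_READ_LEN_BP)
        print(f"long reads for {depth}x: {n:,}")

    # Accuracy before and after correction, as quoted in the abstract.
    for accuracy in (0.85, 0.99):
        e = expected_errors(LONG_READ_LEN_BP, accuracy)
        print(f"expected errors per long read at {accuracy:.0%}: {e:.0f}")
```

Under these assumed lengths, raising per-base accuracy from 85% to 99% cuts the expected errors in a 10 kb read by a factor of about fifteen, which is why correction before assembly matters.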
马东娜;张兴坛;魏留峰;李以英;钟伟民;赵倩;游民生;福建农林大学闽台作物生态病虫害防治国家重点实验室;福建农林大学应用生态研究所;农业部闽台作物病虫害综合治理重点实验室;福建省大学病虫害绿色防治重点实验室(福建农林大学);福建农林大学基因组与生物技术中心;