Paper reading (22): Integrated omics: tools, advances and future approaches

Article link: https://blog.csdn.net/wxw060709/article/details/101704913

Paper title: Integrated omics: tools, advances and future approaches

Scholar citations: 12

Pages: 25

Published: 2018.07

Journal: Journal of Molecular Endocrinology

Authors: Biswapriya B Misra, Carl Langefeld, Michael Olivier and Laura A Cox

Key Words: integrated, omics, genomics, transcriptomics, proteomics, metabolomics, network, statistics, Bayesian, machine learning, principal component analysis, correlation, clustering

Abstract:

With the rapid adoption of high-throughput omic approaches to analyze biological samples such as genomics, transcriptomics, proteomics and metabolomics, each analysis can generate tera- to peta-byte sized data files on a daily basis. These data file sizes, together with differences in nomenclature among these data types, make the integration of these multi-dimensional omics data into biologically meaningful context challenging. Variously named as integrated omics, multi-omics, poly-omics, trans-omics, pan-omics or shortened to just ‘omics’, the challenges include differences in data cleaning, normalization, biomolecule identification, data dimensionality reduction, biological contextualization, statistical validation, data storage and handling, sharing and data archiving. The ultimate goal is toward the holistic realization of a ‘systems biology’ understanding of the biological question. Commonly used approaches are currently limited by the 3 i’s – integration, interpretation and insights. Post integration, these very large datasets aim to yield unprecedented views of cellular systems at exquisite resolution for transformative insights into processes, events and diseases through various computational and informatics frameworks. With the continued reduction in costs and processing time for sample analyses, and increasing types of omics datasets generated such as glycomics, lipidomics, microbiomics and phenomics, an increasing number of scientists in this interdisciplinary domain of bioinformatics face these challenges. We discuss recent approaches, existing tools and potential caveats in the integration of omics datasets for development of standardized analytical pipelines that could be adopted by the global omics research community.

Conclusions:

No single approach exists for processing, analyzing and interpreting all data from different -omes.
The community needs to embrace challenges posed from these complex datasets to standardize sample quality, sample analysis pipelines, data analysis pipelines and data formats for public data availability.
Integrated omics is not just a collage of tools, but a cohesive paradigm for insightful biological interpretation of multi-omics datasets that will potentially reveal novel insights into basic biology, as well as health and disease.

Introduction:

Access to large-scale omics datasets has revolutionized biology and led to the emergence of systems approaches to advance our understanding of biological processes.
These multilayered, multifactorial approaches are computationally challenging and difficult to display and comprehend visually.
Broad experimental challenges in these integrated omics approaches include, but are not limited to:

understanding the statistical behavior of readouts from each omics regime independently
recognizing non-obvious relationships that exist between omics regimes within their original biological context
capitalizing on time resolution in omics data

Outline of the main text:

1. Introduction

2. Strengths and challenges of individual omics

2.1 Genomics and transcriptomics

2.2 Proteomics

2.3 Metabolomics

2.4 Unique challenges to specific omics platforms

2.4.1 Linking genotype to phenotype

2.4.2 Quantification of the proteome

2.4.3 Quantification of the metabolome

2.5 Issues shared among the omics platforms

2.5.1 Data handling

2.5.2 Annotation

2.5.3 Study design and analytic assumptions

2.5.4 Statistical power

2.5.5 Data archiving and sharing

3. Tools available for integration of multi-omics data

4. Recent examples of integration in real world datasets

4.1 Complex diseases

4.2 Immunity and infection

4.3 Cancer

4.4 Host microbiome interactions

5. Statistical approaches for current challenges

5.1 Number of samples vs number of molecules

5.2 Dimension reduction

5.3 Data integration

6. Current challenges and looking to future

6.1 Experimental challenges

6.1.1 Challenges in sample preparation

6.1.2 Optimizing, documenting and sharing workflows

6.1.3 Data processing

6.1.4 Time course studies

6.2 Individual omics datasets – normalization, transformation of different omics data types

6.3 Integration issues – data scaling, false positives and unknowns

6.4 Data issues – data archiving and sharing

6.5 Hurdles in implementing multi-omics approaches in the clinic for diagnostic/prognostic purposes

6.6 Biological knowledge – data interpretation

7. Conclusions

Excerpts from the main text: