Paper reading (10): Next-generation Machine Learning for Biological Networks

Paper title: Next-generation Machine Learning for Biological Networks

Google Scholar citations: 90

Pages: 12

Published: 2018.05

Venue: Cell

Authors: Diogo M. Camacho, Katherine M. Collins, Rani K. Powers, James C. Costello and James J. Collins

Abstract:

Keywords: Machine learning, deep learning, systems biology, synthetic biology, network biology, neural network

Machine learning, a collection of data-analysis techniques aimed at building predictive models from multi-dimensional datasets, is becoming integral to modern biological research. By enabling one to generate models that learn from large datasets and make predictions on likely outcomes, machine learning can be used to study complex cellular systems such as biological networks. Here, we provide a primer on machine learning for life scientists, including an introduction to deep learning. We discuss opportunities and challenges at the intersection of machine learning and network biology, which could impact disease biology, drug discovery, microbiome research, and synthetic biology.

Conclusions:

  • the need for massively large datasets
  • Although data captured from biological systems can be incredibly complex, the majority of these datasets are orders of magnitude too small for deep learning algorithms to be applied appropriately.
  • options for addressing the above challenge:
  1. invest in the collection of suitably large, well-annotated datasets for state-of-the-art studies in network biology.
  2. generate in silico data with the properties of real data (e.g., with GANs)
  • black box nature of most next-generation machine learning models
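
Option 2 above refers to generative adversarial networks (GANs). As a reminder that is not taken from the reviewed paper, the standard GAN objective of Goodfellow et al. (2014) is usually written as a two-player minimax game. The symbols are the conventional ones and are assumptions here: G is the generator, D the discriminator, p_data the distribution of the real data, and p_z a noise prior.

```latex
\min_{G} \max_{D} \; V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \bigl[ \log D(x) \bigr]
  + \mathbb{E}_{z \sim p_{z}(z)} \bigl[ \log \bigl( 1 - D(G(z)) \bigr) \bigr]
```

In the in silico setting, samples G(z) drawn from a trained generator would stand in for additional measurements; how well such samples preserve the properties of real biological data has to be checked case by case.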

Introduction:

  • Applications of machine learning in biology:
  1. genome annotation
  2. predictions of protein binding
  3. the identification of key transcriptional drivers of cancer
  4. predictions of metabolic functions in complex microbial communities
  5. the characterization of transcriptional regulatory networks
  6. and so on...
  • A key advantage is that machine-learning methods can sift through volumes of data to find patterns that would be missed otherwise.
  • Network biology involves the study of the complex interactions of biomolecules that contribute to the structures and functions of living cells.

Structure of the main text:

1. Introduction

2. A primer on Machine Learning

  • Basics of Machine Learning
  • Categories of Machine-Learning Methods
  • Applying Machine Learning in Biological Contexts
  • Deep Learning: Next-Generation Machine Learning

3. Intersection of Machine Learning and Network Biology

  • Disease Biology
  • Drug Discovery
  • Microbiome Research
  • Synthetic Biology
  • Challenges and Future Outlook

Selected excerpts from the main text:

  • GANs are deep neural network architectures composed of two neural networks that are pitted against each other: one is a generative model that produces new data that mimic the distributions of the training dataset, while the other is a discriminative model (the adversary) that evaluates the new data and determines whether or not it belongs to the actual training dataset.
  • In biological applications, features can include one or more types of data, such as gene expression profiles, a genomic sequence, protein-protein interactions, metabolite concentrations, or copy number alterations.
  • Overfitting and underfitting are major causative factors underlying poor performance of machine-learning approaches. 
  • The old computer-science adage of “garbage in, garbage out” was never truer than it is with machine-learning applications. (This adage really hits home...)
  • feature selection: the paper refers the reader to several excellent articles (Chandrashekar and Sahin, 2014; Domingos, 2012; Guyon and Elisseeff, 2003; Little and Rubin, 1987; Saeys et al., 2007). Worth a read.
  • Unsupervised techniques can be used in a case where the sample labels are missing or incorrect. 
  • Dialogue on Reverse Engineering Assessment and Methods (DREAM)
  • Each DREAM challenge presents the network biology research community with a specific question and the necessary data to address it. 
  • rules of thumb
  1. Simple is often better
  2. Prior knowledge improves performance
  3. Ensemble models produce robust results
  • A key drawback of the deep learning paradigm is that training a deep neural network requires massive datasets of a size that is often not attainable in many biological studies.
  • capsule networks allow for the learning of data structures in a manner that preserves hierarchical aspects of the data itself. 
  • Capsule networks are ripe for application in network biology and disease biology given that biological networks are highly modular in nature, with specified layers for the many biomolecules, while allowing each of these layers to interact with other layers. 
  • It is exciting to consider how multi-task learners could be used to bridge the gap between the biological and chemical aspects of drug discovery by incorporating structural data on chemical entities.
  • The human microbiome consists of the microorganisms (bacteria, archaea, viruses, fungi, protozoa) that live on or inside the human body.
  • transfer learning: motivated by the immutable nature of biochemical compounds
  • extending a learning model that embeds biological sequences to ones that embed regulatory motifs and circuit structures
  • The generated deep learning model could be used to identify fundamental design principles for synthetic biology. 
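
The excerpt on overfitting and underfitting, together with the "simple is often better" rule of thumb, can be illustrated with a small sketch. It is not from the paper: the signal `true_signal`, the helper `fit_and_score`, the polynomial degrees, and the fixed alternating perturbation (a deterministic stand-in for measurement noise, chosen so the run is reproducible and deliberately unfavourable to the most flexible model) are all assumptions made for illustration.

```python
import numpy as np

def true_signal(x):
    """Hypothetical noise-free relationship between a feature and a response."""
    return np.sin(np.pi * x)

def fit_and_score(degree, x_train, y_train, x_test, y_test):
    """Least-squares polynomial fit; returns (train_mse, held_out_mse)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = float(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    test_mse = float(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))
    return train_mse, test_mse

# 15 evenly spaced training points; each target is shifted by +0.2 or -0.2
# in alternation (an assumed, deterministic substitute for random noise).
x_train = np.linspace(-1.0, 1.0, 15)
perturbation = 0.2 * (-1.0) ** np.arange(x_train.size)
y_train = true_signal(x_train) + perturbation

# Held-out points lie midway between training points and are scored
# against the clean signal.
x_test = (x_train[:-1] + x_train[1:]) / 2.0
y_test = true_signal(x_test)

# Degree 1 is too rigid, degree 4 is moderate, and degree 14 passes
# through all 15 training points.
results = {d: fit_and_score(d, x_train, y_train, x_test, y_test) for d in (1, 4, 14)}
for degree, (train_mse, test_mse) in results.items():
    print(f"degree {degree:2d}: train MSE = {train_mse:.4g}, held-out MSE = {test_mse:.4g}")
```

In this setup the training error keeps falling as the degree grows, while the held-out error is lowest for the intermediate model: the straight line underfits, and the degree-14 polynomial reproduces the perturbation and swings widely between training points. Real biological datasets will not behave this cleanly, so the sketch only conveys the shape of the trade-off.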