Paper reading (11): Deep Learning in Biomedical Data Science

Paper title: Deep Learning in Biomedical Data Science

Google Scholar citations: 16

Pages: 27

Published: 2018.07

Venue: Annual Review of Biomedical Data Science

Author: Pierre Baldi

Abstract: Since the 1980s, deep learning and biomedical data have been coevolving and feeding each other. The breadth, complexity, and rapidly expanding size of biomedical data have stimulated the development of novel deep learning methods, and application of these methods to biomedical data have led to scientific discoveries and practical solutions. This overview provides technical and historical pointers to the field, and surveys current applications of deep learning to biomedical data organized around five subareas, roughly of increasing spatial scale: chemoinformatics, proteomics, genomics and transcriptomics, biomedical imaging, and health care. The black box problem of deep learning methods is also briefly discussed.

Conclusions:

  • The black box problem is the sense of insecurity that arises from not knowing how a neural network solves a particular task.
  • The significance of this issue is compounded by recent developments in a variety of adversarial methods that can fool neural networks.
  • The training examples cannot be recovered from the weights in a fundamental way. 
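  • The practical solution to the problem is to build a modular system where particular behaviors or errors can easily be isolated and corrected. For example, suppose a model's job is to identify images that contain polyps. If we have it not only classify the image but also draw a box around the polyp region, then when the model makes a false-positive call we can inspect the contents of the box to analyze why it erred and how to correct it.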
  • most of the objects we use and trust everyday are black boxes.

Introduction:

  • (a) big data and (b) computing power
  • providing powerful methods for analyzing biomedical data
  • providing simplified but useful computational models for neuroscience

Outline of the paper:

1. Introduction

2. Biomedical data

3. Architectures and algorithms

3.1. Dealing with variable-size structured data

3.2. Inner and outer Recursive Neural Network Approaches

4. Deep learning in chemoinformatics

4.1. Molecules

4.2. Reactions

5. Deep learning in proteomics

5.1. Protein structures

5.2. Protein secondary structure and other structural features

5.3. Protein contacts and contact maps

5.4. Protein functional features

6. Deep learning in genomics and transcriptomics

7. Deep learning in biomedical imaging

8. Deep learning in health care

9. Conclusion: the black box question

Excerpts from the main text:

2. Biomedical data

  • Biomedical data: ranges from small molecules to omic data (e.g., genomic, proteomic, transcriptomic, metabolomic), biomedical imaging data, clinical data, and electronic medical records.
  • Data types: digital, text, and complex associated structures such as sequences, trees, and other graphs.
  • Data are much scarcer in the chemical, pharmaceutical, and clinical sciences because of numerous commercial, legal, and other societal barriers.
  • Data landscape: complexity and variability.

3. Architectures and algorithms

  • Other forms of deep learning exist based on graphical probabilistic models (e.g., deep Bayesian networks, Boltzmann machines), and hidden Markov models (HMMs) already implement a form of deep learning, since the transitions between hidden states are not observed.
  • different forms of deep learning can be combined

3.1. Dealing with variable-size structured data

  • small molecules, nucleotide or amino acid sequences, protein or other contact maps, phylogenetic trees, natural language sequences, natural language parse trees.
  •  for variable-size structured data, a recursive network must be used
  • designing recursive neural networks: the inner approach and the outer approach

3.2. Inner and outer Recursive Neural Network Approaches (I did not fully understand this part... it looks similar to an RNN?)

  • the inner approach uses two recursive neural networks, one for the transitions and one for the emissions 
  • The approach is called inner because the neural networks are used to crawl the graphs associated with the data from the inside. 
  • the inner and outer approaches are not exclusive and can be combined.
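
To make the inner approach a bit more concrete, here is a minimal sketch for the simplest case, a plain sequence. It assumes one small "transition" network that updates a hidden state and one "emission" network that produces an output at each position, with the same weights reused everywhere so that inputs of any length can be processed. The layer sizes, the random untrained weights, and all function names are illustrative assumptions, not the paper's implementation. On a chain like this the result looks much like an ordinary recurrent network, which may be why the section reads as RNN-like.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Random weight matrix with n_out rows and n_in columns (untrained, for illustration)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]

def apply_layer(weights, vec):
    """One dense layer with tanh activation (no bias, for brevity)."""
    return [math.tanh(sum(w * v for w, v in zip(row, vec))) for row in weights]

def inner_forward(sequence, w_transition, w_emission, hidden_size):
    """Crawl the chain left to right, reusing the same two networks at every position."""
    hidden = [0.0] * hidden_size
    outputs = []
    for x in sequence:
        # Transition network: (previous hidden state, current input) -> new hidden state.
        hidden = apply_layer(w_transition, hidden + x)
        # Emission network: hidden state -> output at this position.
        outputs.append(apply_layer(w_emission, hidden))
    return outputs

INPUT_SIZE, HIDDEN_SIZE, OUTPUT_SIZE = 4, 3, 2  # hypothetical sizes
rng = random.Random(0)
w_transition = make_layer(HIDDEN_SIZE + INPUT_SIZE, HIDDEN_SIZE, rng)
w_emission = make_layer(HIDDEN_SIZE, OUTPUT_SIZE, rng)

def one_hot(index, size=INPUT_SIZE):
    """Encode a symbol (e.g., a nucleotide) as a one-hot vector."""
    return [1.0 if i == index else 0.0 for i in range(size)]

# Two inputs of different length go through the same fixed set of weights.
short_seq = [one_hot(i) for i in (0, 2, 1)]
long_seq = [one_hot(i) for i in (3, 3, 0, 1, 2, 0, 1)]
short_out = inner_forward(short_seq, w_transition, w_emission, HIDDEN_SIZE)
long_out = inner_forward(long_seq, w_transition, w_emission, HIDDEN_SIZE)
print(len(short_out), len(long_out))
```

The point of the sketch is only the weight sharing: the number of parameters does not depend on the length of the input, which is what lets recursive networks handle variable-size data.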

4. Deep learning in chemoinformatics

4.1. Molecules

  •  publicly available data sets of small molecules annotated with some of their properties tend to be fairly rare and small
  • the variety of representations available for molecules; Figure 3 is quite intuitive
  • deep learning methods applied to facilitate quantum and molecular mechanics calculations are still far from being able to scale up to chemical space.

4.2. Reactions

  • SMIRKS strings are used to represent reactions
  • given a set of reactants
  • uses a siamese network to compare source–sink pairs and identify the most favorable ones
  • recursive networks
  • the main challenge is again to find suitable training data

5. Deep learning in proteomics

5.1. Protein structures

  • NMR (nuclear magnetic resonance) 
  • a three-step prediction pipeline: the first step predicts structural features such as secondary structure and relative solvent accessibility; the second step predicts coarse- and fine-grained contact maps, which are invariant to translations and rotations; the third step predicts backbone and side chain 3D coordinates.
  • deep learning methods must focus on the first and second steps of the pipeline.
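
The invariance of contact maps mentioned above is easy to check numerically. The sketch below builds a binary contact map from 3D coordinates and verifies that it is unchanged after a rotation plus a translation. The coordinates are made-up toy values, and the 8 Å distance cutoff is an assumed convention chosen for illustration; neither comes from the paper.

```python
import math

def contact_map(coords, threshold=8.0):
    """Binary matrix: 1 where two residues are closer than `threshold` (assumed cutoff, in angstroms)."""
    n = len(coords)
    return [[1 if math.dist(coords[i], coords[j]) < threshold else 0
             for j in range(n)]
            for i in range(n)]

def rotate_z_and_translate(coords, angle, shift):
    """Rigid motion: rotate about the z axis by `angle` radians, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in coords]

# Toy "backbone" of six points (hypothetical values, not a real protein).
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
          (11.4, 0.0, 0.0), (11.4, 3.8, 0.0), (7.6, 3.8, 0.0)]

moved = rotate_z_and_translate(coords, angle=1.1, shift=(5.0, -2.0, 9.0))

original_map = contact_map(coords)
moved_map = contact_map(moved)
print(original_map == moved_map)
```

Because the map depends only on pairwise distances, any rigid motion of the structure leaves it unchanged, whereas the raw 3D coordinates predicted in the third step do change.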

5.2. Protein secondary structure and other structural features

  • using evolutionary profiles in the input
  •  two interesting technical challenges:  (a) to predict secondary structure and other structural features with an accuracy of about 80% or more, using no similarity to known proteins at all (i.e., no input profiles); (b) to predict secondary structure and other structural features with an accuracy of 85% or more, using sequence similarity alone (profiles), but no structural similarity.
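
As a rough illustration of what an evolutionary profile is, the sketch below turns a tiny multiple sequence alignment into per-position amino acid frequencies, which is the kind of vector that replaces a one-hot residue encoding at the network input. The alignment is a made-up toy, and the function and constant names are hypothetical. Real pipelines typically derive profiles with tools such as PSI-BLAST or HHblits and apply sequence weighting and pseudocounts, none of which is shown here.

```python
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def profile(alignment, alphabet=ALPHABET):
    """Per-column residue frequencies; gap characters ('-') are ignored."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    columns = []
    for pos in range(length):
        counts = Counter(row[pos] for row in alignment if row[pos] in alphabet)
        total = sum(counts.values())
        columns.append([counts[a] / total if total else 0.0 for a in alphabet])
    return columns

# Toy alignment of four homologous fragments (invented for illustration).
toy_alignment = ["MKVL", "MRVL", "MK-I", "MKVL"]
prof = profile(toy_alignment)

# Each position is now a 20-dimensional frequency vector.
print(prof[1][ALPHABET.index("K")], prof[1][ALPHABET.index("R")])
```

Challenge (a) above corresponds to predicting without such a profile, from the single sequence alone.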

5.3. Protein contacts and contact maps

  • the problem of predicting protein contacts and contact maps is harder and still unsolved
  • there is also a fairly extensive body of work using similar shallow and deep learning methods to predict other features related to structure
  • This is also an area where other forms of deep learning, in terms of deep graphical models, have been used.

5.4. Protein functional features

  •  inner and outer deep learning methods have also been applied to various problems in computational immunology

6. Deep learning in genomics and transcriptomics

  • Current modern applications of deep learning in genomics are focusing on the analysis of actual DNA or RNA sequences and the inference of functional properties and phenotypic consequences associated with mutations.

7. Deep learning in biomedical imaging

  • transfer learning
  • These pretrained architectures usually have many parameters; however, the danger of overfitting is limited because they have already been trained with many images.
  • model compression
  • classification, localization and segmentation
  • an explosion of applications in the coming years in this area
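
The transfer learning idea noted above can be sketched without any deep learning library: keep the "pretrained" feature extractor frozen and fit only a small classification head on the new task. In this toy, the frozen weights are random stand-ins for real pretrained weights, the four data points are invented, and every name is hypothetical. It is meant only to show which parameters get updated, not how real imaging models are built.

```python
import math
import random

rng = random.Random(0)

# Stand-in for a pretrained feature extractor: these weights are never updated.
FROZEN_WEIGHTS = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(5)]

def extract_features(x):
    """Frozen layer: maps a 2D input to 5 tanh features."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in FROZEN_WEIGHTS]

def predict(head, feats):
    """Logistic-regression head on top of the frozen features."""
    z = sum(w * f for w, f in zip(head, feats))
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(head, data, eps=1e-12):
    """Mean binary cross-entropy of the head on the data set."""
    total = 0.0
    for x, y in data:
        p = predict(head, extract_features(x))
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)

def train_head(data, epochs=200, lr=0.5):
    """Stochastic gradient descent on the head only; the extractor stays fixed."""
    head = [0.0] * len(FROZEN_WEIGHTS)
    for _ in range(epochs):
        for x, y in data:
            feats = extract_features(x)
            error = predict(head, feats) - y
            head = [w - lr * error * f for w, f in zip(head, feats)]
    return head

# Tiny invented data set: two positive and two negative examples.
data = [((1.0, 1.0), 1), ((0.8, 1.2), 1), ((-1.0, -1.0), 0), ((-1.2, -0.7), 0)]

frozen_before = [row[:] for row in FROZEN_WEIGHTS]
loss_before = log_loss([0.0] * len(FROZEN_WEIGHTS), data)
head = train_head(data)
loss_after = log_loss(head, data)
print(loss_after < loss_before)
```

Only the five head weights are fitted here. Training far fewer parameters than the full network is one reason a small biomedical data set can still be usable with a large pretrained architecture.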

8. Deep learning in health care

  • One important challenge in the latter area is how to incorporate temporal information