Paper reading (87): Human microbiome aging clocks based on DL

Paper title: Human microbiome aging clocks based on deep learning and tandem of permutation feature importance and accumulated local effects

Scholar citations: 3

Pages: 16

Published: 2019.01 (this article is a preprint and has not been certified by peer review)

Authors: Fedor Galkin, Aleksandr Aliper, ..., Alex Zhavoronkov

Abstract:

The human gut microbiome is a complex ecosystem that both affects and is affected by its host status. Previous analyses of gut microflora revealed associations between specific microbes and host health and disease status, genotype and diet. Here, we developed a method of predicting biological age of the host based on the microbiological profiles of gut microbiota using a curated dataset of 1,165 healthy individuals (3,663 microbiome samples). Our predictive model, a human microbiome clock, has an architecture of a deep neural network and achieves the accuracy of 3.94 years mean absolute error in cross-validation. The performance of the deep microbiome clock was also evaluated on several additional populations. We further introduce a platform for biological interpretation of individual microbial features used in age models, which relies on permutation feature importance and accumulated local effects. This approach has allowed us to define two lists of 95 intestinal biomarkers of human aging. We further show that this list can be reduced to 39 taxa that convey the most information on their host’s aging. Overall, we show that (a) microbiological profiles can be used to predict human age; and (b) microbial features selected by models are age-related.

Paper structure:

1. Introduction

2. Methods

2.1 Data acquisition

2.2 Abundance calculation

2.3 Neural networks training

2.4 Oversampling

2.5 Feature importance

3. Results

3.1 Age prediction using machine learning

3.2 Microbiological influence on age prediction

3.3 Age bracket prediction with DNN

3.4 Host-based age prediction

4. Discussion

5. Conclusion

Excerpts from the main text:

1. Biological Problem: What biological problems have been solved in this paper?

  • predicting biological age

2. Main discoveries: What are the main discoveries in this paper?

  • Our predictive model, a human microbiome clock, has an architecture of a deep neural network and achieves the accuracy of 3.94 years mean absolute error in cross-validation.
  • This approach has allowed us to define two lists of 95 intestinal biomarkers of human aging. We further show that this list can be reduced to 39 taxa that convey the most information on their host’s aging.
  • Although surprising at first glance, bacterial influence on age prediction is not determined by whether it is beneficial to the host or not. 
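
The 3.94-year figure above is a mean absolute error (MAE) averaged over cross-validation folds. The following is a minimal sketch of how such a number could be computed, in pure Python. The `fit` interface, the fold count `k=5`, and the mean-age baseline are illustrative assumptions, not details taken from the paper. The sketch also splits by sample; the paper's outline lists a separate host-based setting, which would need all samples from one host to stay in the same fold.

```python
import random
from statistics import mean

def mean_absolute_error(y_true, y_pred):
    """Average of |true age - predicted age|, in years."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k roughly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validated_mae(X, y, fit, k=5, seed=0):
    """`fit(X_train, y_train)` must return a `predict(row)` function (assumed interface)."""
    fold_maes = []
    for test_idx in k_fold_indices(len(y), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = [predict(X[i]) for i in test_idx]
        fold_maes.append(mean_absolute_error([y[i] for i in test_idx], preds))
    return mean(fold_maes)

def fit_mean_baseline(X_train, y_train):
    """Trivial stand-in model: always predict the mean training age."""
    avg = mean(y_train)
    return lambda row: avg

# Toy data: ten samples with one dummy feature each and made-up ages.
X_toy = [[0.0] for _ in range(10)]
y_toy = [22, 35, 41, 58, 63, 29, 47, 70, 33, 52]
print(cross_validated_mae(X_toy, y_toy, fit_mean_baseline))
```

Any real regressor can be dropped in for `fit_mean_baseline`; the baseline is only there to make the example run without external libraries.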

3. ML(Machine Learning) Methods: What are the ML methods applied in this paper?

  • we developed a method of predicting biological age of the host based on the microbiological profiles of gut microbiota using a curated dataset of 1,165 healthy individuals (3,663 microbiome samples).
  • We also developed a method for microbiological feature selection and annotation. It combines two-fold feature importance assessment using PFI and ALE approaches upon training a DNN. 
  • We applied multiple methods to build a regressor that takes in profiles containing abundances for all 1,673 taxa reliably detected in at least 0.13% of samples, including random forest, support vector machine, elastic net, gradient boosting (XGB) and deep neural network (DNN). However, only the latter two models achieved predictions better than random.
  • The authors build a predictor of age from whole genome sequencing (WGS) data aggregated from multiple sources, using various machine learning techniques, and use it to examine patterns of incessant microflora succession.
  • we report a method to estimate a host’s age based on their microflora taxonomic profile, assess the importance of specific taxa in organismal aging, and suggest candidate geroprotective microbiological interventions.
  • The best performing model architecture was determined in the sample-based setting. It contains three hidden layers with 512 nodes in each, with PReLU activation function, Adam optimizer, dropout fraction 0.5 at each layer, and 0.001 learning rate.
  • Age classifier models were trained using a subset of either 95 features or 39 features. 
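
The architecture quoted above (three hidden layers of 512 nodes, PReLU, dropout 0.5) can be sketched as a forward pass in pure Python. This is only an illustration of the layer stack under stated assumptions: the He-style weight initialisation, the starting PReLU slope of 0.25, the single linear output unit, and the placement of dropout after each hidden activation are not given in the notes. Training with Adam at a 0.001 learning rate is not shown, and all function names are hypothetical.

```python
import random

def prelu(x, alpha):
    """PReLU: identity for positive inputs, learnable slope `alpha` for negative ones."""
    return x if x > 0 else alpha * x

def dense(n_in, n_out, rng):
    """One fully connected layer with He-style random weights and zero biases."""
    scale = (2.0 / n_in) ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    # alpha = 0.25 is an assumed starting slope; it is unused by the output layer.
    return {"W": weights, "b": [0.0] * n_out, "alpha": 0.25}

def build_regressor(n_features, hidden=(512, 512, 512), seed=0):
    """Three hidden layers by default, plus one linear output unit (predicted age)."""
    rng = random.Random(seed)
    sizes = [n_features, *hidden]
    layers = [dense(a, b, rng) for a, b in zip(sizes, sizes[1:])]
    return {"hidden": layers, "out": dense(sizes[-1], 1, rng)}

def forward(model, x, dropout=0.5, training=False, rng=None):
    """Map one abundance profile `x` to a single predicted value."""
    rng = rng or random.Random(0)
    h = list(x)
    for layer in model["hidden"]:
        h = [prelu(sum(w * v for w, v in zip(row, h)) + b, layer["alpha"])
             for row, b in zip(layer["W"], layer["b"])]
        if training:
            # Inverted dropout: zero each unit with probability `dropout`,
            # rescale the survivors so the expected activation is unchanged.
            h = [0.0 if rng.random() < dropout else v / (1.0 - dropout) for v in h]
    out = model["out"]
    return sum(w * v for w, v in zip(out["W"][0], h)) + out["b"][0]

# Toy usage with small layer sizes so that the example runs instantly.
toy = build_regressor(n_features=6, hidden=(8, 8, 8))
print(forward(toy, [0.1, 0.0, 0.3, 0.2, 0.0, 0.4]))
```

Calling `build_regressor(1673)` would give the input width mentioned in the excerpt (1,673 taxa) with the default 512-node layers. In practice a deep learning framework would be used instead of hand-written loops.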

4. ML Advantages: Why are these ML methods better than the traditional methods in these biological problems?

  • To verify the results obtained with DNN, we implemented random forest, support vector machine and elastic net regressor. 
  • All of these methods performed poorly compared to the DNN approach with the mean absolute errors exceeding 11 years.
  • Apart from them, we trained a gradient boosting (XGB) regressor with accuracy comparable to the DNN model (MAE = 4.69 years, R² = 0.81).
  • According to Permutation Feature Importance (PFI) scores, DNN regressor is more sensitive to highly abundant species, while XGB regressor contains some minor taxa among its most important features. We consider this an indication of DNN’s increased robustness compared to other methods.
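
Permutation Feature Importance, as referred to above, scores a feature by how much the model's error grows when that feature's column is shuffled across samples. Below is a minimal, model-agnostic sketch of the general technique in pure Python. The use of MAE as the error measure, the `n_repeats` value, and the toy linear predictor are assumptions for illustration, not the authors' exact procedure.

```python
import random
from statistics import mean

def mae(y_true, y_pred):
    """Mean absolute error between true and predicted ages."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Return, per feature, the mean increase in MAE after shuffling that column."""
    rng = random.Random(seed)
    baseline = mae(y, [predict(row) for row in X])
    scores = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [list(row[:j]) + [v] + list(row[j + 1:])
                      for row, v in zip(X, column)]
            increases.append(mae(y, [predict(row) for row in X_perm]) - baseline)
        scores.append(mean(increases))
    return scores

# Toy example: the hypothetical model uses feature 0 only, so shuffling
# feature 1 cannot change its predictions.
X_toy = [[i / 20, (i * 7 % 20) / 20] for i in range(20)]
toy_predict = lambda row: 20 + 60 * row[0]
y_toy = [toy_predict(row) for row in X_toy]
print(permutation_importance(toy_predict, X_toy, y_toy))
```

A larger score means the model leans more heavily on that taxon. Because shuffling one column ignores correlations between taxa, the scores can mislead when features are interdependent, which is the motivation the notes give for following PFI with ALE.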

5. Biological Significance: What is the biological significance of these ML methods’ results?

  • We further introduce a platform for biological interpretation of individual microbial features used in age models, which relies on permutation feature importance and accumulated local effects.
  • Despite great performance of XGB (MAE = 4.69 years) and DNN models (MAE = 3.94 years), extracting biologically relevant information from them presents a major challenge.
  • We implemented ALEs approach using DNN regressor as a reference and its 95 most important features to see how changes in abundance affect the predictions. ALE is a technique that theoretically surpasses PFI as it takes into account intrinsic interdependence of microbiological features. 
  • According to our ALE analysis, only 39/95 features could change the average predicted age by more than 1 year.
  • Interestingly, reducing the number of features by 59% caused only a 5% drop in F-score for the age bracket classification task. This suggests that the ALEs technique succeeded in selecting only the most relevant microbial features.
  • A weighted F1-score was selected as the target metric to assess model performance.
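
The ALE step described above can be sketched for a single feature as follows: split the feature's observed range into quantile bins, average the prediction change across each bin using only the samples that fall inside it, then accumulate and centre those local effects. This is a simplified first-order ALE in pure Python. The bin count, the centring rule, and the reading of the "more than 1 year" criterion as the spread between the highest and lowest ALE value are assumptions, and all function names are hypothetical.

```python
from bisect import bisect_left
from statistics import mean

def with_value(row, j, value):
    """Copy of `row` with feature j replaced by `value`."""
    new = list(row)
    new[j] = value
    return new

def ale_1d(predict, X, j, n_bins=10):
    """First-order ALE of feature j. Returns (bin_edges, centred_effect_at_each_edge)."""
    values = sorted(row[j] for row in X)
    n = len(values)
    # Quantile edges taken from observed values, so every bin holds at least one sample.
    edges = sorted({values[(k * (n - 1)) // n_bins] for k in range(n_bins + 1)})
    if len(edges) < 2:
        return edges, [0.0] * len(edges)
    diffs = [[] for _ in range(len(edges) - 1)]
    for row in X:
        k = max(1, bisect_left(edges, row[j]))  # bin k covers (edges[k-1], edges[k]]
        hi = predict(with_value(row, j, edges[k]))
        lo = predict(with_value(row, j, edges[k - 1]))
        diffs[k - 1].append(hi - lo)
    accumulated = [0.0]
    for d in diffs:
        accumulated.append(accumulated[-1] + mean(d))
    # Centre so that the sample-weighted average effect is zero.
    weights = [len(d) for d in diffs]
    mids = [(a + b) / 2 for a, b in zip(accumulated, accumulated[1:])]
    centre = sum(w * m for w, m in zip(weights, mids)) / sum(weights)
    return edges, [a - centre for a in accumulated]

def ale_range(predict, X, j, n_bins=10):
    """Spread between the largest and smallest accumulated effect of feature j."""
    _, effects = ale_1d(predict, X, j, n_bins)
    return max(effects) - min(effects)

def select_features(predict, X, threshold=1.0, n_bins=10):
    """Indices of features whose ALE spread exceeds `threshold` (years)."""
    return [j for j in range(len(X[0])) if ale_range(predict, X, j, n_bins) > threshold]

# Toy example: the hypothetical model depends on feature 0 only.
X_toy = [[i / 20, (i * 7 % 20) / 20] for i in range(21)]
toy_predict = lambda row: 20 + 30 * row[0]
print(select_features(toy_predict, X_toy))
```

Because each local difference is computed only on samples that actually lie in the bin, ALE avoids evaluating the model on unrealistic feature combinations, which is the advantage over PFI that the excerpt points to.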

6. Prospect: What are the potential applications of these machine learning methods in biological science?

  • To the best of our knowledge, we present the first method to predict human chronological age using gut microbiota abundance profiles.
  • Overall, we show that (a) microbiological profiles can be used to predict human age; and (b) microbial features selected by models are age-related.
  • The identified biomarkers include species whose abundance is positively or negatively correlated with predicted age. These species merit further investigation by the community to improve our understanding of human aging and its relationship with the gut microbiome.

7. My Questions (Optional)
