Paper reading (64): ML Leveraging Genomes Identifies Influential Antibiotic Resistance Genes

Paper title: Machine Learning Leveraging Genomes from Metagenomes Identifies Influential Antibiotic Resistance Genes in the Infant Gut Microbiome

Google Scholar citations: 9

Pages: 12

Publication date: 2018.01

Published in: American Society for Microbiology Journals

Authors: Sumayah F. Rahman, Matthew R. Olm, ..., Jillian F. Banfield

Abstract:

Antibiotic resistance in pathogens is extensively studied, and yet little is known about how antibiotic resistance genes of typical gut bacteria influence microbiome dynamics. Here, we leveraged genomes from metagenomes to investigate how genes of the premature infant gut resistome correspond to the ability of bacteria to survive under certain environmental and clinical conditions. We found that formula feeding impacts the resistome. Random forest models corroborated by statistical tests revealed that the gut resistome of formula-fed infants is enriched in class D beta-lactamase genes. Interestingly, Clostridium difficile strains harboring this gene are at higher abundance in formula-fed infants than C. difficile strains lacking this gene. Organisms with genes for major facilitator superfamily drug efflux pumps have higher replication rates under all conditions, even in the absence of antibiotic therapy. Using a machine learning approach, we identified genes that are predictive of an organism’s direction of change in relative abundance after administration of vancomycin and cephalosporin antibiotics. The most accurate results were obtained by reducing annotated genomic data to five principal components classified by boosted decision trees. Among the genes involved in predicting whether an organism increased in relative abundance after treatment are those that encode subclass B2 beta-lactamases and transcriptional regulators of vancomycin resistance. This demonstrates that machine learning applied to genome-resolved metagenomics data can identify key genes for survival after antibiotics treatment and predict how organisms in the gut microbiome will respond to antibiotic administration.

IMPORTANCE 

The process of reconstructing genomes from environmental sequence data (genome-resolved metagenomics) allows unique insight into microbial systems. We apply this technique to investigate how the antibiotic resistance genes of bacteria affect their ability to flourish in the gut under various conditions. Our analysis reveals that strain-level selection in formula-fed infants drives enrichment of beta-lactamase genes in the gut resistome. Using genomes from metagenomes, we built a machine learning model to predict how organisms in the gut microbial community respond to perturbation by antibiotics. This may eventually have clinical applications.

Article structure:

1. Introduction

2. Results and discussion

2.1 Antibiotic resistance of the premature infant microbiome

2.2 Formula feeding influences the gut resistome through strain-level selection

2.3 Major facilitator superfamily (MFS) pumps are associated with increased replication

2.4 A model that predicts an organism’s response to vancomycin and cephalosporins

3. Materials and methods

3.1 Sample collection, sequencing, assembly, and gene prediction

3.2 Genome recovery and calculation of relative abundances

3.3 iRep calculation

3.4 Annotation

3.5 Statistical and computational analysis

3.6 Data availability

Selected excerpts from the article:

1. Biological Problem: What biological problems have been solved in this paper?

  • predicting whether an organism increased in relative abundance after treatment
  • identify key genes for survival after antibiotics treatment and predict how organisms in the gut microbiome will respond to antibiotic administration.
  • classify resistomes as belonging to either a formula-fed baby or a breast-fed baby

2. Main discoveries: What are the main discoveries in this paper?

  • Our analysis reveals that strain-level selection in formula-fed infants drives enrichment of beta-lactamase genes in the gut resistome.
  • Using genomes from metagenomes, we built a machine learning model to predict how organisms in the gut microbial community respond to perturbation by antibiotics.
  • we used genome-resolved metagenomics coupled with statistical and machine learning approaches to investigate the gut resistome of 107 longitudinally sampled premature infants.
  • We show that certain antibiotic resistance genes in particular genomes affect how clinical factors influence the gut microbiome and, in turn, how the antibiotic resistance capabilities of a gut organism influence its growth and relative abundance.

3. ML(Machine Learning) Methods: What are the ML methods applied in this paper?

  • Random forest models were used to classify resistomes as belonging to either a formula-fed baby or a breast-fed baby, and we used the feature importance scores of the trained models to select resistance genes for further study
  • Principal-component analysis (PCA) was performed on Resfams and KEGG annotations to generate a low-dimensional representation of each organism’s metabolic potential and resistance potential. The first five principal components (PCs) cumulatively explained 48% of the variation in the data set.
  • Using these PCs as input, the AdaBoost-SAMME algorithm was applied, with decision tree classifiers as base estimators. The model, trained on 70% of the data, performed extremely well on the validation set, with a precision value of 1.0 and a recall value of 1.0, indicating that every genome was correctly classified. Because the validation set was utilized for testing during the preliminary stages of model development, the model was also evaluated with a final test set, with which it achieved 0.9 precision and 0.7 recall.
  • The dataset used was comprised of 597 previously reported samples (55–57) and 305 new samples. These samples are available at NCBI under accession number SRP114966. The code for the analysis, along with all the data and metadata used in the analysis, is hosted at https://github.com/SumayahR/antibiotic-resistance.
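
The boosting step described above can be illustrated with a small sketch. The paper used scikit-learn's AdaBoost-SAMME implementation on five principal components; the code below is not that code. It is a hand-written, standard-library-only version of SAMME with one-level decision trees (stumps), run on a made-up one-dimensional feature that stands in for a PC score. The function names (`fit_stump`, `samme_fit`, `samme_predict`, `precision_recall`), the toy values, and the "up"/"down" labels are all assumptions chosen for illustration.

```python
import math

def fit_stump(X, y, w):
    """Pick the one-feature threshold rule with the lowest weighted error."""
    labels = sorted(set(y))
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for t in ((a + b) / 2 for a, b in zip(values, values[1:])):
            for left in labels:
                for right in labels:
                    err = sum(wi for row, yi, wi in zip(X, y, w)
                              if (left if row[j] <= t else right) != yi)
                    if best is None or err < best[0]:
                        best = (err, (j, t, left, right))
    return best[1]

def stump_predict(stump, row):
    j, t, left, right = stump
    return left if row[j] <= t else right

def samme_fit(X, y, n_rounds):
    """SAMME boosting: reweight misclassified samples after every round."""
    n, k = len(X), len(set(y))
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        miss = [stump_predict(stump, row) != yi for row, yi in zip(X, y)]
        err = sum(wi for wi, m in zip(w, miss) if m)
        if err >= 1 - 1 / k:
            break  # stump is no better than random guessing
        err = max(err, 1e-10)  # avoid division by zero for a perfect stump
        alpha = math.log((1 - err) / err) + math.log(k - 1)
        ensemble.append((alpha, stump))
        w = [wi * math.exp(alpha) if m else wi for wi, m in zip(w, miss)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def samme_predict(ensemble, row):
    """Return the label with the largest alpha-weighted vote."""
    votes = {}
    for alpha, stump in ensemble:
        label = stump_predict(stump, row)
        votes[label] = votes.get(label, 0.0) + alpha
    return max(votes, key=votes.get)

def precision_recall(y_true, y_pred, positive):
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy data (hypothetical): one PC-like score per genome and whether the
# genome's relative abundance went "up" or "down" after treatment.
X = [[1], [2], [3], [4], [5], [6], [7], [8], [9]]
y = ["up", "up", "up", "down", "down", "down", "up", "up", "up"]

ensemble = samme_fit(X, y, n_rounds=3)
predictions = [samme_predict(ensemble, row) for row in X]
print(precision_recall(y, predictions, positive="up"))
```

No single stump can separate this toy data, but the weighted vote of three stumps can, which is the point of boosting. With only two classes the `log(k - 1)` term is zero, so SAMME reduces to classic discrete AdaBoost. The scores here are computed on the training data only; the paper reports precision and recall on separate validation and test sets.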

4. ML Advantages: Why are these ML methods better than the traditional methods in these biological problems?

  • Previous studies have utilized data from 16S rRNA gene amplicon sequencing or read-based metagenomics of the human microbiome to predict life events and disease states of the human host using machine learning or other modeling techniques.
  • Difficulty of the current problem: read-based metagenomics lacks resolution at the genomic level, and, due to strain-level differences in antibiotic resistance, taxonomy data from marker gene studies cannot be used to predict how particular organisms in a community will respond to antibiotics.
  • Using scikit-learn, development of a machine learning model to predict the direction of change in relative abundance for each genome based on its Resfams and KEGG metabolism data was attempted, and yet an adequate model could not be developed, presumably due to variations in the ways in which organisms respond to different antibiotic combinations. 

5. Biological Significance: What is the biological significance of these ML methods’ results?

  • Mann-Whitney U tests were performed on Resfams genes that had feature importance scores above 0.07 in the random forest model, as calculated by the Gini importance metric.
  • the model that exhibited the best results with regard to precision and recall was selected.
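
The follow-up test described above (keep the gene families whose importance score exceeds 0.07, then compare the two feeding groups with a Mann-Whitney U test) can be sketched as follows. This is a minimal standard-library version, assuming a two-sided test with the normal approximation and no tie or continuity correction; it is not the authors' code. The family names, importance scores, and per-infant counts are made-up placeholders, not values from the paper.

```python
import math

def mann_whitney_u(a, b):
    """Return (U_a, U_b, two-sided p) using midranks and a normal approximation."""
    pooled = sorted((v, g) for g, sample in enumerate((a, b)) for v in sample)
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j (1-based).
        midrank = (i + 1 + j) / 2
        rank_sum_a += midrank * sum(1 for _, g in pooled[i:j] if g == 0)
        i = j
    n1, n2 = len(a), len(b)
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    u_b = n1 * n2 - u_a
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_a - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the normal
    return u_a, u_b, p

# Hypothetical Gini importance scores from a trained random forest.
importances = {"family_A": 0.12, "family_B": 0.03}

# Hypothetical gene counts per infant: (formula-fed, breast-fed).
counts = {
    "family_A": ([3, 4, 5, 6], [0, 1, 1, 2]),
    "family_B": ([1, 0, 2, 1], [1, 1, 0, 2]),
}

# Only families above the 0.07 threshold go on to the statistical test.
selected = [gene for gene, score in importances.items() if score > 0.07]
for gene in selected:
    formula_fed, breast_fed = counts[gene]
    u_a, u_b, p = mann_whitney_u(formula_fed, breast_fed)
    print(gene, u_a, u_b, round(p, 4))
```

For small samples or heavy ties, an exact or tie-corrected test (for example `scipy.stats.mannwhitneyu`) would be the better choice; the approximation here only illustrates the idea.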

6. Prospect: What are the potential applications of these machine learning methods in biological science?

  • This may eventually have clinical applications.
  • This has tremendous potential for application in the fields of medicine and microbial ecology.
  • For example, such a model can be used before administering drugs to a patient to verify that a particular combination of antibiotics will not lead to overgrowth of an undesirable microbe.
  • Our report serves as a proof of concept for this application of machine learning used in conjunction with genome-resolved metagenomics to derive biological insight.

7. My Questions (Optional)
