Chapter 16: Artificial Intelligence, Machine Learning and Deep Learning in Real-Life Drug Design Cases

Article link: https://blog.csdn.net/weixin_52812620/article/details/126802329

Reading notes on "Artificial Intelligence in Drug Design"

Table of Contents

1.Introduction
2.Application Domains
3.Conclusion and Outlook

1.Introduction


2.Application Domains

2.1.Structure-Based Virtual Screening (SBVS)

2.1.1.Development of Scoring Functions: Virtual Screening, Binding Mode Detection, and Binding Affinity

Benchmarking studies on the PDBbind database have shown that RF and boosted regression tree scoring functions outperform classical scoring functions. Another RF scoring function, ΔvinaRF20, outperformed classical scoring functions in all tests of both CASF-2013 and CASF-2007.
For DL, Ragoza et al. introduced a convolutional neural network (CNN) scoring function that takes as input a 3D grid of the protein–ligand complex, with features stored at each grid point.
Alternatively, protein–ligand information can be stored as atomic features (atom types, partial charges, etc.) for the context/neighborhood of each atom in the compound. This type of representation is used, for example, by DeepVS.
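The grid-based input described above can be illustrated with a minimal sketch. It places atoms on a cubic grid with one occupancy channel per element. The box size, resolution, channel list, and the function name `voxelize` are assumptions made for illustration, not the settings of Ragoza et al., and simple per-voxel counts are used here in place of whatever feature encoding the original work applies.

```python
# Illustrative sketch only: grid size, resolution and channels are assumptions,
# not the settings used by Ragoza et al.

def voxelize(atoms, center, box=8.0, resolution=1.0, channels=("C", "N", "O")):
    """Map atoms onto a cubic grid with one occupancy channel per element.

    atoms: list of (element, x, y, z) tuples in angstroms.
    center: (x, y, z) of the grid center, e.g. a binding-site centroid.
    Returns a sparse dict {(channel_index, i, j, k): atom_count}.
    """
    n = int(box / resolution)              # number of voxels per edge
    half = box / 2.0
    grid = {}
    for element, x, y, z in atoms:
        if element not in channels:
            continue                       # skip elements without a channel
        idx = [int((coord - c + half) // resolution)
               for coord, c in zip((x, y, z), center)]
        if all(0 <= i < n for i in idx):   # drop atoms outside the box
            key = (channels.index(element), *idx)
            grid[key] = grid.get(key, 0) + 1
    return grid


# Hypothetical coordinates; the third atom lies outside the 8 A box.
example_atoms = [("C", 0.2, 0.1, 0.0), ("O", 1.4, 0.0, 0.0), ("N", 9.0, 0.0, 0.0)]
example_grid = voxelize(example_atoms, center=(0.0, 0.0, 0.0))
```

A dense 4D array (channels × x × y × z) built from this sparse dictionary is the kind of tensor a 3D CNN would consume.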

2.1.2.Enable Docking of Billions of Compounds

DL strategies have also been implemented to circumvent the computational limits of large-scale docking campaigns. For example, Deep Docking trains a feed-forward deep neural network (DNN) on the docking scores of subsets of the library and iteratively removes unfavorable molecules based on molecular descriptors.
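The iterative idea can be sketched as follows: dock a small sample, fit a cheap surrogate on descriptors, and prune the molecules predicted to score poorly. This is a toy illustration under strong assumptions. A one-descriptor least-squares line stands in for the DNN, and `iterative_screen`, `fit_line`, `keep_fraction`, and the toy `dock` function are hypothetical names, not part of Deep Docking.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit y = a*x + b (a stand-in for the DNN surrogate)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return a, my - a * mx


def iterative_screen(descriptors, dock, n_iter=3, sample_size=20,
                     keep_fraction=0.5, seed=0):
    """Shrink a library by docking small samples and pruning with a surrogate.

    descriptors: dict {mol_id: float}, one hypothetical descriptor per molecule.
    dock: callable mol_id -> docking score (lower is better); the costly step.
    Returns (surviving mol_ids, dict of actually docked scores).
    """
    rng = random.Random(seed)
    remaining = sorted(descriptors)
    docked = {}
    for _ in range(n_iter):
        undocked = [m for m in remaining if m not in docked]
        for m in rng.sample(undocked, min(sample_size, len(undocked))):
            docked[m] = dock(m)                  # dock only a small sample
        a, b = fit_line([descriptors[m] for m in docked], list(docked.values()))
        predicted = {m: a * descriptors[m] + b for m in remaining}
        remaining.sort(key=predicted.get)        # lowest predicted score first
        remaining = remaining[: max(1, int(len(remaining) * keep_fraction))]
    return remaining, docked


# Toy library in which the score depends linearly on the descriptor.
toy_descriptors = {i: i / 100 for i in range(200)}
survivors, docked_scores = iterative_screen(
    toy_descriptors, dock=lambda m: -10.0 * toy_descriptors[m])
```

In this toy run only a fraction of the 200 molecules is ever docked, which is the point of the approach: the expensive docking step is spent on samples, and the surrogate decides what to discard.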

2.2.Ligand-Based Virtual Screening (LBVS)

2.2.1.Prediction of Primary Biological Activity

One popular example of this approach using DL is the work of Dahl and Jaitly, who used multitask neural networks (MTNN) to successfully predict active compounds for 19 targets.
More recently, another group accurately predicted highly and weakly potent kinase inhibitors for more than 100 kinases. The constructed multitask (MT) models showed top median Matthews correlation coefficient (MCC) values exceeding 0.75.
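For reference, the MCC quoted above is computed from the four cells of a binary confusion matrix as (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). A minimal sketch follows; the function names and the example counts are illustrative and do not come from the study.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for a classifier on 100 compounds.
example_mcc = mcc(tp=40, tn=45, fp=5, fn=10)
```

MCC ranges from −1 to 1, with 0 corresponding to random prediction, which makes it more informative than accuracy on imbalanced activity data.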
In 2020, Arshadi et al. constructed DeepMalaria. This algorithm is a graph convolutional neural network (GCNN) trained on known Plasmodium falciparum Dd2 inhibitors and combined with a transfer learning step.
Miljkovic et al. built ML models to predict the mode of action of noncovalent kinase inhibitors: ATP competitors, inactive form stabilizers, and allosteric modulators.

2.2.2.HTS Analysis

In HTS, frequent hitters are unwanted. These include irreversible inhibitors (when covalent compounds are not desired), pan-assay interference compounds (PAINS), spectroscopic interference compounds, aggregators, and promiscuous compounds. Avoiding such compounds, or at least annotating them, is crucial to avoid wasting time and effort on them. Thus, many groups try to identify such compounds in advance.
In 2019, Blaschke et al. explored the use of ML models to distinguish different classes of promiscuous and nonpromiscuous compounds.
More recently, a group investigated the prediction of autofluorescent compounds. Following this study, the same group proposed the web tool InterPred for the prediction of autofluorescence and luminescence interference.
Figure 2 gives an overview of the different techniques used to screen libraries of compounds, either virtually or experimentally, now complemented with AI, as described in the previous paragraphs.

2.2.3.Library Design

For many decades, prioritizing compounds with desirable properties before any screening has been a common task. The most trivial example is the design of libraries containing drug-like compounds by applying Lipinski's rule of five.
On this basis, in 2019, a French consortium decided to use ML techniques to design a library focused on PPI inhibitor-like compounds.
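The rule-of-five filter mentioned above is simple enough to sketch directly. The thresholds are Lipinski's (molecular weight ≤ 500 Da, logP ≤ 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). Tolerating one violation is a common convention assumed here, and the compound names and property values are hypothetical inputs that would normally come from a cheminformatics toolkit.

```python
# Thresholds follow Lipinski's rule of five; the library below is hypothetical.

def rule_of_five_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count how many of the four rule-of-five criteria a compound breaks."""
    return sum([
        mol_weight > 500,     # molecular weight above 500 Da
        logp > 5,             # logP above 5
        h_donors > 5,         # more than 5 hydrogen-bond donors
        h_acceptors > 10,     # more than 10 hydrogen-bond acceptors
    ])


def is_drug_like(props, max_violations=1):
    """Assumed convention: accept compounds with at most one violation."""
    return rule_of_five_violations(**props) <= max_violations


library = {
    "mol_A": dict(mol_weight=320.4, logp=2.1, h_donors=2, h_acceptors=5),
    "mol_B": dict(mol_weight=612.7, logp=5.8, h_donors=3, h_acceptors=9),
    "mol_C": dict(mol_weight=540.0, logp=4.2, h_donors=4, h_acceptors=8),
}
drug_like = [name for name, props in library.items() if is_drug_like(props)]
```

An ML-designed focused library replaces such fixed thresholds with a model trained on known members of the target class, but the filtering step has the same shape.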

2.3.Drug Repurposing

2.3.1.Drug-Target Interaction Prediction Using Chemical Information

Unterthiner et al. compared MT-DNN with ECFP12 to seven target prediction methods (SVM, BKD, Logistic Regression, KNN, Pipeline Pilot Bayesian Classifier, Parzen-Rosenblatt, and Similarity Ensemble Approach [SEA]) on the ChEMBL database (1230 targets).
DEEPScreen uses a CNN with 2D images of compounds as input and was used to identify JAK as a target of the drug cladribine.

2.3.2.Drug-Target Interaction Prediction Using Chemical Information and Protein Sequence

Hu et al. developed a CNN classifier using amino acid physicochemical properties (target) and PaDEL descriptors (drugs). For enzymes, ion channels and G protein-coupled receptors (GPCRs), they achieved accuracies greater than 90%, outperforming RF and KNN.

2.3.3.Drug-Therapeutic Use Prediction

DNNs were trained on transcriptomic data from the Library of Integrated Network-Based Cellular Signatures (LINCS) project to predict therapeutic use categories derived from medical subject headings (MeSH).

2.4.Drug Sensitivity

In recent years, a number of data repositories integrating diverse genomic, transcriptomic and proteomic information on cell lines and their sensitivity to drugs have become available: Genomics of Drug Sensitivity in Cancer (GDSC), COSMIC Cell Lines Project (CCLP), Cancer Cell Line Encyclopedia (CCLE) and the NCI-60 panel.
In pioneering work by Menden et al., chemical information on the physicochemical properties and fingerprint features of 111 drugs was integrated for the first time with the genomic profiles of 608 tumor cell lines (GDSC; 38,930 dose–response curves for compound–cell line pairs, 58% matrix coverage) using feed-forward neural networks (FNN) and RF.
Another study on the NCI-60 panel, with a more restricted number of cell lines but a higher number of compounds, integrated cell line profiling data with circular Morgan descriptors using RF and conformal prediction (941,831 points, 93% coverage, R² = 0.83 on the test set).
DL models have also been explored to predict drug effectiveness. CDRscan uses the mutational status of genomic positions and molecular fingerprints.

2.5.De Novo Design

Segler et al. used LSTM RNNs with transfer learning to build target-based libraries and retrospectively generated 18% and 28% of true actives for two antibiotic targets.
Generative models using transcriptomic data design compounds with higher similarity to active compounds than those identified based on comparisons of expression signatures.

2.6.Reactions and Synthetic Accessibility

The very first attempt at providing a computational retrosynthesis tool was proposed by Pensak and Corey in 1969. As a part of the LHASA software, it is rule-based and assists chemists by helping them find a suitable precursor to a target molecule.
In order to find the most probable synthetic routes, many authors suggested the use of a synthetic accessibility (SA) score. In 2018, Coley et al. proposed the synthetic complexity score (SCScore).
In 2017, Segler et al. published a first paper demonstrating the benefit of applying a neural network model to predict the most probable reaction rule to be applied to a molecule. The proposed methodology outperformed both the logistic regression model and the rule-based expert system tested in parallel. This work has been updated with a second paper and the building of a DL network associated with a Monte Carlo tree search. On blind evaluation, experimental chemists did not show any preference between the synthetic routes proposed by the algorithm and the literature synthetic routes.
In 2018, Fooshee et al. introduced Reaction Predictor.
In 2019, Molecular Transformer by Schwaller et al. outperformed previously published tools for reaction prediction.
Other algorithms go further by directly predicting products as well as reaction conditions.
There are also attempts to predict the yield of specific classes of reactions.
Many other recent articles not reviewed in this section can be found in the literature.

2.7.ADME[T] Prediction

2.7.1.Solubility

The first solubility model was proposed by Irmann in 1965 as a group contribution approach and soon after updated by Hansch.
These important works led Ran and Yalkowsky to develop the now popular general solubility equation (GSE) for compounds in water using only two experimentally determined descriptors, the melting point and the octanol–water partition coefficient.
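In its commonly cited form, the GSE reads log S_w = 0.5 − 0.01 (MP − 25) − log K_ow, with the melting point MP in °C and the molar aqueous solubility S_w in mol/L; the melting-point term is set to zero for compounds that are liquid at 25 °C. A minimal sketch, with a hypothetical function name and illustrative inputs:

```python
def gse_log_solubility(melting_point_c, log_kow):
    """General solubility equation: log10 of molar aqueous solubility.

    melting_point_c: melting point in degrees Celsius.
    log_kow: log10 of the octanol-water partition coefficient.
    """
    # No crystal-lattice penalty for compounds that are liquid at 25 C.
    mp_term = max(melting_point_c - 25.0, 0.0)
    return 0.5 - 0.01 * mp_term - log_kow


# Illustrative inputs, not measured values for any specific compound.
solid_example = gse_log_solubility(melting_point_c=125.0, log_kow=2.0)
liquid_example = gse_log_solubility(melting_point_c=-10.0, log_kow=3.0)
```

Both example inputs give log S_w = −2.5: a 100 °C increase in melting point costs one log unit of solubility, the same as one log unit of lipophilicity.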
In the 2008 solubility challenge, the best model (ANN) achieved an RMSE of 0.99 log units and an R² of 0.71, which is comparable to the best human predictions. Recently, a new solubility challenge was published.
Although publications show DL methods outperforming other methods, there is still interest in developing more conventional models. For example, Avdeef developed RF regression models and was able to extract important features used by the models for solubility prediction.
In recent years, there has also been interest in predicting protein solubility.

2.7.2.Lipophilicity

There are many studies on lipophilicity prediction, such as those by Wu et al., Montanari et al., Li and Fourches, and Fuchs et al., among others.

2.7.3.Metabolism

Three main axes can be distinguished in metabolism prediction.
- The first one (1) is the prediction of the metabolic reactivity of a compound. In 2019, Wenzel et al. constructed DNN models to predict the metabolic liability of compounds. In other studies, authors proposed models that predict whether a compound will be metabolized by specific isoenzymes.
- The second axis (2) is the prediction of the sites of metabolism (SOMs). In those studies, the goal is to predict which atom(s) will undergo a metabolic transformation. Examples include MetScore, trained on the BIOVIA Metabolite Database, and FAME3.

2.7.4.Toxicity

Human Ether-à-go-go-Related Gene (hERG) cardiotoxicity is a main concern and many models have been published.
- Recently, Korolev et al. built GCNNs to predict multiple activities and properties of compounds.
- The same year, Ogura et al. constructed a dataset containing 9889 hERG inhibitors and 281,313 compounds inactive on hERG. Then, NN, DNN, RF, SVM, and linear discrimination models were constructed using selected descriptors derived from ECFP4, Pipeline Pilot, and MOE 2D/3D descriptors. The SVM model exhibited the best performance on the external set (Kappa equal to 0.749) and outperformed the DNN model constructed in this study as well as other well-known commercial software.
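The Kappa statistic reported above is Cohen's kappa, (p_o − p_e) / (1 − p_e), where p_o is the observed agreement (accuracy) and p_e the agreement expected by chance. The sketch below shows why it suits such an imbalanced dataset: using the class sizes quoted above, a hypothetical classifier that labels every compound inactive reaches about 97% accuracy but a kappa of zero. The function name and the classifier are illustrative.

```python
def cohen_kappa(tp, tn, fp, fn):
    """Cohen's kappa for a binary confusion matrix."""
    n = tp + tn + fp + fn
    p_observed = (tp + tn) / n
    # Chance agreement from the marginal totals of true and predicted labels.
    p_expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    if p_expected == 1.0:
        return 0.0            # degenerate case: only one class present
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical "always inactive" classifier on 9889 actives / 281,313 inactives.
n_active, n_inactive = 9889, 281313
trivial_accuracy = n_inactive / (n_active + n_inactive)
trivial_kappa = cohen_kappa(tp=0, tn=n_inactive, fp=0, fn=n_active)

# Hypothetical balanced example for comparison.
balanced_kappa = cohen_kappa(tp=40, tn=45, fp=5, fn=10)
```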
Prediction of hepatotoxicity is also quite challenging as many different mechanisms can lead a compound to induce liver injury.
- Wang et al. predicted three endpoints known to induce liver injury from transcriptomic responses.
- In 2020, Nguyen-Vo et al. constructed a CNN model using molecular descriptors to successfully predict hepatotoxic compounds.
- In 2017, Xu et al. constructed regression and classification models for acute oral toxicity prediction. For this, they built ECFP-based CNN models on a training set of 8080 molecules and a validation set of 2045 compounds.
- In 2019, Sosnin et al. conducted a comparative study for the prediction of acute oral toxicity for different species and different modes of administration. In this article, traditional ML methods, STNN, and MTNN were constructed with the use of many different types of descriptors. It appeared that MTNN gave the best results with an average RMSE of 0.71 for all endpoints.
Without going into details, many other toxic endpoints are studied, such as mutagenicity, skin/eye irritation and corrosion, endocrine disrupting chemicals, reactive metabolites and many others.

2.7.5.Other Endpoints

Montanari et al. predicted the membrane affinity, human serum albumin binding, and melting point by using MTNN GC with reasonable performances (R²=0.51–0.65).
Watanabe et al. predicted human renal excretion using traditional ML methods.
In silico prediction of compounds binding to human plasma protein was proposed by Sun et al. in 2018.
Another interesting article proposed to use molecular fingerprints derived from molecular dynamic simulations to predict P-gp substrates.
Many other endpoints related to ADME have been predicted, like Caco-2 permeability, human intestinal absorption, organic cation transporter protein 2 inhibitor, P-gp inhibitor, and many others.

2.8.Quantum Mechanics (QM)

Sets of compounds for which QM simulations are available are used to train NN, and predictions are then made for new molecules not included in the training set. Once trained, the NN predictions are many orders of magnitude faster than the QM calculations.

2.8.1.Total energies

In 2017, Schütt et al. developed a deep tensor neural network (DTNN) architecture in which the input molecular structures are encoded as a vector of nuclear charges and a matrix of interatomic distances, and the output is the total energy of the molecule. They employed two subsets of the GDB-13 database: the GDB-7 data set, including >7000 molecules with up to seven heavy atoms (C, N, O, F), and the GDB-9 data set, containing >133K molecules with up to nine heavy atoms.
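The input encoding described above, a vector of nuclear charges plus a matrix of interatomic distances, can be built from a list of atoms in a few lines. This is a minimal sketch of the input representation only, not of the network; the function name `dtnn_inputs` and the example geometry (a linear CO2-like molecule with bonds of about 1.16 Å) are assumptions for illustration.

```python
import math

# Nuclear charges for the elements mentioned in the text, plus hydrogen.
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8, "F": 9}


def dtnn_inputs(atoms):
    """Build the nuclear-charge vector and the interatomic distance matrix.

    atoms: list of (element, (x, y, z)) with coordinates in angstroms.
    Returns (charges, distances), where distances[i][j] is |r_i - r_j|.
    """
    charges = [ATOMIC_NUMBER[element] for element, _ in atoms]
    coords = [xyz for _, xyz in atoms]
    distances = [[math.dist(a, b) for b in coords] for a in coords]
    return charges, distances


# Approximate linear geometry, used here only as an example input.
molecule = [("O", (-1.16, 0.0, 0.0)), ("C", (0.0, 0.0, 0.0)), ("O", (1.16, 0.0, 0.0))]
charges, distances = dtnn_inputs(molecule)
```

Because the distance matrix does not change when the molecule is rotated or translated, a model fed with it does not need to learn those invariances from data.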
In 2017, Smith and colleagues built ANI-1, a transferable neural network potential. It was trained on 56K small molecules with up to 8 heavy atoms (C, N, O) from the GDB-11 database, for which DFT energies were computed.

2.8.2.Conformational Exploration

The search for equilibrium conformers of molecules was performed by Gebauer et al., who used an autoregressive convolutional deep neural network architecture adapted from SchNet. They generated molecule structures in 3D with a root.

2.8.3.Partial Charges

Bleiziffer et al. used an ML approach for predicting partial charges for organic molecules for use in classical molecular dynamics simulations and as descriptors in QSAR/QSPR models.