Artificial Intelligence in Drug Discovery

‘What drug should I make next?’ and ‘How can I make it?’

Historically, drugs were discovered either by chance (serendipity) or through traditional remedies. More recently, this has evolved into the process of modern drug discovery and development. Modern drug discovery studies are funded with millions of dollars and involve extensive screening of potential drug candidates before they are brought to market.

Much of the body's machinery is built from proteins, and nearly all of its biochemical processes rely on enzymes (proteins) to function. Disease-causing pathogens also depend on various enzymes to function and, unfortunately, some of those proteins are also present in the human body. To discover a new drug, researchers begin by carefully identifying a target protein that helps the pathogen survive and cause disease. If the target protein is also used by the human body, blocking it could cause side effects or even prove fatal. Once the target is identified, researchers need to find a compound (a potential drug candidate) that can alter it.

Research groups examine thousands of compounds, and a ‘lead’ compound is selected after extensive screening, bioavailability and toxicity studies. If the lead compound looks promising, preclinical studies test the new drug candidate in non-human subjects to assess its safety and efficacy.

Once preclinical research is complete, researchers move on to clinical drug development, which includes clinical trials and volunteer studies to fine-tune the drug for human use. Once the new drug has been formulated for its best efficacy and safety, and the results from the Phase III clinical trials are available, it is submitted for FDA review. After approval and market launch, the FDA requires drug companies to keep monitoring the safety of their drugs through “Post-Marketing Surveillance”. Over 200 drugs have been withdrawn from the market by the FDA due to safety concerns.

Drug development is tough simply because biology is complex: we are dealing with a classic multi-variable optimisation problem. Traditional methods of drug development have been struggling on multiple fronts. Some of the biggest concerns are:

A) Lengthy, complex, and unpredictable processes

Even simple antibiotics that are discovered after complex evaluation can become ineffective over time because of drug resistance, so the challenge of developing new, more effective drugs keeps growing. The increasing number of variables and data points has made it hard for researchers to keep up the pace of drug development, which has traditionally taken 8–12 years per drug.

B) Ever-increasing cost of R&D

Developing a new prescription medicine that gains marketing approval is estimated to cost drug makers about $2.6 billion, according to a 2016 study published in the Journal of Health Economics. This is up from the $802 million estimated in 2003, which equals approximately $1 billion in 2013 dollars, and is thus a roughly 145 percent inflation-adjusted increase between the two studies.
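The 145 percent figure only works out with unrounded estimates; with the rounded $2.6 billion and $1 billion, the increase would look closer to 160 percent. A quick check, assuming the unrounded values commonly quoted for this study ($2,558 million and $1,044 million, both in 2013 dollars; neither number is stated in the text above):

```python
# Assumed unrounded estimates, both in millions of 2013 US dollars.
earlier_estimate = 1044   # the 2003 estimate of $802M, adjusted for inflation
later_estimate = 2558     # the later estimate, rounded to $2.6B in the text

# Percentage increase relative to the earlier, inflation-adjusted estimate.
increase_pct = (later_estimate - earlier_estimate) / earlier_estimate * 100
print(f"{increase_pct:.0f}% increase")
```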

C) Existence of human bias

Researchers, no matter how hard they try, are often limited by their personal preferences and biases. As a result, they may chase compounds and proteins based on their past experience, which does not guarantee success.

“Medicine is a science of uncertainty and an art of probability.” — William Osler

AI (Artificial Intelligence) is a versatile technology that can be applied at many stages of drug development. It can counter the inefficiencies and uncertainties of classical drug development methods and can also reduce the bias that comes with human intervention.

Here are some common themes and use cases where AI could be leveraged to produce faster, more efficient results in drug discovery:

1) AI in finding and validating molecular targets

Drug discovery starts with identifying a suitable target protein involved in the disease. Traditionally, the selection of a target protein was based on a researcher’s theory or hunch, which was often biased and overly restricted the pool of candidates. AI and natural language processing (NLP) have given researchers tools to scan vast volumes of medical literature, biochemical attributes and genetic datasets to help identify new biological targets.
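As a rough illustration of the literature-scanning idea, the sketch below counts how often a protein and a disease are mentioned in the same abstract. It is a toy under stated assumptions: the abstracts, the protein list and the disease list are all made up for the example, and plain substring matching stands in for the entity recognition a real NLP pipeline would use.

```python
from collections import Counter

# Hypothetical mini-corpus standing in for literature abstracts.
abstracts = [
    "EGFR mutations drive tumour growth in lung cancer.",
    "Inhibition of EGFR slows lung cancer progression.",
    "TNF signalling is elevated in rheumatoid arthritis.",
    "BRCA1 loss is linked to breast cancer risk.",
]
# Hypothetical dictionaries; real pipelines use curated ontologies.
proteins = {"EGFR", "TNF", "BRCA1"}
diseases = {"lung cancer", "breast cancer", "rheumatoid arthritis"}

def cooccurrences(texts, proteins, diseases):
    """Count how often each (protein, disease) pair shares an abstract."""
    counts = Counter()
    for text in texts:
        lowered = text.lower()
        for protein in proteins:
            # Naive substring match; a real system would tokenise first.
            if protein.lower() not in lowered:
                continue
            for disease in diseases:
                if disease in lowered:
                    counts[(protein, disease)] += 1
    return counts

pair_counts = cooccurrences(abstracts, proteins, diseases)
for (protein, disease), n in pair_counts.most_common():
    print(f"{protein} <-> {disease}: {n}")
```

Pairs that co-occur often are only candidates for a closer look; co-occurrence alone says nothing about causation.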

New-age organisations are collecting and experimenting with tissue samples taken from large populations. The resulting data is run through deep-learning programs that highlight specific proteins that could play a role in many diseases. In some cases, these proteins end up becoming the target sites for drug candidates.

2) AI in finding the hit or lead

Once the target proteins are identified, researchers need to find compounds that could alter them. The first compound that shows activity against a given biological target is called a ‘hit’, whereas a ‘lead’ is a compound that shows enough promise to be developed into a new drug.

During lead generation, hit molecules are systematically modified to improve their activity and selectivity towards specific biological targets while reducing toxicity and side effects. Companies have been using AI-driven predictive models to identify and prioritise target proteins and map them to potential lead compounds. AI systems can reduce attrition rates and R&D expenditure by efficiently mapping lead compound libraries to data on their potency, selectivity and binding affinity.
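To make the prioritisation step concrete, here is a minimal sketch that ranks hits by a weighted score rewarding potency and selectivity and penalising toxicity. Everything in it is an assumption for illustration: the compound names, the 0 to 1 predicted values (which in practice would come from trained models) and the weights are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Compound:
    name: str
    potency: float      # predicted activity against the target, 0..1
    selectivity: float  # predicted preference for the target, 0..1
    toxicity: float     # predicted toxicity risk, 0..1 (lower is better)

def score(c, w_potency=0.5, w_selectivity=0.3, w_toxicity=0.2):
    """Weighted sum: reward potency and selectivity, penalise toxicity."""
    return (w_potency * c.potency
            + w_selectivity * c.selectivity
            - w_toxicity * c.toxicity)

# Hypothetical hits with made-up model predictions.
hits = [
    Compound("cmpd-A", potency=0.90, selectivity=0.40, toxicity=0.70),
    Compound("cmpd-B", potency=0.75, selectivity=0.80, toxicity=0.20),
    Compound("cmpd-C", potency=0.60, selectivity=0.90, toxicity=0.10),
]
ranked = sorted(hits, key=score, reverse=True)
for c in ranked:
    print(f"{c.name}: {score(c):.3f}")
```

Note that the most potent compound (cmpd-A) ranks last here because of its poor selectivity and high toxicity, which is the kind of trade-off the paragraph above describes.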

3) AI in Retrosynthesis of Target Compounds

After the lead molecule is identified, chemists need to find an effective route for synthesising it. They use retrosynthesis to identify the chemical reactions that best lead to the formation of such molecules. The first step in the retrosynthetic approach is to analyse the lead molecule and sequentially break it into smaller fragments or building blocks. The second step is to identify the reactions that will convert these fragments into the target compound. Neural networks and AI models can help researchers work through the vast number of relevant organic reactions available in the literature to predict retrosynthesis routes and assess how readily a candidate can be synthesised.
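The two steps can be sketched as a recursive search: break the target into precursors using known disconnection rules, and stop when every fragment is a purchasable building block. This is a toy under loud assumptions: the molecule names are placeholders rather than real chemistry, the rule table is hand-written, and a plain depth-first search stands in for the learned models that AI-based tools would use to propose and rank disconnections.

```python
# Hypothetical disconnection rules: product -> alternative precursor sets.
# Names are placeholders, not real chemistry.
TEMPLATES = {
    "lead": [("fragment_1", "fragment_2")],
    "fragment_1": [("block_a", "block_b")],
    "fragment_2": [("block_c",), ("block_x", "block_y")],
}
# Hypothetical catalogue of purchasable building blocks.
PURCHASABLE = {"block_a", "block_b", "block_c"}

def find_route(target, depth=5):
    """Return a list of (product, precursors) steps, or None if no route."""
    if target in PURCHASABLE:
        return []                      # nothing to make
    if depth == 0:
        return None                    # give up on very long routes
    for precursors in TEMPLATES.get(target, []):
        steps = []
        for p in precursors:
            sub = find_route(p, depth - 1)
            if sub is None:
                break                  # this disconnection is a dead end
            steps.extend(sub)
        else:
            # Every precursor is reachable: append the final step.
            return steps + [(target, precursors)]
    return None

route = find_route("lead")
for product, precursors in route:
    print(" + ".join(precursors), "->", product)
```

The steps come out in forward (synthesis) order, ending with the reaction that forms the lead.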

4) Predicting the Mode-of-Action of compounds using AI

Once the target site, lead molecule and reaction mechanism are worked out, it is important to predict the on- and off-target effects and the safety profile (toxicity levels) of the compounds to be synthesised. Companies have been using deep-learning algorithms to study curated datasets on the toxicity, chemical and therapeutic profiles of a wide range of compounds, and then using what is learned to predict the efficacy and toxicity of the drug under development.
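The underlying intuition, that structurally similar compounds tend to behave similarly, can be shown without any deep learning. The sketch below swaps in a much simpler technique, a similarity-weighted nearest-neighbour vote using Tanimoto similarity, purely for illustration. The fingerprints (sets of feature indices) and toxicity labels are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of feature bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical training data: (fingerprint bits, 1 = toxic, 0 = non-toxic).
known = [
    ({1, 4, 7, 9}, 1),
    ({1, 4, 8}, 1),
    ({2, 3, 5}, 0),
    ({2, 5, 6, 10}, 0),
]

def predict_toxicity(query, known, k=3):
    """Similarity-weighted vote over the k most similar known compounds."""
    neighbours = sorted(known, key=lambda kv: tanimoto(query, kv[0]),
                        reverse=True)[:k]
    total = sum(tanimoto(query, fp) for fp, _ in neighbours)
    if total == 0:
        return 0.5  # no similar compounds: no evidence either way
    weighted = sum(tanimoto(query, fp) * label for fp, label in neighbours)
    return weighted / total

risk = predict_toxicity({1, 4, 9}, known)
print(f"estimated toxicity risk: {risk:.2f}")
```

A production model would learn from far richer features, but the input and output have the same shape: a compound representation in, a risk estimate out.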

5) AI in Selection and Monitoring of a population for Clinical Trials

Once a drug is ready for testing, patient selection is an important starting point in the clinical trial process. Many clinical trials fail because of poor recruitment and selection techniques, as well as an inability to monitor patients effectively. Cutting-edge AI tools can identify and predict relevant disease biomarkers, which helps improve patient selection by reducing population heterogeneity.

AI also helps in choosing patients who are more likely to reach a measurable clinical endpoint and in identifying a population more likely to respond to treatment. Several new-age companies have built workflows that automatically screen EHRs (Electronic Health Records) against clinical trial eligibility databases, match specific patients to trials they qualify for, and recommend these matches to doctors and patients.
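At its simplest, the matching step is a filter over patient records. The sketch below is rule-based rather than AI, and every record, field name and criterion in it is hypothetical; in a real workflow the hard part is extracting such structured fields and criteria from free-text records and protocols.

```python
# Hypothetical, simplified patient records (stand-ins for EHR extracts).
patients = [
    {"id": "P1", "age": 54, "diagnosis": "type 2 diabetes", "hba1c": 8.1},
    {"id": "P2", "age": 71, "diagnosis": "type 2 diabetes", "hba1c": 7.2},
    {"id": "P3", "age": 45, "diagnosis": "hypertension", "hba1c": 5.6},
]
# Hypothetical trial with inclusion criteria expressed as predicates.
trial = {
    "name": "TRIAL-001",
    "criteria": [
        lambda p: p["diagnosis"] == "type 2 diabetes",
        lambda p: 18 <= p["age"] <= 65,
        lambda p: p["hba1c"] >= 7.5,
    ],
}

def eligible(patient, trial):
    """A patient qualifies only if every inclusion criterion holds."""
    return all(check(patient) for check in trial["criteria"])

matches = [p["id"] for p in patients if eligible(p, trial)]
print(trial["name"], "candidates:", matches)
```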

6) AI in Drug Repurposing

Biopharmaceutical companies like to repurpose previously known drugs or late-stage drug candidates for new therapeutic areas, as this carries a lower risk of toxicity and likely lower R&D expenditure. Repurposed drugs can often go directly to Phase II trials because their safety profiles are largely known. Companies can therefore use neural network-based algorithms to scan large clinical data resources on existing drugs to find novel drug candidates and biomarkers predictive of diseases.

7) AI in Polypharmacology

While “one disease, one target” has been the dominant paradigm in drug discovery for years, it is becoming clear that many diseases are too complex to be treated effectively within it. Complex diseases like cancer, obesity and depression need a ‘one disease, multiple targets’ approach, and sometimes multiple diseases share a single target or cause. So it is essential to map the network of signatures of a disease to drug signatures. Owing to the enormous amount of data and correlations to be processed, it is extremely difficult for humans alone to design these drugs. AI tools can be used to probe the complex and vast literature when designing polypharmacological agents.
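One simple way to picture signature mapping is to score each drug by how many of the disease's gene-level changes it pushes in the opposite direction. The sketch below assumes signatures are reduced to +1/-1 per gene; the genes, drugs and values are hypothetical, and the scoring rule is an illustrative choice rather than a method described in this article.

```python
# Hypothetical signatures: gene -> +1 (up-regulated) or -1 (down-regulated).
disease = {"G1": +1, "G2": +1, "G3": -1, "G4": -1}
drugs = {
    "drug_X": {"G1": -1, "G2": -1, "G3": +1},   # reverses three genes
    "drug_Y": {"G1": +1, "G4": -1},             # mimics the disease
    "drug_Z": {"G2": -1, "G5": +1},             # reverses one gene
}

def reversal_score(disease_sig, drug_sig):
    """Fraction of disease genes the drug pushes in the opposite direction,
    minus the fraction it pushes in the same direction."""
    shared = disease_sig.keys() & drug_sig.keys()
    net = sum(-disease_sig[g] * drug_sig[g] for g in shared)
    return net / len(disease_sig)

scores = {name: reversal_score(disease, sig) for name, sig in drugs.items()}
for name, s in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {s:+.2f}")
```

A drug that acts on several of the disease's genes at once (drug_X here) scores highest, which is the multi-target behaviour polypharmacology is after.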

The absence of clear regulations, the poor quality of medical data and a lack of trust in machines have proved to be major bottlenecks in rapidly scaling the use of AI in drug discovery. That said, all pharma players share the common goal of saving patients' lives, and there have been early signs of AI playing a pivotal role in delivering on this shared mission. The ecosystem is optimistic that AI can make the drug development process faster, cheaper and less prone to human bias.

About the author: I am currently part of the investment team at Speciale Invest. I previously worked with Flipkart and hold an engineering degree from BITS Pilani. I can be reached at Anirudh.garg@specialeinvest.com and https://www.linkedin.com/in/anirudh-garg98/

About Speciale Invest: Speciale Invest is a deep science and technology venture firm investing across enterprise software (AR/VR, Cloud, Voice AI, Vision AI, Computer Vision) and industrial hardware (propulsion tech, robotics, rocket engines, lithium tech, micro-electronics, photonics). We are typically the first institutional investors and have been early pioneers of AI SaaS, Electric Mobility, Space Tech, Vision-based Robotics and Photonics.
