Accelerating scientific breakthroughs with an AI co-scientist

February 19, 2025

Juraj Gottweis, Google Fellow, and Vivek Natarajan, Research Lead

We introduce AI co-scientist, a multi-agent AI system built with Gemini 2.0 as a virtual scientific collaborator to help scientists generate novel hypotheses and research proposals, and to accelerate the clock speed of scientific and biomedical discoveries.

In the pursuit of scientific advances, researchers combine ingenuity and creativity with insight and expertise grounded in literature to generate novel and viable research directions and to guide the exploration that follows. In many fields, this presents a breadth and depth conundrum, since it is challenging to navigate the rapid growth in the rate of scientific publications while integrating insights from unfamiliar domains. Yet overcoming such challenges is critical, as evidenced by the many modern breakthroughs that have emerged from transdisciplinary endeavors. For example, Emmanuelle Charpentier and Jennifer Doudna won the 2020 Nobel Prize in Chemistry for their work on CRISPR, which combined expertise ranging from microbiology to genetics to molecular biology.

Motivated by unmet needs in the modern scientific discovery process and building on recent AI advances, including the ability to synthesize across complex subjects and to perform long-term planning and reasoning, we developed an AI co-scientist system. The AI co-scientist is a multi-agent AI system that is intended to function as a collaborative tool for scientists. Built on Gemini 2.0, AI co-scientist is designed to mirror the reasoning process underpinning the scientific method. Beyond standard literature review, summarization and “deep research” tools, the AI co-scientist system is intended to uncover new, original knowledge and to formulate demonstrably novel research hypotheses and proposals, building upon prior evidence and tailored to specific research objectives.

Empowering scientists and accelerating discoveries with the AI co-scientist

Given a scientist's research goal specified in natural language, the AI co-scientist is designed to generate novel research hypotheses, a detailed research overview, and experimental protocols. To do so, it uses a coalition of specialized agents (Generation, Reflection, Ranking, Evolution, Proximity and Meta-review) that are inspired by the scientific method itself. These agents use automated feedback to iteratively generate, evaluate, and refine hypotheses, resulting in a self-improving cycle of increasingly high-quality and novel outputs.
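
The generate, evaluate, and refine cycle described above can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions: `refine_loop`, `score`, and `evolve` are hypothetical names, plain integers stand in for hypotheses, and the post does not describe how the real agents are implemented.

```python
from typing import Callable, List

def refine_loop(
    seeds: List[int],
    score: Callable[[int], float],
    evolve: Callable[[int], int],
    rounds: int,
    keep: int,
) -> List[int]:
    """Toy generate/evaluate/refine loop; returns the final pool, best first."""
    pool = list(seeds)
    for _ in range(rounds):
        # Evaluate: rank the current pool, best first.
        pool.sort(key=score, reverse=True)
        survivors = pool[:keep]
        # Refine: derive one new candidate from each survivor and keep both.
        pool = survivors + [evolve(h) for h in survivors]
    return sorted(pool, key=score, reverse=True)

# Hypothetical stand-ins: a "hypothesis" is an integer, quality is closeness to 42.
score = lambda h: -abs(h - 42)
evolve = lambda h: h + 8

ranked = refine_loop(seeds=[0, 10, 20], score=score, evolve=evolve, rounds=3, keep=2)
print(ranked[0])  # best candidate after three rounds
```

In the system described in the post, scoring and refinement would be carried out by language-model agents rather than arithmetic; the sketch only shows the shape of the loop.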

The system is purpose-built for collaboration: scientists can interact with it in many ways, including by directly providing their own seed ideas for exploration or by giving feedback on generated outputs in natural language. The AI co-scientist also uses tools, such as web search and specialized AI models, to enhance the grounding and quality of generated hypotheses.

Illustration of the different components in the AI co-scientist multi-agent system and the interaction paradigm between the system and the scientist.

The AI co-scientist parses the assigned goal into a research plan configuration, managed by a Supervisor agent. The Supervisor agent assigns the specialized agents to the worker queue and allocates resources. This design enables the system to flexibly scale compute and to iteratively improve its scientific reasoning towards the specified research goal.
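
The Supervisor and worker-queue arrangement can be pictured with a small sketch. The agent names come from the post; the `Task` and `Supervisor` classes, the fixed round-robin schedule, and the FIFO queue are all assumptions made for illustration, not a description of the actual system.

```python
from collections import deque
from dataclasses import dataclass, field

# Agent names are taken from the post; the scheduling below is hypothetical.
AGENT_ORDER = ["Generation", "Reflection", "Ranking", "Evolution", "Proximity", "Meta-review"]

@dataclass
class Task:
    agent: str      # which specialized agent should handle the task
    round_no: int   # iteration of the refinement cycle the task belongs to

@dataclass
class Supervisor:
    research_goal: str
    queue: deque = field(default_factory=deque)

    def plan(self, rounds: int) -> None:
        """Enqueue one task per agent for each round (an assumed, simplistic schedule)."""
        for r in range(rounds):
            for agent in AGENT_ORDER:
                self.queue.append(Task(agent, r))

    def run(self) -> list:
        """Drain the queue in FIFO order and return a log of what ran."""
        log = []
        while self.queue:
            task = self.queue.popleft()
            log.append(f"round {task.round_no}: {task.agent}")
        return log

supervisor = Supervisor("explain how cf-PICIs exist across bacterial species")
supervisor.plan(rounds=2)
log = supervisor.run()
print(len(log), log[0], log[-1])
```

Scaling compute in this picture would amount to planning more rounds or running more workers against the same queue.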

AI co-scientist system overview. Specialized agents (red boxes, with unique roles and logic); scientist input and feedback (blue boxes); system information flow (dark gray arrows); inter-agent feedback (red arrows within the agent section).

Scaling test-time compute for advanced scientific reasoning

The AI co-scientist leverages test-time compute scaling to iteratively reason, evolve, and improve outputs. Key reasoning steps include self-play–based scientific debate for novel hypothesis generation, ranking tournaments for hypothesis comparison, and an "evolution" process for quality improvement. The system's agentic nature facilitates recursive self-critique, including tool use for feedback to refine hypotheses and proposals.

The system's self-improvement relies on the Elo auto-evaluation metric derived from its tournaments. Because this metric plays a core role, we assessed whether higher Elo ratings correlate with higher output quality. We analyzed the concordance between Elo auto-ratings and accuracy on the GPQA benchmark's diamond set of challenging questions, and we found that higher Elo ratings positively correlate with a higher probability of correct answers.
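
To make the tournament-derived Elo rating concrete, here is a minimal sketch of a pairwise rating update. The post does not give the formula or constants it uses; this assumes the standard Elo update with K = 32 and a starting rating of 1200, and `judge` is a hypothetical stand-in for whatever compares two hypotheses.

```python
from itertools import combinations

def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def update(r_a: float, r_b: float, a_wins: bool, k: float = 32.0):
    """Return the new (A, B) ratings after one comparison."""
    e_a = expected_score(r_a, r_b)
    s_a = 1.0 if a_wins else 0.0
    # B's change is the mirror image of A's, so total rating is conserved.
    return r_a + k * (s_a - e_a), r_b - k * (s_a - e_a)

def tournament(names, judge, start: float = 1200.0):
    """Round-robin: every pair of hypotheses is compared once."""
    ratings = {n: start for n in names}
    for a, b in combinations(names, 2):
        ratings[a], ratings[b] = update(ratings[a], ratings[b], judge(a, b))
    return ratings

# Hypothetical judge: the alphabetically earlier name always wins.
ratings = tournament(["h1", "h2", "h3"], judge=lambda a, b: a < b)
print(sorted(ratings, key=ratings.get, reverse=True))
```

As the figure caption in the post notes, a rating produced this way is an auto-evaluation: it reflects the judge's preferences, not an independent ground truth.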

Average accuracy of the AI co-scientist (blue line) and reference Gemini 2.0 (red line) responses on GPQA diamond questions, grouped by Elo rating. The Elo is an auto-evaluation and is not based on an independent ground truth.

Seven domain experts curated 15 open research goals and best-guess solutions in their fields of expertise. Using the automated Elo metric, we observed that the AI co-scientist outperformed other state-of-the-art agentic and reasoning models on these complex problems. The analysis reproduced the benefits of scaling test-time compute using inductive biases derived from the scientific method. As the system spends more time reasoning and improving, the self-rated quality of its results improves and surpasses that of other models and unassisted human experts.

Performance of the AI co-scientist improves as the system spends more time in computation. This can be seen in the automated Elo metric gradually improving over other baselines. Top: Elo progression of the best rated hypothesis. Bottom: Elo progression of the average of top-10 hypotheses.

On a smaller subset of 11 research goals, experts assessed the novelty and impact of the AI co-scientist–generated results compared to other relevant baselines; they also provided an overall preference. While the sample size was small, experts assessed the AI co-scientist to have higher potential for novelty and impact, and preferred its outputs over those of other models. Further, these human expert preferences also appeared to be concordant with the previously introduced Elo auto-evaluation metric.

Human experts assessed the AI co-scientist results to have higher potential for novelty and impact (left) and preferred them over the outputs of other models (right).

Validation of novel AI co-scientist hypotheses with real-world laboratory experiments

To assess the practical utility of the system's novel predictions, we evaluated end-to-end laboratory experiments probing the AI co-scientist–generated hypotheses and research proposals in three key biomedical applications: drug repurposing, proposing novel treatment targets, and elucidating the mechanisms underlying antimicrobial resistance. These settings all involved expert-in-the-loop guidance and spanned an array of complexities.

Drug repurposing for acute myeloid leukaemia

Drug development is an increasingly time-consuming and expensive process in which new therapeutics require many aspects of the discovery and development process to be restarted for each indication or disease. Drug repurposing addresses this challenge by discovering new therapeutic applications for existing drugs beyond their original intended use. But, due to the complexity of the task, it demands extensive interdisciplinary expertise.

We applied the AI co-scientist to assist with the prediction of drug repurposing opportunities and, with our partners, validated predictions through computational biology, expert clinician feedback, and in vitro experiments.

Notably, the AI co-scientist proposed novel repurposing candidates for acute myeloid leukemia (AML). Subsequent experiments validated these proposals, confirming that the suggested drugs inhibit tumor viability at clinically relevant concentrations in multiple AML cell lines.

Dose-response curves of one of the three novel AI co-scientist–predicted AML repurposing drugs. KIRA6 inhibits KG-1 (AML cell line) viability at clinically relevant concentrations. Being able to reduce cancer cell viability at lower drug concentrations is advantageous for multiple reasons; for example, it reduces the potential for off-target side effects.

Advancing target discovery for liver fibrosis

Identifying novel treatment targets is more complex than drug repurposing, and often leads to inefficient hypothesis selection and poor prioritization for in vitro and in vivo experiments. AI-assisted target discovery helps to streamline the process of experimental validation, potentially helping to reduce development time and costs.

We probed the AI co-scientist system's ability to propose, rank, and generate target discovery hypotheses and experimental protocols, focusing on liver fibrosis. The AI co-scientist demonstrated its potential by identifying epigenetic targets, grounded in preclinical evidence, with significant anti-fibrotic activity in human hepatic organoids (3D, multicellular tissue cultures derived from human cells and designed to mimic the structure and function of the human liver). These findings will be detailed in an upcoming report led by collaborators at Stanford University.

Comparison of treatments derived from AI co-scientist–suggested liver fibrosis targets versus a fibrosis inducer (negative control) and an inhibitor (positive control). All treatments suggested by AI co-scientist show promising activity (p-values for all suggested drugs are <0.01), including candidates that possibly reverse a disease phenotype. Results are detailed in an upcoming report from our Stanford University collaborators.

Explaining mechanisms of antimicrobial resistance

As a third validation, we focused on generating hypotheses to explain bacterial gene transfer evolution mechanisms related to antimicrobial resistance (AMR) — microbes' evolved mechanisms to resist infection-treating drugs. This is another complex challenge that involves understanding the molecular mechanisms of gene transfer (conjugation, transduction, and transformation) alongside the ecological and evolutionary pressures that drive AMR genes to spread.

For this test, expert researchers instructed the AI co-scientist to explore a topic that had already been subject to novel discovery in their group, but had not yet been revealed in the public domain: explaining how capsid-forming phage-inducible chromosomal islands (cf-PICIs) exist across multiple bacterial species. The AI co-scientist system independently proposed that cf-PICIs interact with diverse phage tails to expand their host range. This in silico discovery, which had been experimentally validated in the original laboratory experiments performed before the AI co-scientist system was used, is described in co-timed manuscripts (1, 2) with our collaborators at the Fleming Initiative and Imperial College London. This illustrates the value of the AI co-scientist system as an assistive technology, as it was able to leverage decades of research comprising all prior open access literature on this topic.

Timeline of AI co-scientist re-discovery of a novel gene transfer mechanism. Blue: Experimental research pipeline timeline for cf-PICI mobilization discovery. Red: AI co-scientist development and recapitulation of these key findings (without prior knowledge).

Limitations and outlook

In our report we address several limitations of the system and opportunities for improvement, including enhanced literature reviews, factuality checking, cross-checks with external tools, auto-evaluation techniques, and larger-scale evaluation involving more subject matter experts with varied research goals. The AI co-scientist represents a promising advance toward AI-assisted technologies for scientists to help accelerate discovery. Its ability to generate novel, testable hypotheses across diverse scientific and biomedical domains — some already validated experimentally — and its capacity for recursive self-improvement with increased compute, demonstrate its potential to accelerate scientists' efforts to address grand challenges in science and medicine. We look forward to responsible exploration of the potential of the AI co-scientist as an assistive tool for scientists. This project illustrates how collaborative and human-centred AI systems might be able to augment human ingenuity and accelerate scientific discovery.

Announcing Trusted Tester access to the AI co-scientist system

We are excited by the early promise of the AI co-scientist system and believe it is important to evaluate its strengths and limitations in science and biomedicine more broadly. To facilitate this responsibly we will be enabling access to the system for research organizations through a Trusted Tester Program. We encourage interested research organizations around the world to consider joining this program here.

Acknowledgements

The research described here is a joint effort between many Google Research, Google DeepMind and Google Cloud AI teams. We thank our co-authors at Fleming Initiative and Imperial College London, Houston Methodist Hospital, Sequome, and Stanford University: José R Penadés, Tiago R D Costa, Vikram Dhillon, Eeshit Dhaval Vaishnav, Byron Lee, Jacob Blum and Gary Peltz. We appreciate Subhashini Venugopalan and Yun Liu for their detailed feedback on the manuscripts described here. We are also grateful to the many incredible scientists across institutions providing detailed technical and expert feedback; please refer to our report to see the voices and minds that aided this work. We also thank our teammates Resham Parikh, Taylor Goddu, Siyi Kou, Rachelle Sico, Amanda Ferber, Cat Kozlowski, Alison Lentz, KK Walker, Roma Ruparel, Jenn Sturgeon, Lauren Winer, Juanita Bawagan, Tori Milner, MK Blake and Kalyan Pamarthy for their support. Finally, we also thank John Platt, Michael Brenner, Zoubin Ghahramani, Dale Webster, Joelle Barral, Michael Howell, Susan Thomas, Jason Freidenfelds, Karen DeSalvo, Vladimir Vuskovic, Greg Corrado, Ronit Levavi Morad, Ali Eslami, Anna Koivuniemi, Royal Hansen, Andy Berndt, Noam Shazeer, Oriol Vinyals, Burak Gokturk, Amin Vahdat, Katherine Chou, Avinatan Hassidim, Koray Kavukcuoglu, Pushmeet Kohli, Yossi Matias, James Manyika, Jeff Dean and Demis Hassabis for their support.
