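I've put together a short list of papers on large models for single-cell data, partly so I can browse them on my phone.
Bookmark it if you find it useful, and additions are welcome!
For more content, follow the WeChat official account: 组学之心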
1.iSEEEK
[Briefings in Bioinformatics]
https://doi.org/10.1093/bib/bbab573
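Code: https://github.com/lixiangchun/iSEEEK
Data: scRNA-seq, 11.9 million cells
Downstream tasks: gene regulatory networks, diffusion pseudotime, cell clustering, biomarker identification for cell subtypes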
2.scBERT
[Nature Machine Intelligence]
https://doi.org/10.1038/s42256-022-00534-z
Code: https://github.com/TencentAILabHealthcare/scBERT
Data: scRNA-seq, 1 million cells
Downstream tasks: cell type annotation, novel cell subtype discovery
3.BIOFORMERS
[bioRxiv]
https://doi.org/10.1101/2023.11.29.569320
Code: https://github.com/BiostateAI/Bioformers-BERT
Data: scRNA-seq
Downstream tasks: gene regulatory networks, cell clustering, molecular expression modeling
4.TOSICA
[Nature Communications]
https://doi.org/10.1038/s41467-023-35923-4
Code: https://github.com/JackieHanLab/TOSICA
Data: scRNA-seq
Downstream tasks: cell type annotation, novel cell type discovery, model-based data mining
5.scPoli
[Nature Methods]
https://doi.org/10.1038/s41592-023-02035-2
Code: http://github.com/theislab/scPoli_reproduce
Data: scRNA-seq, 7.8 million cells
Downstream tasks: data integration, cell type annotation
6.CellPolaris
[bioRxiv]
https://doi.org/10.1101/2023.09.25.559244
Code: https://github.com/xCompass-AI/CellPolaris
Data: scRNA-seq, RNA-seq, ATAC-seq
Downstream tasks: gene regulatory networks
7.scInterpreter
[BMC Bioinformatics]
https://doi.org/10.1186/s12859-023-05579-4
Code: none
Data: scRNA-seq
Downstream tasks: cell type annotation
8.scHyena
[arXiv]
https://doi.org/10.1186/s12859-023-05579-4
Code: https://github.com/scHyena2023/scHyena
Data: scRNA-seq
Downstream tasks: cell type classification, scRNA-seq imputation
9.SCimilarity
[bioRxiv]
https://doi.org/10.1101/2023.07.18.549537
Code: https://github.com/Genentech/scimilarity
Data: scRNA-seq, 22.7 million cells
Downstream tasks: querying cell types and cell states against the integrated cell corpus
10.scTransSort
[Biomolecules]
https://www.mdpi.com/2218-273X/13/4/611
Code: https://github.com/jiaojiao-123/scTransSort
Data: scRNA-seq
Downstream tasks: cell type annotation
11.scCLIP
[OpenReview.net]
https://openreview.net/forum?id=KMtM5ZHxct
Code: https://github.com/jsxlei/scCLIP
Data: scRNA-seq, scATAC-seq, 377k cells
Downstream tasks: multimodal embedding
12.Cell2Sentence
[OpenReview.net]
https://openreview.net/forum?id=EWt5wsEdvc
Code: https://github.com/vandijklab/cell2sentence-ft
Data: scRNA-seq, 40k cells
Downstream tasks: cell type annotation
13.scTranslator
[bioRxiv]
https://www.biorxiv.org/content/10.1101/2023.07.04.547619v2.full
Code: https://github.com/TencentAILabHealthcare/sctranslator
Data: scRNA-seq, CITE-seq, bulk RNA-seq
Downstream tasks: cross-modality prediction, gene regulatory networks, cell clustering
14.scELMo
[bioRxiv]
https://doi.org/10.1101/2023.12.07.569910
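Code: https://github.com/HelloWorldLTY/scELMo
Data: scRNA-seq, CITE-seq
Downstream tasks: cell type annotation, genetic perturbation effect prediction, cell and gene embeddings for use in other perturbation models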
15.UCE
[bioRxiv]
https://doi.org/10.1101/2023.11.28.568918
Code: https://github.com/snap-stanford/uce
Data: scRNA-seq, 36 million cells
Downstream tasks: cell type annotation, novel cell type discovery
16.seq2cells
[bioRxiv]
https://doi.org/10.1101/2023.07.26.550634
Code: https://github.com/lucidrains/enformer-pytorch (for the seq2emb module)
Data: scRNA-seq
Downstream tasks: gene expression prediction from DNA sequence, variant effect prediction at the cell level
17.tGPT
[iScience]
https://www.cell.com/iscience/pdf/S2589-0042(23)00613-2.pdf
Code: https://github.com/deeplearningplus/tGPT
Data: scRNA-seq, 22 million cells
Downstream tasks: cell clustering, trajectory inference
18.CellLM
[arXiv]
https://arxiv.org/abs/2306.04371
Code: https://github.com/PharMolix/OpenBioMed
Data: scRNA-seq, 1.8 million cells
Downstream tasks: non-disease vs. cancer prediction, cell type annotation, drug response prediction
19.scMoFormer
[ACM]
https://dl.acm.org/doi/10.1145/3583780.3615061
Code: https://github.com/OmicsML/scMoFormer
Data: scRNA-seq, scATAC-seq, CITE-seq
Downstream tasks: cross-modality prediction
20.Geneformer
[Nature]
https://www.nature.com/articles/s41586-023-06139-9
Code: https://huggingface.co/ctheodoris/Geneformer
Data: scRNA-seq, 30 million cells
Downstream tasks: gene function prediction, cell annotation, cell clustering, gene regulatory network inference
21.MuSe-GNN
[arXiv]
https://arxiv.org/abs/2310.02275
Code: https://github.com/HelloWorldLTY/MuSe-GNN
Data: scRNA-seq, scATAC-seq, spatial data
Downstream tasks: gene representations that capture functional similarity across modalities, gene function prediction
22.CellPLM
[ICLR]
https://openreview.net/forum?id=BKXvPDekud
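Code: https://github.com/OmicsML/CellPLM
Data: scRNA-seq, spatial transcriptomics, 11 million cells
Downstream tasks: gene expression imputation, cell type annotation, genetic perturbation effect prediction, cell clustering, scRNA-seq denoising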
23.GeneCompass
[bioRxiv]
https://www.biorxiv.org/content/10.1101/2023.09.26.559542v1
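Code: https://github.com/xCompass-AI/GeneCompass
Data: scRNA-seq, 126 million cells
Downstream tasks: cell type annotation, drug response prediction, gene function prediction, cross-species integration, genetic perturbation effect prediction, gene regulatory network inference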
24.scMulan
[bioRxiv]
https://doi.org/10.1101/2024.01.25.577152
Code: https://github.com/SuperBianC/scMulan/tree/main
Data: scRNA-seq, 10 million cells
Downstream tasks: cell type annotation, cell metadata annotation (also used during training), data integration
25.GenePT
[bioRxiv]
https://doi.org/10.1101/2023.10.16.562533
Code: https://github.com/yiqunchen/GenePT
Data: scRNA-seq
Downstream tasks: gene function prediction, cell clustering, gene regulatory network inference
26.GENEPT
[bioRxiv]
https://doi.org/10.1101/2024.01.30.578115
Code: none
Data: scRNA-seq
Downstream tasks: named entity recognition (NER), cell-biomarker sentence classification
27.CELLama
[bioRxiv]
https://doi.org/10.1101/2024.05.08.593094
Code: https://github.com/portrai-io/CELLama
Data: scRNA-seq, spatial transcriptomics
Downstream tasks: cell type annotation
28.scGPT
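[Nature Methods]
https://www.nature.com/articles/s41592-024-02201-0
Code: https://github.com/bowang-lab/scGPT
Data: scRNA-seq, scATAC-seq, CITE-seq, spatial transcriptomics, 33 million cells
Downstream tasks: cell type annotation, genetic perturbation effect prediction, reverse perturbation prediction, cell clustering, multimodal embedding, gene function prediction, gene regulatory network inference, simulation, gene expression imputation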
29.GET
[bioRxiv]
https://doi.org/10.1101/2023.09.24.559168
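Code: https://huggingface.co/get-foundation
Data: scRNA-seq, scATAC-seq
Downstream tasks: zero-shot prediction of lentiMPRA readouts, identification of cell-type-specific regulatory elements and upstream transcription factors (TFs), TF-TF interactions and causal inference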
30.scFoundation
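[Nature Methods]
https://www.nature.com/articles/s41592-024-02305-7
Code: https://github.com/biomap-research/scFoundation
Data: scRNA-seq, 50 million cells
Downstream tasks: drug response prediction, genetic perturbation effect prediction, read-depth enhancement, cell clustering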
31.scPRINT
[bioRxiv]
https://doi.org/10.1101/2024.07.29.605556
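Code: https://github.com/cantinilab/scPRINT
Data: scRNA-seq, 50 million cells
Downstream tasks: cell label prediction (these supervised tasks are part of pretraining), read-depth enhancement, gene expression imputation, batch integration, cell clustering, gene regulatory network inference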
32.scMAE
[bioRxiv]
https://doi.org/10.1101/2024.02.13.580114
Code: none
Data: single-cell flow cytometry, 6.5 million cells
Downstream tasks: cell type annotation, protein expression imputation
33.SATURN
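[Nature Methods]
https://doi.org/10.1038/s41592-024-02191-z
Code: https://github.com/snap-stanford/saturn
Data: scRNA-seq, protein sequences
Downstream tasks: cross-species cell type annotation, identification of differentially expressed macrogenes, verification of cell type labels across species, identification of divergent proteins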
34.Nicheformer
[bioRxiv]
https://doi.org/10.1101/2024.04.15.589472
Code: https://github.com/theislab/nicheformer
Data: spatial transcriptomics, 110 million cells
Downstream tasks: spatial label prediction, spatial cell type composition
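35.LangCell
[arXiv]
https://arxiv.org/abs/2405.06708
Code: https://github.com/PharMolix/LangCell
Data: scRNA-seq, 27.5 million cells
Downstream tasks: cell identification
36.SpaFormer
[Advanced Science]
https://doi.org/10.48550/arXiv.2302.03038
Code: https://github.com/wehos/CellT
Data: spatial transcriptomics
Downstream tasks: gene expression imputation, cell clustering
37.scFormer
[OpenReview.net]
https://openreview.net/forum?id=7hdmA0qtr5
Code: https://github.com/bowang-lab/scFormer
Data: scRNA-seq
Downstream tasks: cell type annotation, genetic perturbation effect prediction, cell clustering
38.GPT-4
[Nature Methods]
https://doi.org/10.1038/s41592-024-02235-4
Code: https://github.com/Winnie09/GPTCelltype
Data: scRNA-seq
Downstream tasks: none (conditional sequence generation, prompting), cell type annotation
39.CellWhisperer
[OpenReview.net]
https://openreview.net/forum?id=yWiZaE4k3K
Code: none
Data: bulk RNA-seq, scRNA-seq
Downstream tasks: transcriptome-aware question answering, reference-free prediction of cell properties (cell type and state, disease state, organ of origin, ...)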