通常情况下,单细胞转录组拿到亚群后会进行更细致的分群,或者看不同样本不同组别的内部的细胞亚群的比例变化。
这就是个性化分析阶段,这个阶段取决于自己的单细胞转录组项目课题设计情况,我们了解到的各式各样的分析点,并不是通用的。比如如果要比较细胞亚群比例,就必须要有多个样本,如果是单个样本,其分析的内容及方法也会不尽相同,类似的问题以及涉及到的分析很多。
今天小编带大家来学习下“单细胞个性化分析之细胞亚群继续分群”,相信大多数单细胞转录组的小伙伴都能用到。
细胞亚群进一步划分原则
理论上细胞亚群是可以无限划分的,因为世界上没有两个一模一样的细胞,关键是要把握一个度,什么样的差异可以判定为不同细胞亚群,什么样的差异是可以容忍的细胞类群内部异质性。有一个策略就是找出主要因素和次要因素。主要因素划分为主要亚群,比如外周血里面的T,B细胞当然是不同亚群,但是T细胞里面还可以继续划分:CD4或者CD8的T细胞,甚至继续划分, 如下图所示:
亚群细化分析实操
加载R包
{
library(Seurat)
library(dplyr)
library(reticulate)
library(sctransform)
library(cowplot)
library(ggplot2)
library(viridis)
library(tidyr)
library(magrittr)
library(reshape2)
library(readxl)
library(stringr)
library(cowplot)
library(scales)
library(readr)
library(progeny)
library(gplots)
library(tibble)
library(grid)
library(rlang)
theme_set(theme_cowplot())
use_colors <- c(
Alveolar_Macrophages = "#6bAEd6",
`Monocyte-derived macrophages`= "#fff500",
Monocytes= "#FA9FB5",
`Myeloid dendritic cells`= "#DD3497",
Plasmacytoid_dendritic_cells= "#7A0177",
`Conventional_T_cells`= "#c2e699",
`Regulatory T cells`= "#006837",
`CD8+_T_cells`= "#bcbddc",
NK_cells= "#4a1486",
B_cells= "#969696",
Plasma_cells= "#636363")
}
imm_anno <- readRDS("imm_anno.RDS")
imm_anno@meta.data$cell_type_imm <-
ordered(
imm_anno@meta.data$cell_type_imm, levels = c("Alveolar_Macrophages",
"Monocyte-derived macrophages",
"Monocytes","Myeloid dendritic cells",
"Plasmacytoid_dendritic_cells",
"Conventional_T_cells",
"Regulatory T cells",
"CD8+_T_cells",
"NK_cells",
"B_cells",
"Plasma_cells")
)
DimPlot(imm_anno, reduction = 'umap', split.by = 'tissue_type', label = T, cols = use_colors)
##产生聚类点图
细化分析的开端是发现(感觉)某群细胞有多个亚群,如下面的图发现单核来源巨噬有多群。
对巨噬细胞分析M1,M2表型通路
### 实际上同一群细胞内部小亚群比较更有意思
###导入M1
m1m2_pws <- read_lines("./data/CLASSICAL_M1_VS_ALTERNATIVE_M2_MACROPHAGE_UP.gmt") %>%
lapply(str_split, "\\t") %>%
unlist(recursive = F) %>%
lapply(function(x) setNames(list(x[-c(1:2)]), x[1])) %>%
unlist(recursive = F)
### 导入M2
m1m2_pws <- append(m1m2_pws, read_lines("./data/CLASSICAL_M1_VS_ALTERNATIVE_M2_MACROPHAGE_DN.gmt") %>%
lapply(str_split, "\\t") %>%
unlist(recursive = F) %>%
lapply(function(x) setNames(list(x[-c(1:2)]), x[1])) %>%
unlist(recursive = F))
### 运用addmodulescore函数,加入富集分数到metadata中
imm_anno <- AddModuleScore(object = imm_anno, features = m1m2_pws, name = c("m1up", "m1dn"), nbin = 12)### 提出单核来源巨噬来分析macrophage(mdm)
scRNA_mdm <- subset(imm_anno, subset = cell_type_imm %in% c("Monocyte-derived macrophages"))
scRNA_mdm=ScaleData(scRNA_mdm)
### 重新降维聚类
scRNA_mdm <- RunPCA(scRNA_mdm)
ElbowPlot(scRNA_mdm, ndims = 50)
library(ggplot2)
scRNA_mdm<- RunUMAP(scRNA_mdm, dims = 1:20)
scRNA_mdm <- FindNeighbors(scRNA_mdm, dims = 1:20)
for (i in c(0.01,0.2, 0.3, 0.4, 0.5, 1)) {
scRNA_mdm <- FindClusters(scRNA_mdm, resolution = i)
print(DimPlot(scRNA_mdm, reduction = "umap") + labs(title = paste0("resolution: ", i)))
}
### cluster数自己决定,不要太多即可,否则相似的群多
table(scRNA_mdm$SCT_snn_res.0.2)
Idents(scRNA_mdm) <-scRNA_mdm@meta.data$SCT_snn_res.0.2
DimPlot(scRNA_mdm,reduction = 'umap',group.by = 'SCT_snn_res.0.2',label = T,split.by = 'tissue_type')
ggsave2(filename = 'Macrophage亚群.pdf',path = './result_3v3/analysis',width = 8,height = 4)
### 不同mdm小亚群比较通路
VlnPlot(scRNA_mdm, features = c("m1up1", "m1dn2"),
group.by = "SCT_snn_res.0.2", pt.size = 0,
cols = use_colors)
### 第四群有一个显著的M1活化
ggsave2(filename = 'M1M2巨噬细胞细分.pdf',path = './result_3v3/analysis/',width = 8,height =4)
### 单核细胞来源的巨噬细胞间具有较大异质性,所以考虑再分别注释
mdm_markers <- FindAllMarkers(scRNA_mdm, only.pos = T, min.pct = 0.25, min.diff.pct = 0.25)
# group_by和top_n组合
top_mdm_markers <- mdm_markers %>% group_by(cluster) %>% top_n(10, wt = avg_log2FC)
DoHeatmap(scRNA_mdm,features=top_mdm_markers$gene,group.by = 'SCT_snn_res.0.2')+scale_fill_viridis()
ggsave2(filename = '热图细分.pdf',path = './result_3v3/analysis/',width = 10,height =9)
#### 第4群CCL5+,其实还有CD8A+,大家认为,这是一群新的巨噬,还是由于细胞污染呢~
FeaturePlot(scRNA_mdm,features = 'CCL5',cols = viridis(10),label = T)
ggsave2(filename = 'CCL5_mdm.pdf',path = './result_3v3/analysis/',width = 5,height =4.5)
### 据此我们可以继续定义细胞亚群,例如cluster1为C1Q+的巨噬细胞!
亚群可视化
亚群细分好后,通常我们期望能直观展示分析结果,下面给大家展示2种常见的可视化方式。
某些文章里面会把主要和次要细胞亚群同一个tSNE图展现,实际上,细胞二维散点图,是没办法写全部细胞亚群的生物学功能定义的, 通常也是把主要细胞亚群标记上去。然后每个细胞亚群再进行继续分群。只不过是在同一个散点图上面展示(如下图所示)。
但上面这种展示形式展示的细胞数量太多,一般人很难看出来不同细胞亚群内部具体如何划分子亚群,以及不同亚群到底以什么标记基因来进行区分。有一个策略是,把标记基因的表达量在所有亚群及子亚群里面展现,以热图形式,如下图: