SCS [39] Single-Cell Transcriptomics: Polishing Dimensional Reduction Scatter Plots (SCpubr)

Latest recommended article published on 2024-08-24 10:06:53

桓峰基因

Article link: https://blog.csdn.net/weixin_41368414/article/details/135812833

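When working with the Seurat package, we often find that its dimensional reduction scatter plots cannot be fine-tuned enough to look the way we want. Today we introduce SCpubr, an R package whose plotting parameters can be adjusted to reach the result we have in mind. Below we walk through the parameters that control dimensional reduction scatter plots.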

Introduction

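Single-cell transcriptomics has become a widely used technique for understanding the heterogeneity of samples at the transcriptomic level. As a result, a large number of analysis tools have been released that cover the different analysis steps, from count matrix generation to downstream analysis. Many of them provide ways to generate data visualizations. A common design choice is to give the user a basic plot that can then be customized as needed. In many cases, however, this final customization step is either time-consuming or requires a very specific set of skills. SCpubr addresses this problem: it gives up some of the initial freedom in aesthetic choices in exchange for a more streamlined way of generating high-quality single-cell transcriptomics visualizations.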

Package installation

There are two ways to install the package: install.packages() and devtools::install_github().

# From CRAN - Official release:
install.packages("SCpubr")

# From GitHub - Latest stable development version:
if(!requireNamespace("devtools", quietly = TRUE)){
  install.packages("devtools") # If not installed.
}

devtools::install_github("enblacar/SCpubr", ref = "v2.0.0-dev-stable")
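
Before moving on, it can help to confirm that the installation worked. The following is a minimal sketch using only base R and utils; the helper name check_pkg is hypothetical and is not part of SCpubr. It runs whether or not the packages are installed and simply reports what it finds.

```r
# Hypothetical helper (not part of SCpubr): report whether a package is
# installed and, if so, which version.
check_pkg <- function(pkg) {
  installed <- requireNamespace(pkg, quietly = TRUE)
  version <- if (installed) as.character(utils::packageVersion(pkg)) else NA_character_
  data.frame(package = pkg, installed = installed, version = version)
}

# One row per package used in this tutorial.
status <- do.call(rbind, lapply(c("SCpubr", "Seurat"), check_pkg))
print(status)
```

If installed is FALSE for either package, rerun the corresponding installation command above.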

Loading the data

Here we read in the raw pbmc dataset, which mainly requires the Seurat package:

# SCpubr::package_report(startup = TRUE, extended = TRUE)
options("SCpubr.ColorPaletteEnds" = FALSE)

# Load SCpubr once, without startup messages.
suppressPackageStartupMessages(library(SCpubr))

# Path to the count matrix.
counts_path <- "./filtered_gene_bc_matrices/hg19"

# Read the count matrix.
counts <- Seurat::Read10X(counts_path)

# Create Seurat object.
sample <- Seurat::CreateSeuratObject(counts = counts, project = "10K_pbmc")

Run quality control, normalization, dimensional reduction, and clustering on the single-cell data:

# Compute percentage of mitochondrial RNA.
sample <- Seurat::PercentageFeatureSet(sample, pattern = "^MT-", col.name = "percent.mt")

# Compute QC.
mask1 <- sample$nCount_RNA >= 1000
mask2 <- sample$nFeature_RNA >= 500
mask3 <- sample$percent.mt <= 20
mask <- mask1 & mask2 & mask3
sample <- sample[, mask]

# Normalize.
sample <- Seurat::SCTransform(sample)

# Dimensional reduction.
sample <- Seurat::RunPCA(sample)

sample <- Seurat::RunUMAP(sample, dims = 1:30)

# Find clusters.
sample <- Seurat::FindNeighbors(sample, dims = 1:30)

sample <- Seurat::FindClusters(sample, resolution = 0.2)
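
The QC step above combines three logical masks with &, so a cell is kept only if it passes every threshold. To see how many cells each filter removes, the masks can be summed individually. The sketch below uses a small made-up data frame (meta) as a stand-in for the object's per-cell metadata, so it runs without Seurat; the thresholds are the ones used in this tutorial.

```r
# Toy per-cell metadata standing in for the Seurat object's metadata
# (values are made up for illustration).
meta <- data.frame(
  nCount_RNA   = c(500, 1500, 3000, 8000, 1200, 900),
  nFeature_RNA = c(300,  700, 1200, 2500,  450, 600),
  percent.mt   = c(  5,   10,   25,    3,    8,  12)
)

# Same thresholds as in the tutorial.
mask1 <- meta$nCount_RNA >= 1000
mask2 <- meta$nFeature_RNA >= 500
mask3 <- meta$percent.mt <= 20
mask  <- mask1 & mask2 & mask3

# Number of cells passing each filter, and all filters combined.
qc_summary <- c(nCount = sum(mask1),
                nFeature = sum(mask2),
                percent.mt = sum(mask3),
                all = sum(mask))
print(qc_summary)
```

On the real object, the same sums can be computed on sample$nCount_RNA, sample$nFeature_RNA, and sample$percent.mt before subsetting.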

Dimensional reduction plots

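In single-cell experiments, dimensional reduction plots (DimPlots) are a highly recognizable type of visualization. They display the cells in a dimensional reduction embedding such as PCA or UMAP. Cells can be colored by any desired grouping, so any categorical variable defined on the cells can be visualized in the embedding.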

Basic usage

DimPlots can be generated in SCpubr with the SCpubr::do_DimPlot() function:

SCpubr::do_DimPlot(sample = sample)

Modifying axes behavior

Bring back the Axes.

By default, the axes and axis titles are removed from the plot. They can be brought back with plot.axes = TRUE:

SCpubr::do_DimPlot(sample = sample, plot.axes = TRUE)

Label the clusters

In some cases we may want to remove the legend entirely and instead draw a label on top of each cluster. This can be done with label = TRUE.

Put labels on top of the clusters.

SCpubr::do_DimPlot(sample, label = TRUE)

Labels as text

These labels are essentially the result of applying ggplot2::geom_label() to the plot. However, we may prefer plain text over boxed labels. This can be done by setting label.box = FALSE.

SCpubr::do_DimPlot(sample = sample, label = TRUE, label.box = FALSE)

Change the color of the label text.

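We can go further with other parameters of the function: label.color sets the color of the text, both for boxed labels and for plain text, and label.fill changes the background of the label box.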

p1 <- SCpubr::do_DimPlot(sample = sample, 
                         label = TRUE, 
                         label.color = "white",
                         label.fill = "black")

# Change the color of the text.
p2 <- SCpubr::do_DimPlot(sample = sample, 
                         label = TRUE, 
                         label.color = "black",
                         label.box = FALSE)
p <- p1 | p2
p

Change the size of the label text.

We can also change the size of the labels or text with label.size.

p1 <- SCpubr::do_DimPlot(sample = sample, 
                         label = TRUE, 
                         label.size = 6)
# Change the size of the text.
p2 <- SCpubr::do_DimPlot(sample = sample, 
                         label = TRUE, 
                         label.box = FALSE,
                         label.size = 6)
p <- p1 | p2
p

Changing the order of plotting

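By default, SCpubr::do_DimPlot() plots the cells in random order (shuffle = TRUE). This differs from the default behavior of Seurat::DimPlot(), which plots the cells according to the factor levels of their identities.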

p1 <- SCpubr::do_DimPlot(sample = sample,
                         reduction = "pca",
                         shuffle = TRUE)

p2 <- SCpubr::do_DimPlot(sample = sample,
                         reduction = "pca",
                         shuffle = FALSE)

p <- p1 | p2
p
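
The difference between the two settings comes down to drawing order: points drawn later sit on top of points drawn earlier. The sketch below illustrates the idea on made-up data with base R only. It is a conceptual illustration, not the internal implementation of SCpubr or Seurat.

```r
set.seed(42)  # Make the shuffle reproducible.

# Toy cells with a cluster identity (made-up data).
cells <- data.frame(
  cell    = paste0("cell_", 1:12),
  cluster = factor(rep(c("0", "1", "2"), each = 4))
)

# Ordered by identity level: every cell of cluster "2" is drawn last,
# so it would cover overlapping cells of clusters "0" and "1".
by_level <- cells[order(cells$cluster), ]

# Shuffled: drawing order is random, so no cluster is systematically on top.
shuffled <- cells[sample(nrow(cells)), ]

print(as.character(by_level$cluster))
print(as.character(shuffled$cluster))
```

Both data frames contain the same cells; only the row order, and therefore the order in which points would be drawn, differs.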

Highlighting cells

We can highlight a given set of cells in the plot. This is done with cells.highlight.

cells.use <- sample(x = colnames(sample), 
                    size = 1500)

p <- SCpubr::do_DimPlot(sample = sample,
                        cells.highlight = cells.use)

p
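
Because the highlighted cells are drawn at random, the selection changes on every run. Setting a seed first makes it reproducible. The sketch below uses a made-up vector of cell names (cell_names) in place of colnames(sample) so that it runs on its own. Writing base::sample() makes it explicit that the base R function is meant rather than the Seurat object that is also called sample in this tutorial; R resolves the plain call correctly as well, since it skips non-function objects when looking up a function.

```r
# Toy stand-in for colnames(sample): made-up cell names.
cell_names <- paste0("cell_", seq_len(5000))

# Same seed, same call: the two draws are identical.
set.seed(123)
cells.use <- base::sample(x = cell_names, size = 1500)

set.seed(123)
cells.use.again <- base::sample(x = cell_names, size = 1500)

identical(cells.use, cells.use.again)
```

The seed values here are arbitrary; any fixed integer works.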

Change color of highlighted and non-highlighted cells.

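The color of the highlighted cells can be changed by passing a single color to colors.use, and the color of the non-selected cells can be set with na.value: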

p <- SCpubr::do_DimPlot(sample = sample, 
                        cells.highlight = cells.use,
                        colors.use = "dodgerblue",
                        na.value = "grey90")
p

Increase the size of the highlighted cells.

The size of the highlighted points can be changed with the sizes.highlight parameter:

p <- SCpubr::do_DimPlot(sample = sample, 
                        cells.highlight = cells.use, 
                        sizes.highlight = 2)
p

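We can also highlight entire identities with idents.highlight. To do so, simply provide the identities to select. It can also be used in combination with cells.highlight.

# Using cells.highlight.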

p1 <- SCpubr::do_DimPlot(sample = sample, 
                         cells.highlight = cells.use)

# Using idents.highlight.

p2 <- SCpubr::do_DimPlot(sample = sample, 
                         idents.highlight = c("6"))

# Using both.

p3 <- SCpubr::do_DimPlot(sample = sample, 
                         cells.highlight = cells.use, 
                         idents.highlight = c("6"))

p <- p1 | p2 | p3
p

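Restrict the identities displayed

Sometimes we only want to display some of the identities or groups in the sample. Rather than highlighting cells, we still want to keep the original colors and legend. For this use case, one option is simply to subset the sample as follows: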

Subset desired identities in a DimPlot.

p <- SCpubr::do_DimPlot(sample = sample[, sample$seurat_clusters %in% c("0", "5", "2", "4")])

p

Select identities with idents.keep.

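However, subsetting loses the overall outline of the UMAP. For this use case, SCpubr::do_DimPlot() provides idents.keep, which takes a vector with the identities you want to keep. The remaining cells are assigned NA and colored according to the na.value parameter: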

p1 <- SCpubr::do_DimPlot(sample = sample,
                         idents.keep = c("0", "5", "2",  "4"))

# The color of the non-selected cells can also be modified.

p2 <- SCpubr::do_DimPlot(sample = sample,
                         idents.keep = c("0", "5", "2",  "4"),
                         na.value = "grey50")
p <- p1 | p2
p
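
The idea described above, keeping the selected identities and turning everything else into NA, can be sketched with base R on made-up identities. This only illustrates the concept and is not SCpubr's actual code; idents and keep are names chosen for this example.

```r
# Toy identities for eight cells (made-up data).
idents <- factor(c("0", "1", "2", "3", "4", "5", "6", "0"))
keep   <- c("0", "5", "2", "4")

# Cells outside `keep` become NA; the kept ones retain their identity.
kept <- factor(ifelse(idents %in% keep, as.character(idents), NA),
               levels = intersect(levels(idents), keep))

print(kept)
table(kept, useNA = "ifany")
```

Unlike subsetting, the vector keeps one entry per cell, which is why every cell can still be drawn and the outline of the embedding is preserved.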

Group by another metadata variable

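All DimPlots shown so far color the cells by the identities currently set in the object, which can be queried with Seurat::Idents(sample). Naturally, we may want to display a different metadata variable instead. This is easily done with group.by.

# Generate another metadata variable to group the cells by.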
sample$annotation <- sample(c("A", "B", "C"), ncol(sample), replace = TRUE)

# Group by another metadata variable.
p1 <- SCpubr::do_DimPlot(sample, 
                         group.by = "seurat_clusters")

p2 <- SCpubr::do_DimPlot(sample, 
                         group.by = "annotation")

p <- p1 | p2
p

Splitting by a category

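Another useful parameter is split.by, which splits the DimPlot into multiple panels, one for each unique value of the metadata variable you pass to it. This can be thought of as applying group.by and then splitting the resulting DimPlot into separate panels. In this example we use the clusters. By default, the result looks like this:

# SCpubr's DimPlot using split.by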
p <- SCpubr::do_DimPlot(sample, 
                        split.by = "seurat_clusters", 
                        ncol = 5, 
                        legend.position = "none",
                        font.size = 24)

p

Using split.by and restricting the number of output plots with idents.keep.

p <- SCpubr::do_DimPlot(sample, 
                        split.by = "seurat_clusters", 
                        ncol = 3, 
                        idents.keep = c("0", "1", "6"),
                        legend.position = "none",
                        font.size = 24)

p

Group by a variable but split by another

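Last but not least, users may want to split the UMAP with split.by while also coloring the cells by the values of another variable with group.by. Combining the two parameters gives the following result:

# Using split.by and group.by in combination.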
sample$orig.ident <- sample(c("A", "B", "C"), 
                            ncol(sample), 
                            replace = TRUE, 
                            prob = c(0.05, 0.1, 0.85))

p <- SCpubr::do_DimPlot(sample, 
                        group.by = "seurat_clusters",
                        split.by = "orig.ident", 
                        font.size = 24)

p
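
With prob = c(0.05, 0.1, 0.85), the three groups are deliberately unbalanced, so the panels produced by split.by contain very different numbers of cells. A quick way to check group sizes before plotting is to tabulate the variable. The sketch below reproduces the sampling on a made-up number of cells (n_cells stands in for ncol(sample)) so that it runs without a Seurat object.

```r
set.seed(1)  # Reproducible toy example.

n_cells <- 10000  # Stand-in for ncol(sample).
groups <- sample(c("A", "B", "C"),
                 n_cells,
                 replace = TRUE,
                 prob = c(0.05, 0.1, 0.85))

# Observed fraction of cells per group; should be close to `prob`.
observed <- prop.table(table(groups))
print(round(observed, 3))
```

On the real object, table(sample$orig.ident) gives the corresponding counts.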

Reference

Blanco-Carmona, E. Generating publication ready visualizations for Single Cell transcriptomics using SCpubr. bioRxiv (2022) doi:10.1101/2022.02.28.482303.