BayesSpace Clustering for 10X Spatial Transcriptomics Analysis


Tags: algorithms, clustering, machine learning, spatial transcriptomics

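Figure 1. Basic principles of BayesSpace

Figure 2. Performance evaluation and comparison of BayesSpace

Figure 3. BayesSpace identifies tumor-proximal lymphoid tissue in a melanoma sample

Figure 4. BayesSpace resolves intratumoral heterogeneity in invasive ductal carcinoma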

library(BayesSpace)
sce <- readVisium("path/to/spaceranger/outs/")

melanoma <- getRDS(dataset="2018_thrane_melanoma", sample="ST_mel1_rep2")

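library(Matrix)
library(SingleCellExperiment)

# Gene-level and spot-level metadata
rowData <- read.csv("path/to/rowData.csv", stringsAsFactors=FALSE)
colData <- read.csv("path/to/colData.csv", stringsAsFactors=FALSE, row.names=1)
# Count table, with feature names taken from the first column
counts <- read.csv("path/to/counts.csv.gz",
                   row.names=1, check.names=FALSE, stringsAsFactors=FALSE)

# read.csv() returns a data.frame, so convert it to a matrix before making it sparse
sce <- SingleCellExperiment(assays=list(counts=Matrix(as.matrix(counts), sparse=TRUE)),
                            rowData=rowData,
                            colData=colData)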

set.seed(102)
melanoma <- spatialPreprocess(melanoma, platform="ST", 
                              n.PCs=7, n.HVGs=2000, log.normalize=FALSE)

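qTune() runs the BayesSpace clustering algorithm for several specified values of q (3 through 7 by default) and computes their average pseudo-log-likelihood. It accepts any argument to spatialCluster().
qPlot() plots the pseudo-log-likelihood as a function of q; we suggest choosing a q around the elbow of this plot.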

melanoma <- qTune(melanoma, qs=seq(2, 10), platform="ST", d=7)
qPlot(melanoma)
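
The elbow rule described above is normally applied by eye on the qPlot() output. As a rough illustration only, the sketch below picks an elbow numerically from a vector of scores. The score values are hypothetical (they are not real BayesSpace output), lower is assumed to be better, and the "largest drop in improvement" heuristic is a generic one rather than anything BayesSpace itself computes.

```r
# Hypothetical scores, one per candidate q (lower is assumed better).
# These numbers are made up for illustration; they are not BayesSpace output.
qs    <- 2:10
score <- c(1000, 700, 520, 480, 460, 450, 445, 442, 440)

# Improvement gained by moving from each q to the next one.
gain <- -diff(score)

# The elbow is taken as the q after which the improvement falls off most sharply.
drop_in_gain <- -diff(gain)
elbow_q <- qs[which.max(drop_in_gain) + 1]

print(elbow_q)
```

With these made-up scores the heuristic lands on q = 4. On real data the curve is rarely this clean, so the plot should still be inspected before settling on a value.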

set.seed(149)
melanoma <- spatialCluster(melanoma, q=4, platform="ST", d=7,
                           init.method="mclust", model="t", gamma=2,
                           nrep=1000, burn.in=100,
                           save.chain=TRUE)

head(colData(melanoma))
#> DataFrame with 6 rows and 5 columns
#>            row       col sizeFactor cluster.init spatial.cluster
#>      <integer> <integer>  <numeric>    <numeric>       <numeric>
#> 7x15         7        15   0.795588            1               1
#> 7x16         7        16   0.307304            1               1
#> 7x17         7        17   0.331247            2               2
#> 7x18         7        18   0.420747            3               2
#> 8x13         8        13   0.255453            1               1
#> 8x14         8        14   1.473439            1               1

clusterPlot(melanoma)

library(ggplot2)

# clusterPlot() returns a ggplot object, so ggplot2 layers can be added to it
clusterPlot(melanoma, palette=c("purple", "red", "blue", "yellow"), color="black") +
  theme_bw() +
  xlab("Column") +
  ylab("Row") +
  labs(fill="BayesSpace\ncluster", title="Spatial clustering of ST_mel1_rep2")

melanoma.enhanced <- spatialEnhance(melanoma, q=4, platform="ST", d=7,
                                    model="t", gamma=2,
                                    jitter_prior=0.3, jitter_scale=3.5,
                                    nrep=1000, burn.in=100,
                                    save.chain=TRUE)

clusterPlot(melanoma.enhanced)

markers <- c("PMEL", "CD2", "CD19", "COL1A1")
melanoma.enhanced <- enhanceFeatures(melanoma.enhanced, melanoma,
                                     feature_names=markers,
                                     nrounds=0)

logcounts(melanoma.enhanced)[markers, 1:5]
#>        subspot_1.1 subspot_2.1 subspot_3.1 subspot_4.1 subspot_5.1
#> PMEL     2.6428437   1.8550344   2.4704804   3.1827958   2.3572879
#> CD2      0.3489273   0.6066852   0.2315192   0.2210583   0.3489273
#> CD19     0.6170074   0.6528370   0.4164957   0.2558892   0.6259792
#> COL1A1   0.0000000   2.9053805   1.1940085   0.1023711   0.8157212

rowData(melanoma.enhanced)[markers, ]
#> DataFrame with 4 rows and 4 columns
#>                gene_id   gene_name    is.HVG enhanceFeatures.rmse
#>            <character> <character> <logical>            <numeric>
#> PMEL   ENSG00000185664        PMEL      TRUE             0.804628
#> CD2    ENSG00000116824         CD2      TRUE             0.614575
#> CD19   ENSG00000177455        CD19      TRUE             0.697328
#> COL1A1 ENSG00000108821      COL1A1      TRUE             0.704845

featurePlot(melanoma.enhanced, "PMEL")

enhanced.plots <- purrr::map(markers, function(x) featurePlot(melanoma.enhanced, x))
patchwork::wrap_plots(enhanced.plots, ncol=2)

spot.plots <- purrr::map(markers, function(x) featurePlot(melanoma, x))
patchwork::wrap_plots(c(enhanced.plots, spot.plots), ncol=4)

Life is good, and even better with you.


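Main findings of the study

Basic principles of BayesSpace

BayesSpace spatial clustering improves the identification of known cortical layers in brain tissue

BayesSpace identifies tissue structures that other methods tend to miss

BayesSpace resolves intratumoral heterogeneity in invasive ductal carcinoma

Summary of the study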

Let's look at the code

Preparing the data

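BayesSpace supports three ways of loading a SingleCellExperiment (a data structure widely used in R) for analysis: reading Space Ranger output with readVisium(), downloading a dataset with getRDS(), or building the object manually from count and metadata tables.

For the second option, all datasets analyzed for the BayesSpace manuscript are readily accessible through the getRDS() function. The function takes two arguments: the name of the dataset and the name of the sample within it.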

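Preprocessing

BayesSpace requires minimal data preprocessing and provides a helper function, spatialPreprocess(), to automate it.

Log-normalization is skipped here because all datasets available through getRDS() already contain log-normalized counts.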

Clustering

Selecting the number of clusters

We can use the qTune() and qPlot() functions to help choose q, the number of clusters to use in our analysis.

Clustering with BayesSpace

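The spatialCluster() function clusters the spots and adds the predicted cluster labels to the SingleCellExperiment. At least 10,000 iterations (nrep=10000) are generally recommended, but only 1,000 are used here so that the example runs quickly. (Note that a random seed must be set for the results to be reproducible.)

Both the mclust initialization (cluster.init) and the BayesSpace cluster assignments (spatial.cluster) are now available in the colData of the SingleCellExperiment.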
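
Once both label columns are in colData, a quick way to see what the spatial model changed is to cross-tabulate them. The sketch below is self-contained: instead of the full melanoma object it rebuilds only the six rows shown in the head(colData(melanoma)) output, and the data frame name labels is a stand-in chosen for this illustration.

```r
# The six spots printed by head(colData(melanoma)), rebuilt as a plain data.frame.
labels <- data.frame(
  cluster.init    = c(1, 1, 2, 3, 1, 1),
  spatial.cluster = c(1, 1, 2, 2, 1, 1),
  row.names       = c("7x15", "7x16", "7x17", "7x18", "8x13", "8x14")
)

# Cross-tabulate the mclust initialization against the final BayesSpace labels.
agreement <- table(init = labels$cluster.init, spatial = labels$spatial.cluster)
print(agreement)

# Spots whose label changed between initialization and the final assignment.
changed <- rownames(labels)[labels$cluster.init != labels$spatial.cluster]
print(changed)
```

Among these six spots, only 7x18 moves (from initial cluster 3 to spatial cluster 2). The same two calls should work on the real object by replacing labels with as.data.frame(colData(melanoma)).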

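Visualization

Because clusterPlot() returns a ggplot object, it can be customized by composing familiar ggplot2 functions. In addition, the palette argument sets the color used for each cluster, and clusterPlot() passes extra arguments such as size or color on to geom_polygon() to control how the spot borders are drawn.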

Enhanced resolution

Clustering at enhanced resolution

The enhanced SingleCellExperiment includes an index to the parent spot in the original sce (spot.idx), along with an index to the subspot. It adds the offsets to the original spot coordinates, and provides the enhanced cluster label (spatial.cluster).

Enhancing the resolution of gene expression

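enhanceFeatures() can be used to impute subspot-level expression for all genes or for a subset of genes of interest. Here we demonstrate it by enhancing the expression of four marker genes: PMEL (melanoma), CD2 (T cells), CD19 (B cells), and COL1A1 (fibroblasts).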

By default, log-normalized expression (logcounts(sce)) is imputed, although other assays or arbitrary feature matrices can be specified.

Diagnostic measures from each predictive model, such as rmse when using xgboost, are added to the rowData of the enhanced dataset.
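
For readers unfamiliar with the rmse column, the sketch below shows how a root-mean-square error is computed in general. The function name rmse and the observed and predicted values are made up for illustration; this shows the standard formula and does not reproduce how BayesSpace or xgboost evaluates its models internally.

```r
# Root-mean-square error between observed and predicted values.
rmse <- function(observed, predicted) {
  stopifnot(length(observed) == length(predicted))
  sqrt(mean((observed - predicted)^2))
}

# Hypothetical log-expression values for one gene across four spots.
observed  <- c(2.0, 0.5, 1.0, 3.0)
predicted <- c(2.5, 0.5, 0.0, 3.5)

print(rmse(observed, predicted))
```

A smaller value means the predictions sit closer to the observed expression, which is how the enhanceFeatures.rmse column can be compared across genes.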

Visualizing enhanced gene expression

It really is a nice method. Finally, have a great weekend, everyone.
