ggplot geom_point: adding cluster labels to a scatter plot

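This post shows how to add a label to each cluster in a ggplot scatter plot of clustered points. The approach uses kmeans to compute the center of each cluster, then places the Class label at that center so that every cluster is tagged.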

Data for the scatter plot

> data.plot
        tissue  x  y imagerow imagecol   Class
BIN_001      1 12 22        1      001 class_A
BIN_002      1 22 21        1      002 class_A
BIN_003      1 11 32        1      003 class_B
BIN_004      1 30 14        1      004 class_C
BIN_005      1 32 45        1      005 class_A
...

1. Compute the center coordinates of each cluster with kmeans

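To find the center of a cluster, first group the data set by cluster class, then use kmeans to compute a single center coordinate for each group. With only one center requested, kmeans returns the mean of the group's x and y values.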

library(dplyr); library(purrr); library(ggplot2)
### broom must also be installed, since broom::tidy is called below
data.plot %>%
    group_by(Class) %>%
    do(model = kmeans(.[c('x', 'y')], 1)) %>% ### kmeans with a single center per class
    ungroup() %>% group_by(Class) %>%
    do(map_df(.$model, broom::tidy)) %>% ### tidy the model output into a data frame
    ungroup() %>% select(Class, x, y) %>% data.frame() %>%
    dplyr::rename(x.center = x, y.center = y) -> label.data

> label.data
    Class x.center y.center
1 class_A 13.67994 21.90958
2 class_B 28.67363 38.40217
3 class_C 16.99799 13.78242
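
Because kmeans is asked for a single center, the result for each class is the mean of that class's x and y values. The sketch below checks this on a small made-up data frame. The names toy, label.mean and label.km are illustrative and not part of the original data. It suggests that a plain summarise() call may be enough when only the label positions are needed.

```r
library(dplyr)

# Hypothetical toy data with the same x, y and Class columns as data.plot
toy <- data.frame(
  x = c(12, 22, 32, 11, 15, 30, 28),
  y = c(22, 21, 45, 32, 30, 14, 12),
  Class = c("class_A", "class_A", "class_A",
            "class_B", "class_B", "class_C", "class_C")
)

# Per-class mean of x and y
label.mean <- toy %>%
  group_by(Class) %>%
  summarise(x.center = mean(x), y.center = mean(y)) %>%
  data.frame()

# Per-class kmeans with a single center, for comparison
label.km <- do.call(rbind, lapply(split(toy, toy$Class), function(d) {
  km <- kmeans(d[c("x", "y")], centers = 1)
  data.frame(Class = d$Class[1],
             x.center = km$centers[1, "x"],
             y.center = km$centers[1, "y"])
}))

# Compare the two sets of centers
all.equal(label.mean$x.center, label.km$x.center, check.attributes = FALSE)
all.equal(label.mean$y.center, label.km$y.center, check.attributes = FALSE)
```

The summarise() version avoids the broom and purrr steps. The kmeans version is still useful as a starting point if more than one center per class is needed later.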

2. Draw the scatter plot and show the labels with geom_label

data.plot %>% ggplot(aes(x ,y)) + 
    geom_point(aes(colour = Class),size = 0.5) + 
    scale_colour_brewer(palette = "Dark2") +
    theme_bw() +
    ggtitle("Class.cluster.plot") + theme(plot.title = element_text(face = 2,size = 50,hjust = 0.5)) + 
    geom_label(data = label.data, aes(label = Class,x = x.center,y = y.center))
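
Since data.plot itself is not included in the post, the following is a minimal end-to-end sketch on synthetic data. The data frame data.toy, the cluster positions in centers.true and the smaller title size are assumptions made for illustration only. The label positions are computed as per-class means, which is what a single-center kmeans returns.

```r
library(dplyr)
library(ggplot2)

set.seed(42)

# Hypothetical cluster positions used only to generate example points
centers.true <- data.frame(
  Class = c("class_A", "class_B", "class_C"),
  cx = c(10, 30, 20),
  cy = c(20, 40, 10)
)

# 50 noisy points around each hypothetical center
data.toy <- do.call(rbind, lapply(seq_len(nrow(centers.true)), function(i) {
  data.frame(
    x = rnorm(50, mean = centers.true$cx[i], sd = 3),
    y = rnorm(50, mean = centers.true$cy[i], sd = 3),
    Class = centers.true$Class[i]
  )
}))

# One label position per class (mean of x and y within the class)
label.data <- data.toy %>%
  group_by(Class) %>%
  summarise(x.center = mean(x), y.center = mean(y)) %>%
  data.frame()

# Points coloured by class, with one label drawn at each class center
p <- ggplot(data.toy, aes(x, y)) +
  geom_point(aes(colour = Class), size = 0.5) +
  scale_colour_brewer(palette = "Dark2") +
  theme_bw() +
  ggtitle("Class.cluster.plot") +
  theme(plot.title = element_text(face = 2, size = 20, hjust = 0.5)) +
  geom_label(data = label.data,
             aes(label = Class, x = x.center, y = y.center))
```

Calling print(p) displays the figure. Passing label.data through the data argument of geom_label is what keeps the labels to one per class instead of one per point.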
