FigDraw 19. SCI文章中绘图之坡度图（Slope Chart）

最新推荐文章于 2022-10-07 16:51:33 发布

桓峰基因

最新推荐文章于 2022-10-07 16:51:33 发布

阅读量518

点赞数 1

分类专栏： SCI 文章绘图文章标签： r语言数据挖掘机器学习

本文链接：https://blog.csdn.net/weixin_41368414/article/details/126692185

版权

SCI 文章绘图专栏收录该内容

27 篇文章 42 订阅

订阅专栏

点击关注，桓峰基因

桓峰基因公众号推出基于R语言绘图教程并配有视频在线教程，目前整理出来的教程目录如下：

FigDraw 1. SCI 文章的灵魂之简约优雅的图表配色
 FigDraw 2. SCI 文章绘图必备 R 语言基础
 FigDraw 3. SCI 文章绘图必备 R 数据转换
 FigDraw 4. SCI 文章绘图之散点图（Scatter）
FigDraw 5. SCI 文章绘图之柱状图 (Barplot)
FigDraw 6. SCI 文章绘图之箱线图 (Boxplot)

FigDraw 7. SCI 文章绘图之折线图 (Lineplot)

FigDraw 8. SCI 文章绘图之饼图 (Pieplot)

FigDraw 9. SCI 文章绘图之韦恩图 (Vennplot)

FigDraw 10. SCI 文章绘图之直方图 (HistogramPlot)

FigDraw 11. SCI 文章绘图之小提琴图 (ViolinPlot)

FigDraw 12. SCI 文章绘图之相关性矩阵图（Correlation Matrix）

FigDraw 13. SCI 文章绘图之桑葚图及文章复现（Sankey）

FigDraw 14. SCI 文章绘图之和弦图及文章复现（Chord Diagram)

FigDraw 15. SCI 文章绘图之多组学圈图(OmicCircos)

FigDraw 16. SCI 文章绘图之树形图(Dendrogram)

FigDraw 17. SCI 文章绘图之主成分绘图(pca3d)

FigDraw 18. SCI 文章绘图之矩形树状图 (treemap)

FigDraw 19. SCI 文章中绘图之坡度图（Slope Chart）

前言

坡度图是一个伟大的工具，您想要可视化的变化的价值和排名之间的类别。这更适用于时间点很少的时间序列。

坡度图(Slope Chart)可以高效地可视化。同一个核心指标随着时间推移的变化情况。

软件安装

目前，还没有现成的构建函数来绘制坡度图。我们可以利用gglot2 及扩展包来解决这个问题。简单安装一个ggalt 软件包：

if(!require(ggalt))
  install.packages("ggalt")

数据读取

这里我们选取《R语言数据可视化之美》这本书里面的两个例子，以及来之r-statistics 的一个例子。

1. 两年份对比

两年份对比结果可视化，如下：

library(ggplot2)
library(scales)
library(reshape)
#--------------------------------------(a)两年份对比---------------------------------------------------------------

df1 <- read.csv("Slopecharts_Data1.csv")
colnames(df1) <- c("continent", "1952", "1957")
left_label <- paste(df1$continent, round(df1$`1952`), sep = ", ")
right_label <- paste(df1$continent, round(df1$`1957`), sep = ", ")
df1$class <- ifelse((df1$`1957` - df1$`1952`) < 0, "red", "green")

head(df1)
##    continent 1952 1957 class
## 1  Argentina   67   74 green
## 2 Bangladesh   54   53   red
## 3     Brazil   62   68 green
## 4     Canada   73   80 green
## 5      China   68   72 green
## 6      Egypt   60   61 green

2. 多年份对比

多年份结果可视化比较结果：

library(ggalt)

df2 <- read.csv("Slopecharts_Data2.csv")
colnames(df2) <- c("continent", 2007:2013)


dfm <- melt(df2, id = "continent")

dfm$value <- as.numeric(dfm$value)
dfm$variable <- as.numeric(dfm$variable)

left_label <- paste(dfm$continent, round(dfm$value), sep = ", ")
right_label <- paste(dfm$continent, round(dfm$value), sep = ", ")

left_point <- dfm$value
right_point <- dfm$value
class <- dfm$variable

for (i in1:nrow(dfm)) {
    if (dfm$variable[i] != 1) {
        left_label[i] <- ""
        left_point[i] <- NaN
    }
    if (dfm$variable[i] != 7) {
        right_label[i] <- ""
        right_point[i] <- NaN
    }

    if (df2[df2$continent == dfm$continent[i], 2] > df2[df2$continent == dfm$continent[i],
        8]) {
        class[i] <- "green"
    } else {
        class[i] <- "red"
    }

}

head(dfm)
##        continent variable   value
## 1        Germany        1 2428500
## 2 United Kingdom        1 2054238
## 3         France        1 1886792
## 4          Italy        1 1554199
## 5          Spain        1 1053161
## 6    Netherlands        1  571773

3. 癌症的生存比例

分析癌症的生存情况：

library(dplyr)

source_df <- read.csv("cancer_survival_rates.csv")
head(source_df)
##                              group year value
## 1                      Oral cavity    5  56.7
## 2                       Oesophagus    5  14.2
## 3                          Stomach    5  23.8
## 4                            Colon    5  61.7
## 5                           Rectum    5  62.6
## 6 Liver and intrahepatic bile duct    5   7.5
# Define functions. Source: https://github.com/jkeirstead/r-slopegraph
tufte_sort <- function(df, x = "year", y = "value", group = "group", method = "tufte",
    min.space = 0.05) {
    ## First rename the columns for consistency
    ids <- match(c(x, y, group), names(df))
    df <- df[, ids]
    names(df) <- c("x", "y", "group")

    ## Expand grid to ensure every combination has a defined value
    tmp <- expand.grid(x = unique(df$x), group = unique(df$group))
    tmp <- merge(df, tmp, all.y = TRUE)
    df <- mutate(tmp, y = ifelse(is.na(y), 0, y))

    ## Cast into a matrix shape and arrange by first column
    require(reshape2)
    tmp <- dcast(df, group ~ x, value.var = "y")
    ord <- order(tmp[, 2])
    tmp <- tmp[ord, ]

    min.space <- min.space * diff(range(tmp[, -1]))
    yshift <- numeric(nrow(tmp))
    ## Start at 'bottom' row Repeat for rest of the rows until you hit the top
    for (i in2:nrow(tmp)) {
        ## Shift subsequent row up by equal space so gap between two entries is
        ## >= minimum
        mat <- as.matrix(tmp[(i - 1):i, -1])
        d.min <- min(diff(mat))
        yshift[i] <- ifelse(d.min < min.space, min.space - d.min, 0)
    }

    tmp <- cbind(tmp, yshift = cumsum(yshift))

    scale <- 1
    tmp <- melt(tmp, id = c("group", "yshift"), variable.name = "x", value.name = "y")
    ## Store these gaps in a separate variable so that they can be scaled ypos
    ## = a*yshift + y

    tmp <- transform(tmp, ypos = y + scale * yshift)
    return(tmp)

}

例子实操

数据我们准备好之后，就可以绘图了，因为没有现成的R软件包，所有我们需要利用ggplot2中的函数进行组合绘制坡度图。

1. 两年份对比

p <- ggplot(df1) + 
  geom_segment(aes(x=1, xend=2, y=`1952`, yend=`1957`, col=class), size=.75, show.legend=F) +  #连接线
  geom_vline(xintercept=1, linetype="solid", size=.1) + # 1952年的垂直直线
  geom_vline(xintercept=2, linetype="solid", size=.1) + # 1957年的垂直直线
  geom_point(aes(x=1, y=`1952`), size=3,shape=21,fill="grey80",color="black") + # 1952年的数据点
  geom_point(aes(x=2, y=`1957`), size=3,shape=21,fill="grey80",color="black") + # 1957年的数据点
  scale_color_manual(labels = c("Up", "Down"), values = c("green"="#A6D854","red"="#FC4E07")) +  
  xlim(.5, 2.5) 
p
# 添加文本信息
p <- p + geom_text(label=left_label, y=df1$`1952`, x=rep(1, NROW(df1)), hjust=1.1, size=3.5)
p <- p + geom_text(label=right_label, y=df1$`1957`, x=rep(2, NROW(df1)), hjust=-0.1, size=3.5)
p <- p + geom_text(label="1952", x=1, y=1.02*(max(df1$`1952`, df1$`1957`)), hjust=1.2, size=5)   
p <- p + geom_text(label="1957", x=2, y=1.02*(max(df1$`1952`, df1$`1957`)), hjust=-0.1, size=5) 

p<-p+theme_void()
p

2. 多年份对比

p <- ggplot(dfm) + geom_xspline(aes(x = variable, y = value, group = continent, colour = class),
    size = 0.75) + geom_vline(xintercept = 1, linetype = "solid", size = 0.1) + geom_vline(xintercept = 7,
    linetype = "solid", size = 0.1) + geom_point(aes(x = variable, y = left_point),
    size = 3, shape = 21, fill = "grey80", color = "black") + geom_point(aes(x = variable,
    y = right_point), size = 3, shape = 21, fill = "grey80", color = "black") + scale_color_manual(labels = c("Up",
    "Down"), values = c(green = "#FC4E07", red = "#A6D854")) + xlim(-4, 12)

p <- p + geom_text(label = left_label, y = dfm$value, x = rep(1, NROW(dfm)), hjust = 1.1,
    size = 3.5)
p <- p + geom_text(label = right_label, y = dfm$value, x = rep(7, NROW(dfm)), hjust = -0.1,
    size = 3.5)
p <- p + geom_text(label = "2007", x = 1, y = 1.02 * (max(df2$value)), hjust = 1.2,
    size = 5)  # title
p <- p + geom_text(label = "2013", x = 7, y = 1.02 * (max(df2$value)), hjust = -0.1,
    size = 5)  # title

3. 癌症的生存比例

## Plot
plot_slopegraph(df) + labs(title = "Estimates of % survival rates") + theme(axis.title = element_blank(),
    axis.ticks = element_blank(), plot.title = element_text(hjust = 0.5, family = "American Typewriter",
        face = "bold"), axis.text = element_text(family = "American Typewriter",
        face = "bold")) + theme_classic()