FigDraw 19. SCI Paper Figures: Slope Chart


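The 桓峰基因 WeChat public account has launched a series of R plotting tutorials with accompanying online video lessons. The tutorials compiled so far are listed below:

FigDraw 1. The Soul of an SCI Paper: Simple, Elegant Chart Color Schemes
FigDraw 2. Essentials for SCI Paper Figures: R Language Basics
FigDraw 3. Essentials for SCI Paper Figures: Data Transformation in R
FigDraw 4. SCI Paper Figures: Scatter Plot (Scatter)
FigDraw 5. SCI Paper Figures: Bar Chart (Barplot)
FigDraw 6. SCI Paper Figures: Box Plot (Boxplot)
FigDraw 7. SCI Paper Figures: Line Chart (Lineplot)
FigDraw 8. SCI Paper Figures: Pie Chart (Pieplot)
FigDraw 9. SCI Paper Figures: Venn Diagram (Vennplot)
FigDraw 10. SCI Paper Figures: Histogram (HistogramPlot)
FigDraw 11. SCI Paper Figures: Violin Plot (ViolinPlot)
FigDraw 12. SCI Paper Figures: Correlation Matrix Plot (Correlation Matrix)
FigDraw 13. SCI Paper Figures: Sankey Diagram and Reproducing a Published Figure (Sankey)
FigDraw 14. SCI Paper Figures: Chord Diagram and Reproducing a Published Figure (Chord Diagram)
FigDraw 15. SCI Paper Figures: Multi-omics Circos Plot (OmicCircos)
FigDraw 16. SCI Paper Figures: Dendrogram (Dendrogram)
FigDraw 17. SCI Paper Figures: Principal Component Plot (pca3d)
FigDraw 18. SCI Paper Figures: Treemap (treemap)
FigDraw 19. SCI Paper Figures: Slope Chart (Slope Chart)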


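Introduction

A slope chart is an excellent tool for visualizing how values and rankings change between categories. It is best suited to time series with only a few time points.

A slope chart efficiently shows how the same core metric changes over time.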

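Installation

There is currently no ready-made function for drawing slope charts. We can work around this with ggplot2 and its extension packages. Simply install the ggalt package: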

if(!require(ggalt))
  install.packages("ggalt")

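Reading the data

Here we use two examples from the book 《R语言数据可视化之美》 and one example from r-statistics.

1. Two-year comparison

The data for the two-year comparison are prepared as follows: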

library(ggplot2)
library(scales)
library(reshape)
# (a) Two-year comparison

df1 <- read.csv("Slopecharts_Data1.csv")
colnames(df1) <- c("continent", "1952", "1957")
left_label <- paste(df1$continent, round(df1$`1952`), sep = ", ")
right_label <- paste(df1$continent, round(df1$`1957`), sep = ", ")
df1$class <- ifelse((df1$`1957` - df1$`1952`) < 0, "red", "green")

head(df1)
##    continent 1952 1957 class
## 1  Argentina   67   74 green
## 2 Bangladesh   54   53   red
## 3     Brazil   62   68 green
## 4     Canada   73   80 green
## 5      China   68   72 green
## 6      Egypt   60   61 green
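
Slopecharts_Data1.csv is not distributed with this article, so the following is a minimal sketch that rebuilds only the six rows printed by head(df1) above and applies the same labelling and classification steps. The names df1_demo, left_label_demo and right_label_demo are placeholders introduced for this illustration; the real file contains more rows.

```r
# Stand-in for Slopecharts_Data1.csv, built from the six rows shown by head(df1).
# This only makes the preparation steps runnable without the original file.
df1_demo <- data.frame(
  continent = c("Argentina", "Bangladesh", "Brazil", "Canada", "China", "Egypt"),
  y1952 = c(67, 54, 62, 73, 68, 60),
  y1957 = c(74, 53, 68, 80, 72, 61)
)
colnames(df1_demo) <- c("continent", "1952", "1957")

# Labels placed to the left and right of each segment, e.g. "Argentina, 67"
left_label_demo  <- paste(df1_demo$continent, round(df1_demo$`1952`), sep = ", ")
right_label_demo <- paste(df1_demo$continent, round(df1_demo$`1957`), sep = ", ")

# "red" marks a decrease from 1952 to 1957, "green" anything else
df1_demo$class <- ifelse(df1_demo$`1957` - df1_demo$`1952` < 0, "red", "green")

print(df1_demo)
```

Only Bangladesh falls from 1952 to 1957 in these six rows, so it is the only row classified as "red".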

2. Multi-year comparison

The data for the multi-year comparison are prepared as follows:

library(ggalt)

df2 <- read.csv("Slopecharts_Data2.csv")
colnames(df2) <- c("continent", 2007:2013)


dfm <- melt(df2, id = "continent")

dfm$value <- as.numeric(dfm$value)
dfm$variable <- as.numeric(dfm$variable)

left_label <- paste(dfm$continent, round(dfm$value), sep = ", ")
right_label <- paste(dfm$continent, round(dfm$value), sep = ", ")

left_point <- dfm$value
right_point <- dfm$value
class <- dfm$variable

for (i in 1:nrow(dfm)) {
    if (dfm$variable[i] != 1) {
        left_label[i] <- ""
        left_point[i] <- NaN
    }
    if (dfm$variable[i] != 7) {
        right_label[i] <- ""
        right_point[i] <- NaN
    }

    if (df2[df2$continent == dfm$continent[i], 2] > df2[df2$continent == dfm$continent[i],
        8]) {
        class[i] <- "green"
    } else {
        class[i] <- "red"
    }

}

head(dfm)
##        continent variable   value
## 1        Germany        1 2428500
## 2 United Kingdom        1 2054238
## 3         France        1 1886792
## 4          Italy        1 1554199
## 5          Spain        1 1053161
## 6    Netherlands        1  571773
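
The loop above keeps labels and points only at the first and last time point and colours each series by comparing its first and last value. The same logic can be written without a loop; the sketch below does so in base R on a small made-up table. All names ending in _demo and all values are hypothetical, and the base-R reshape stands in for melt().

```r
# Hypothetical wide table: one row per country, one column per year (values made up)
df2_demo <- data.frame(
  continent = c("A", "B", "C"),
  `2007` = c(10, 20, 30),
  `2008` = c(12, 18, 30),
  `2009` = c(15, 16, 31),
  check.names = FALSE
)

# Reshape to long format with base R (stands in for melt())
years <- setdiff(names(df2_demo), "continent")
dfm_demo <- data.frame(
  continent = rep(df2_demo$continent, times = length(years)),
  variable  = rep(seq_along(years), each = nrow(df2_demo)),
  value     = unlist(df2_demo[years], use.names = FALSE)
)

first <- 1
last  <- length(years)

# Labels and points are kept only at the first and last time point
lab <- paste(dfm_demo$continent, round(dfm_demo$value), sep = ", ")
left_label_demo  <- ifelse(dfm_demo$variable == first, lab, "")
right_label_demo <- ifelse(dfm_demo$variable == last,  lab, "")
left_point_demo  <- ifelse(dfm_demo$variable == first, dfm_demo$value, NaN)
right_point_demo <- ifelse(dfm_demo$variable == last,  dfm_demo$value, NaN)

# Same rule as the loop: "green" when the first year exceeds the last year
start_gt_end <- df2_demo[[years[first]]] > df2_demo[[years[last]]]
class_demo <- ifelse(start_gt_end[match(dfm_demo$continent, df2_demo$continent)],
                     "green", "red")
```

Note that under this rule "green" marks a series that decreased, which is why the multi-year plot later maps "green" to the red hex colour.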

3. Cancer survival rates

Preparing the cancer survival data:

library(dplyr)

source_df <- read.csv("cancer_survival_rates.csv")
head(source_df)
##                              group year value
## 1                      Oral cavity    5  56.7
## 2                       Oesophagus    5  14.2
## 3                          Stomach    5  23.8
## 4                            Colon    5  61.7
## 5                           Rectum    5  62.6
## 6 Liver and intrahepatic bile duct    5   7.5

# Define functions. Source: https://github.com/jkeirstead/r-slopegraph
tufte_sort <- function(df, x = "year", y = "value", group = "group", method = "tufte",
    min.space = 0.05) {
    ## First rename the columns for consistency
    ids <- match(c(x, y, group), names(df))
    df <- df[, ids]
    names(df) <- c("x", "y", "group")

    ## Expand grid to ensure every combination has a defined value
    tmp <- expand.grid(x = unique(df$x), group = unique(df$group))
    tmp <- merge(df, tmp, all.y = TRUE)
    df <- mutate(tmp, y = ifelse(is.na(y), 0, y))

    ## Cast into a matrix shape and arrange by first column
    require(reshape2)
    tmp <- dcast(df, group ~ x, value.var = "y")
    ord <- order(tmp[, 2])
    tmp <- tmp[ord, ]

    min.space <- min.space * diff(range(tmp[, -1]))
    yshift <- numeric(nrow(tmp))
    ## Start at 'bottom' row Repeat for rest of the rows until you hit the top
    for (i in 2:nrow(tmp)) {
        ## Shift subsequent row up by equal space so gap between two entries is
        ## >= minimum
        mat <- as.matrix(tmp[(i - 1):i, -1])
        d.min <- min(diff(mat))
        yshift[i] <- ifelse(d.min < min.space, min.space - d.min, 0)
    }

    tmp <- cbind(tmp, yshift = cumsum(yshift))

    scale <- 1
    tmp <- melt(tmp, id = c("group", "yshift"), variable.name = "x", value.name = "y")
    ## Store these gaps in a separate variable so that they can be scaled ypos
    ## = a*yshift + y

    tmp <- transform(tmp, ypos = y + scale * yshift)
    return(tmp)

}
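
The core of tufte_sort() is the spacing loop: rows are ordered by the first time point, and any row that sits closer than min.space to the row below it is pushed up, with the shifts accumulating toward the top. The sketch below isolates that step on a small made-up matrix so the arithmetic can be followed by hand. The names wide, min_space, shift and ypos are introduced for this illustration only.

```r
# Hypothetical wide table, already ordered by the first column (values made up)
wide <- matrix(c(10, 11, 30,    # first time point
                 12, 12, 28),   # second time point
               ncol = 2,
               dimnames = list(c("a", "b", "c"), c("t1", "t2")))

# Minimum gap: 5% of the overall value range, as in tufte_sort(min.space = 0.05)
min_space <- 0.05 * diff(range(wide))

yshift <- numeric(nrow(wide))
for (i in 2:nrow(wide)) {
  # Smallest gap between row i and the row below it, across all time points
  d_min <- min(diff(wide[(i - 1):i, , drop = FALSE]))
  # Push row i up only by as much as is needed to reach the minimum gap
  yshift[i] <- if (d_min < min_space) min_space - d_min else 0
}

# Shifts accumulate: moving one row up moves every row above it as well
shift <- cumsum(yshift)
ypos  <- wide + shift   # recycled down the columns, so each row gets its own shift

print(shift)
print(ypos)
```

Here the range is 20, so the minimum gap is 1. Rows a and b touch at the second time point (gap 0), so b is shifted up by 1, and c inherits that shift even though it was already far enough from b.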

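Worked examples

With the data prepared, we can start plotting. Because there is no ready-made R package for this, we need to combine functions from ggplot2 to draw the slope charts.

1. Two-year comparison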

p <- ggplot(df1) + 
  geom_segment(aes(x=1, xend=2, y=`1952`, yend=`1957`, col=class), size=.75, show.legend=F) +  # connecting lines
  geom_vline(xintercept=1, linetype="solid", size=.1) + # vertical line for 1952
  geom_vline(xintercept=2, linetype="solid", size=.1) + # vertical line for 1957
  geom_point(aes(x=1, y=`1952`), size=3,shape=21,fill="grey80",color="black") + # data points for 1952
  geom_point(aes(x=2, y=`1957`), size=3,shape=21,fill="grey80",color="black") + # data points for 1957
  scale_color_manual(labels = c("Up", "Down"), values = c("green"="#A6D854","red"="#FC4E07")) +  
  xlim(.5, 2.5) 
p
# Add the text labels
p <- p + geom_text(label=left_label, y=df1$`1952`, x=rep(1, NROW(df1)), hjust=1.1, size=3.5)
p <- p + geom_text(label=right_label, y=df1$`1957`, x=rep(2, NROW(df1)), hjust=-0.1, size=3.5)
p <- p + geom_text(label="1952", x=1, y=1.02*(max(df1$`1952`, df1$`1957`)), hjust=1.2, size=5)   
p <- p + geom_text(label="1957", x=2, y=1.02*(max(df1$`1952`, df1$`1957`)), hjust=-0.1, size=5) 

p<-p+theme_void()
p
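
In the code above, the year headers are added with geom_text(), which inherits the plot data and therefore draws each header once per data row, one on top of another. A possible variant, sketched here on the six rows printed by head(df1), uses annotate() so that each header is drawn once. The names d, top and p_demo and the column names y1952 and y1957 are placeholders for this sketch, not part of the original code.

```r
library(ggplot2)

# Six rows copied from the head(df1) output; the full CSV has more rows
d <- data.frame(
  continent = c("Argentina", "Bangladesh", "Brazil", "Canada", "China", "Egypt"),
  y1952 = c(67, 54, 62, 73, 68, 60),
  y1957 = c(74, 53, 68, 80, 72, 61)
)
d$class <- ifelse(d$y1957 - d$y1952 < 0, "red", "green")

# Height of the year headers, slightly above the largest value
top <- 1.02 * max(d$y1952, d$y1957)

p_demo <- ggplot(d) +
  geom_segment(aes(x = 1, xend = 2, y = y1952, yend = y1957, colour = class),
               show.legend = FALSE) +
  geom_text(aes(x = 1, y = y1952, label = paste(continent, y1952, sep = ", ")),
            hjust = 1.1, size = 3.5) +
  geom_text(aes(x = 2, y = y1957, label = paste(continent, y1957, sep = ", ")),
            hjust = -0.1, size = 3.5) +
  # annotate() draws each header once instead of once per data row
  annotate("text", x = 1, y = top, label = "1952", hjust = 1.2, size = 5) +
  annotate("text", x = 2, y = top, label = "1957", hjust = -0.1, size = 5) +
  scale_colour_manual(values = c(green = "#A6D854", red = "#FC4E07")) +
  xlim(0.5, 2.5) +
  theme_void()
```

Printing p_demo should give a chart of the same kind as the one above, restricted to six countries.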


2. Multi-year comparison

p <- ggplot(dfm) +
  geom_xspline(aes(x = variable, y = value, group = continent, colour = class), size = 0.75) +  # smoothed connecting lines
  geom_vline(xintercept = 1, linetype = "solid", size = 0.1) +  # vertical line for 2007
  geom_vline(xintercept = 7, linetype = "solid", size = 0.1) +  # vertical line for 2013
  geom_point(aes(x = variable, y = left_point), size = 3, shape = 21, fill = "grey80", color = "black") +   # data points for 2007
  geom_point(aes(x = variable, y = right_point), size = 3, shape = 21, fill = "grey80", color = "black") +  # data points for 2013
  # Here "green" marks a series whose 2007 value exceeds its 2013 value, so it maps to the red hex colour
  scale_color_manual(labels = c("Up", "Down"), values = c(green = "#FC4E07", red = "#A6D854")) +
  xlim(-4, 12)

# Add the text labels
p <- p + geom_text(label = left_label, y = dfm$value, x = rep(1, NROW(dfm)), hjust = 1.1, size = 3.5)
p <- p + geom_text(label = right_label, y = dfm$value, x = rep(7, NROW(dfm)), hjust = -0.1, size = 3.5)
# Year headers; the values live in dfm$value (df2 has no `value` column)
p <- p + geom_text(label = "2007", x = 1, y = 1.02 * max(dfm$value), hjust = 1.2, size = 5)
p <- p + geom_text(label = "2013", x = 7, y = 1.02 * max(dfm$value), hjust = -0.1, size = 5)

p <- p + theme_void()
p


3. Cancer survival rates
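
The plotting call in this section uses plot_slopegraph() and an object df, neither of which is defined in the code shown in this article. The following is a minimal sketch of what such a helper might look like, assuming its input has the columns that tufte_sort() returns (x, y, group, ypos). It is an illustration, not necessarily the author's or the r-slopegraph project's exact implementation, and df_demo holds made-up values.

```r
library(ggplot2)

# Sketch of a slopegraph helper. Expects the columns produced by tufte_sort():
# x (time point), y (value to print), group (row label), ypos (spaced y position).
plot_slopegraph <- function(df) {
  first_x <- df$x[1]
  ylabs <- as.character(df$group[df$x == first_x])  # row labels at the first time point
  yvals <- df$ypos[df$x == first_x]                 # where those labels go on the y axis
  ggplot(df, aes(x = x, y = ypos)) +
    geom_line(aes(group = group), colour = "grey80") +
    geom_point(colour = "white", size = 8) +        # white disc behind each number
    geom_text(aes(label = y), size = 3) +
    scale_y_continuous(name = "", breaks = yvals, labels = ylabs)
}

# Hypothetical input in the shape tufte_sort() returns (values made up)
df_demo <- data.frame(
  group = rep(c("Site A", "Site B"), times = 2),
  x     = factor(rep(c("5 years", "10 years"), each = 2),
                 levels = c("5 years", "10 years")),
  y     = c(57, 62, 44, 55),
  ypos  = c(57, 62, 44, 55)
)

gg_demo <- plot_slopegraph(df_demo)

# With the article's data, the call would presumably look like:
# df <- tufte_sort(source_df, x = "year", y = "value", group = "group")
# plot_slopegraph(df)
```

The y axis is repurposed as a label column: its breaks sit at each group's position at the first time point, and the tick labels are the group names.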

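## Plot
## plot_slopegraph() and df must exist before this call: df is presumably tufte_sort(source_df),
## and plot_slopegraph() a helper from the r-slopegraph source cited above.
plot_slopegraph(df) +
    labs(title = "Estimates of % survival rates") +
    theme_classic() +  # apply the complete theme first; placed last it would discard the tweaks below
    theme(axis.title = element_blank(),
          axis.ticks = element_blank(),
          plot.title = element_text(hjust = 0.5, family = "American Typewriter", face = "bold"),
          axis.text = element_text(family = "American Typewriter", face = "bold"))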


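I have walked through all of these examples here so that you can pick the chart that suits your own data. If you need the code, share this tutorial to your WeChat Moments with the caption “学生信,找桓峰基因,铸造成功的你!” to receive it.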


References:
  1. 张杰. 《R语言数据可视化之美》
