IOBR一站式免疫浸润分析R包及结果热图展示

最新推荐文章于 2024-11-25 16:43:43 发布

凑齐六个字吧

最新推荐文章于 2024-11-25 16:43:43 发布

阅读量3.9k

点赞数 14

分类专栏：分析工具转录组文章标签：数据分析

本文链接：https://blog.csdn.net/zfyyzhys/article/details/141304607

版权

分析工具同时被 2 个专栏收录

43 篇文章

订阅专栏

转录组

15 篇文章

订阅专栏

延续上一篇推文：转录组8种免疫浸润分析方法整理(https://mp.weixin.qq.com/s/sy-HT1znQYTcktN3ef-djw)

这次来整理和学习一下IOBR这个功能强大的R包

IOBR(Immuno-Oncology Biological Research，免疫肿瘤学生物研究)这个R包整合了8种已发表的方法（CIBERSORT、TIMER、xCell、MCPcounter、ESTIMATE、EPIC、IPS和quanTIseq)，用于解析肿瘤微环境的背景信息。

此外，开发者还收集了264个已发表的基因集签名，涵盖了肿瘤微环境、肿瘤代谢、m6A、外泌体、微卫星不稳定性和三级淋巴结构。通过运行signature_collection_citation函数，可以获取这些基因集签名的来源文献，而signature_collection函数则返回所有给定签名的详细基因。IOBR使用了三种计算方法来计算签名得分，包括PCA、z-score和ssGSEA。值得注意的是，IOBR还收集并使用了多种方法进行变量转换、可视化、批量生存分析、特征选择和统计分析。该包支持批量分析和结果可视化。

综合来看，IOBR是一个重点解析肿瘤微环境且不仅限于此的强大的工具R包。

这个是目前IOBR更新之后的内容，从图片中就可以看出IOBR增加了更多有趣的东西。首先是可输入的数据也从单纯普通的RNA数据进入了单细胞时代。此外增加了多种新的分析内容，包括了单细胞数据的配体和受体活性网络分析，去卷积之后的肿瘤微环境聚类分群，还有sigature分析等。

本次重点探索一下多种转录组免疫浸润分析。

分析流程

1、加载数据

rm(list = ls())
load("~/Desktop/data/combat.Rdata")
head(exp)[1:4,1:4]
#           GSM3106326 GSM3106327 GSM3106328 GSM3106329
# LINC01128   8.000049   8.044479   7.311868   7.610713
# SAMD11      7.499611   7.562370   7.412649   7.239141
# KLHL17      7.662076   7.672331   7.627330   7.564212
# PLEKHN1     7.666884   7.889537   7.640606   7.746440
head(pd)
#                                                title                  samples   GSE_num group
# GSM3106326 Pulmonary arterial hypertension patient 1   idiopathic PAH patient GSE113439   PAH
# GSM3106327 Pulmonary arterial hypertension patient 2 patient with PAH and CHD GSE113439   PAH
# GSM3106328 Pulmonary arterial hypertension patient 3 patient with PAH and CTD GSE113439   PAH
# GSM3106329 Pulmonary arterial hypertension patient 4 patient with PAH and CHD GSE113439   PAH
# GSM3106330 Pulmonary arterial hypertension patient 5   idiopathic PAH patient GSE113439   PAH
# GSM3106331 Pulmonary arterial hypertension patient 6   idiopathic PAH patient GSE113439   PAH

identical(rownames(pd),colnames(exp))
# [1] TRUE

library(IOBR)
tme_deconvolution_methods
      #  MCPcounter               EPIC              xCell          CIBERSORT CIBERSORT Absolute 
      # "mcpcounter"             "epic"            "xcell"        "cibersort"    "cibersort_abs" 
      #          IPS           ESTIMATE                SVR               lsei              TIMER 
      #        "ips"         "estimate"              "svr"             "lsei"            "timer" 
      #    quanTIseq 
      #  "quantiseq"

内置了多种免疫分析方法：mcpcounter，epic，xcell，cibersort，cibersort_abs，ips，ESTIMATE，SVR，lsei，TIMER，quanTIseq

2、cibersort

ciber_res<-deconvo_tme(eset = exp, method = "cibersort", 
                       arrays = TRUE, perm = 100 )
ciber_res
# # A tibble: 132 × 26
#    ID         B_cells_naive_CIBERSORT B_cells_memory_CIBERSORT Plasma_cells_CIBERSORT T_cells_CD8_CIBERSORT
#    <chr>                        <dbl>                    <dbl>                  <dbl>                 <dbl>
#  1 GSM3106326                  0.0142                        0                0.0698                 0.0456
#  2 GSM3106327                  0.0281                        0                0.00704                0.0218
#  3 GSM3106328                  0.0309                        0                0.0257                 0.0224
#  4 GSM3106329                  0.0335                        0                0.0355                 0.0386
#  5 GSM3106330                  0.0471                        0                0.0263                 0.0666
#  6 GSM3106331                  0.0441                        0                0.0220                 0     
#  7 GSM3106332                  0.0205                        0                0.118                  0.133 
#  8 GSM3106333                  0.0370                        0                0.0307                 0.0805
#  9 GSM3106334                  0.0537                        0                0.0501                 0.0218
# 10 GSM3106335                  0                             0                0.0210                 0.0196
# # ℹ 122 more rows
# # ℹ 21 more variables: T_cells_CD4_naive_CIBERSORT <dbl>, T_cells_CD4_memory_resting_CIBERSORT <dbl>,
# #   T_cells_CD4_memory_activated_CIBERSORT <dbl>, T_cells_follicular_helper_CIBERSORT <dbl>,
# #   `T_cells_regulatory_(Tregs)_CIBERSORT` <dbl>, T_cells_gamma_delta_CIBERSORT <dbl>,
# #   NK_cells_resting_CIBERSORT <dbl>, NK_cells_activated_CIBERSORT <dbl>, Monocytes_CIBERSORT <dbl>,
# #   Macrophages_M0_CIBERSORT <dbl>, Macrophages_M1_CIBERSORT <dbl>, Macrophages_M2_CIBERSORT <dbl>,
# #   Dendritic_cells_resting_CIBERSORT <dbl>, Dendritic_cells_activated_CIBERSORT <dbl>, …
# # ℹ Use `print(n = ...)` to see more rows

ciber_plot <- cell_bar_plot(input = ciber_res, features = colnames(ciber_res)[2:23], 
                            title = "CIBERSORT Cell Fraction")

里面内置了cell_bar_plot函数进行快速绘图

3.epic

epic_res <- deconvo_tme(eset = exp, method = "epic", arrays = TRUE)
epic_res
# # A tibble: 132 × 9
#    ID       Bcells_EPIC CAFs_EPIC CD4_Tcells_EPIC CD8_Tcells_EPIC Endothelial_EPIC Macrophages_EPIC NKcells_EPIC
#    <chr>          <dbl>     <dbl>           <dbl>           <dbl>            <dbl>            <dbl>        <dbl>
#  1 GSM3106…      0.0266   0.00939           0.145          0.0877            0.111          0.00673     2.14e-10
#  2 GSM3106…      0.0258   0.00795           0.145          0.0867            0.109          0.00684     2.01e-10
#  3 GSM3106…      0.0281   0.00793           0.145          0.0889            0.109          0.00672     1.71e-10
#  4 GSM3106…      0.0284   0.00909           0.143          0.0867            0.109          0.00664     1.63e-10
#  5 GSM3106…      0.0264   0.00799           0.144          0.0867            0.115          0.00638     1.53e- 9
#  6 GSM3106…      0.0292   0.00783           0.148          0.0855            0.108          0.00668     1.22e-10
#  7 GSM3106…      0.0261   0.00928           0.139          0.0899            0.115          0.00646     7.15e-10
#  8 GSM3106…      0.0253   0.00854           0.139          0.0894            0.106          0.00674     8.18e-10
#  9 GSM3106…      0.0325   0.00857           0.151          0.0893            0.106          0.00676     1.27e-10
# 10 GSM3106…      0.0263   0.00809           0.132          0.0841            0.106          0.00640     1.01e- 9
# # ℹ 122 more rows
# # ℹ 1 more variable: otherCells_EPIC <dbl>
# # ℹ Use `print(n = ...)` to see more rows
epic_plot <- cell_bar_plot(input = epic_res, features = colnames(epic_res)[2:9], 
                            title = "EPIC Cell Fraction")

4.quantiseq

quantiseq_res <- deconvo_tme(eset = exp, tumor = F, 
                             arrays = TRUE, scale_mrna = TRUE, method = "quantiseq")
quantiseq_res
# # A tibble: 132 × 12
#    ID         B_cells_quantiseq Macrophages_M1_quantiseq Macrophages_M2_quantiseq Monocytes_quantiseq
#    <chr>                  <dbl>                    <dbl>                    <dbl>               <dbl>
#  1 GSM3106326            0.0561                    0.195                   0.0910             0.00835
#  2 GSM3106327            0.0462                    0.417                   0.0557             0      
#  3 GSM3106328            0.0640                    0.194                   0.0830             0.0118 
#  4 GSM3106329            0.0707                    0.182                   0.142              0.0433 
#  5 GSM3106330            0.0501                    0.257                   0.0887             0      
#  6 GSM3106331            0.0480                    0.424                   0.0940             0      
#  7 GSM3106332            0.0638                    0.183                   0.0849             0.122  
#  8 GSM3106333            0.0564                    0.277                   0.0878             0      
#  9 GSM3106334            0.0713                    0.272                   0.106              0      
# 10 GSM3106335            0.0532                    0.203                   0.111              0.0201 
# # ℹ 122 more rows
# # ℹ 7 more variables: Neutrophils_quantiseq <dbl>, NK_cells_quantiseq <dbl>, T_cells_CD4_quantiseq <dbl>,
# #   T_cells_CD8_quantiseq <dbl>, Tregs_quantiseq <dbl>, Dendritic_cells_quantiseq <dbl>,
# #   Other_quantiseq <dbl>
# # ℹ Use `print(n = ...)` to see more rows
quantiseq_plot <- cell_bar_plot(input = quantiseq_res, features = colnames(quantiseq_res)[2:12], 
                            title = "Quantiseq Cell Fraction")

5. MCPcounter

mcpcounter_res <- deconvo_tme(eset = exp, method = "mcpcounter")
mcpcounter_res
# # A tibble: 132 × 11
#    ID         T_cells_MCPcounter CD8_T_cells_MCPcounter Cytotoxic_lymphocytes_MCPcoun…¹ B_lineage_MCPcounter
#    <chr>                   <dbl>                  <dbl>                           <dbl>                <dbl>
#  1 GSM3106326               7.41                   7.03                            6.92                 6.88
#  2 GSM3106327               7.65                   7.02                            7.51                 6.75
#  3 GSM3106328               8.02                   6.92                            6.55                 7.04
#  4 GSM3106329               7.55                   6.72                            6.68                 7.00
#  5 GSM3106330               7.61                   7.00                            6.56                 6.71
#  6 GSM3106331               7.92                   6.89                            6.53                 7.17
#  7 GSM3106332               7.30                   7.04                            6.75                 6.54
#  8 GSM3106333               7.50                   7.14                            6.90                 6.61
#  9 GSM3106334               8.11                   7.26                            7.05                 8.03
# 10 GSM3106335               6.88                   6.94                            6.77                 6.71
# # ℹ 122 more rows
# # ℹ abbreviated name: ¹Cytotoxic_lymphocytes_MCPcounter
# # ℹ 6 more variables: NK_cells_MCPcounter <dbl>, Monocytic_lineage_MCPcounter <dbl>,
# #   Myeloid_dendritic_cells_MCPcounter <dbl>, Neutrophils_MCPcounter <dbl>,
# #   Endothelial_cells_MCPcounter <dbl>, Fibroblasts_MCPcounter <dbl>
# # ℹ Use `print(n = ...)` to see more rows

6. xCELL

xCELL_res <- deconvo_tme(eset = exp, method = "xcell",arrays = TRUE)
# >>> Running xCell
# [1] "Num. of genes: 10365"
# Error in GSVA::gsva(expr, signatures, method = "ssgsea", ssgsea.norm = FALSE) : 
#   Calling gsva(expr=., gset.idx.list=., method=., ...) is defunct; use a method-specific parameter object (see '?gsva').

这里的xCell用到了gsva，然后gsva这个函数进行了更新，但IOBR应该还没有更新~ 如果需要分析的话需要加载xCell的原始函数分析。

7. TIMER

#这里导入的group_list必须是TIMER能够识别的，比如TCGA中33肿瘤类型
TIMER_res <- deconvo_tme(eset = exp, method = "timer",group_list = group_list)

8. estimate

estimate_res <- deconvo_tme(eset = exp, method = "estimate")
estimate_res
# # A tibble: 132 × 5
#    ID         StromalScore_estimate ImmuneScore_estimate ESTIMATEScore_estimate TumorPurity_estimate
#    <chr>                      <dbl>                <dbl>                  <dbl>                <dbl>
#  1 GSM3106326                 1164.                1925.                  3088.                0.490
#  2 GSM3106327                 1099.                2154.                  3253.                0.469
#  3 GSM3106328                 1213.                2174.                  3387.                0.452
#  4 GSM3106329                 1430.                1805.                  3236.                0.471
#  5 GSM3106330                 1009.                1659.                  2668.                0.543
#  6 GSM3106331                  879.                1854.                  2733.                0.535
#  7 GSM3106332                 1508.                1261.                  2770.                0.531
#  8 GSM3106333                 1335.                1855.                  3189.                0.477
#  9 GSM3106334                 1483.                2314.                  3797.                0.397
# 10 GSM3106335                  562.                1615.                  2177.                0.602
# # ℹ 122 more rows
# # ℹ Use `print(n = ...)` to see more rows

9. IPS

ips_res <- deconvo_tme(eset = exprSet, method = "ips", plot= F)
head(ips_res)
# # A tibble: 6 × 7
#   ID         MHC_IPS EC_IPS SC_IPS  CP_IPS AZ_IPS IPS_IPS
#   <chr>        <dbl>  <dbl>  <dbl>   <dbl>  <dbl>   <dbl>
# 1 GSM3106326    1.11 -0.330    NaN -0.264   0.515       2
# 2 GSM3106327    1.16 -0.430    NaN -0.234   0.493       2
# 3 GSM3106328    1.19 -0.406    NaN -0.0991  0.680       2
# 4 GSM3106329    1.06 -0.374    NaN -0.0446  0.642       2
# 5 GSM3106330    1.16 -0.252    NaN -0.0968  0.808       3
# 6 GSM3106331    1.06 -0.348    NaN -0.0608  0.655       2

值得一提的是这个IPS，其是指immunophenotype (免疫表型评分)。里面一共评估四个主要参数分别是：MHC分子(MHC molecules，MHC) ，免疫调节分子(Immunomodulators，CP) ，效应细胞(Effector cells，EC) ，抑制细胞(Suppressor cells，SC) 。使用者可以根据得到的结果联合生存数据/临床参数分析。

如果研究内容是如下图中的这20种实体瘤数据可以直接去TCIA官网下载(链接在参考资料)。如果是TCIA中没有的癌种，或者其他的肿瘤数据可以使用IOBR预测评分。

10.1、pheatmap热图

library(pheatmap)
library(RColorBrewer)
library(tidyverse)
library(gplots)
#整合数据-挑选了三个绝对丰度的结果
data_total <- cbind(ciber_res[,-c(24:26)],epic_res[,-1],quantiseq_res[,-1])
data_total <- as.data.frame(t(data_total))
colnames(data_total) <- data_total[1,]
data_total <- dplyr::slice(data_total,-1)
head(data_total)[1:5,1:5]
#                               GSM3106326   GSM3106327   GSM3106328   GSM3106329   GSM3106330
# B_cells_naive_CIBERSORT     0.0142251364 0.0281206504 0.0308628121 0.0335228312 0.0471364646
# B_cells_memory_CIBERSORT     0.000000000  0.000000000  0.000000000  0.000000000  0.000000000
# Plasma_cells_CIBERSORT       0.069795054  0.007039938  0.025673056  0.035549703  0.026348560
# T_cells_CD8_CIBERSORT        0.045625893  0.021829211  0.022410222  0.038619601  0.066552186
# T_cells_CD4_naive_CIBERSORT   0.00000000   0.00000000   0.00000000   0.00000000   0.00000000

# 检查数据类型是否正确转换为数值型
data_total <- data_total %>% mutate_all(as.numeric)

# 数据标准化-方法1
# stand_fun <- function(indata=NULL, halfwidth=NULL, centerFlag=T, scaleFlag=T) {  
#                   outdata=t(scale(t(indata), center=centerFlag, scale=scaleFlag))
#                   if (!is.null(halfwidth)) {
#                      outdata[outdata > halfwidth] = halfwidth
#                      outdata[outdata < (-halfwidth)] = -halfwidth
#   }
#   return(outdata)
# }
#data_total <- stand_fun(data_total,halfwidth = 2)

# 标准化-方法2
data_total <- scale(data_total)

# 重新对列进行排序
# group_list 是一个向量，表示每个样本的类型(PAH 或 control)
sorted_index <- order(group_list, decreasing = FALSE)
# 使用排序索引对 data_total 重新排序列
data_total <- data_total[, sorted_index]
# 重新对 group_list 进行排序，以匹配列的顺序
group_list <- group_list[sorted_index]

# 创建注释
# 列注释
annCol <- data.frame(Type = group_list,
                     row.names = colnames(data_total),
                     stringsAsFactors = FALSE)
# 行注释
# 从行名中提取方法
methods <- sub('.*_', '', rownames(data_total))
# 创建行注释数据框
annRow <- data.frame(Methods = factor(methods, levels = unique(methods)),
                     row.names = rownames(data_total),
                     stringsAsFactors = FALSE)


# 调整颜色梯度
breaksList = seq(-5, 5, by = 0.1)
colors <- colorRampPalette(c("#336699", "white", "tomato"))(length(breaksList))

# 绘制热图
pheatmap(data_total,
         annotation_col = annCol,
         annotation_row = annRow,
         color = colors,
         breaks = breaksList,
         cluster_rows = FALSE,
         cluster_cols = FALSE,
         show_rownames = TRUE,
         show_colnames = FALSE,
         gaps_col = cumsum(table(annCol$Type)),  # 使用排序后的列分割点
         gaps_row = cumsum(table(annRow$Methods)), # 行分割
         fontsize_row = 6,
         fontsize_col = 6,
         annotation_names_row = FALSE
         )

10.2 complexheatmap热图

library(ComplexHeatmap)
#整合数据-挑选了三个绝对丰度的结果
data_total <- cbind(ciber_res[,-c(24:26)],epic_res[,-1],quantiseq_res[,-1])
data_total <- as.data.frame(t(data_total))
colnames(data_total) <- data_total[1,]
data_total <- dplyr::slice(data_total,-1)
head(data_total)[1:5,1:5]

# 检查数据类型是否正确转换为数值型
data_total <- data_total %>% mutate_all(as.numeric)
# 标准化-方法3
# 范围限定在-3至3
data_total <- scale(data_total)
data_total[data_total > 3] <- 3
data_total[data_total + 3 < 0] <- -3

# 也可以用circlize创建颜色映射
# 定义颜色渐变
library(circlize)
color_fun <- colorRamp2(c(-3, 0, 3), c("#336699", "white", "tomato"))

# pd$type 和 pd$samples 是注释数据怎么上色
type_colors <- c("control" = "green", "PAH" = "red")
samples_colors <- c("all PAH" = "blue", "CTEPH patient" = "orange", "FD" = "purple",
                    "idiopathic PAH patient" = "brown", "normal control" = "yellow",
                    "patient with PAH and CHD" = "pink", "patient with PAH and CTD" = "magenta",
                    "pulmonary arterial hypertension (PAH) patient" = "cyan")

columnAnno <- HeatmapAnnotation(type = pd$group,
                                samples = pd$samples,
                                col = list(type = type_colors, samples = samples_colors))

ComplexHeatmap::Heatmap(data_total, 
                        na_col = "white",
                        col = color_fun,  # 添加颜色映射函数
                        show_column_names = F,
                        row_names_side = "right",
                        name = "fraction",
                        column_order = c(colnames(data_total)[c(grep("control",pd$group),
                                                                grep("PAH",pd$group))]),
                        column_split = pd$group, 
                        column_title = NULL,
                        cluster_columns = F,
                        top_annotation = columnAnno,
                        heatmap_width = unit(20, "cm"),  # 调整热图宽度
                        row_dend_width = unit(1, "cm"),  # 调整聚类树宽度
                        cluster_rows = T, # 行聚类
                        row_names_gp = gpar(fontsize = 8)  # 调整行名字体大小
                        #cluster_columns = FALSE # 列聚类。                     
                        )

总的来说，这个R包还是挺快捷方便的~ 里面的参数需要根据自己的数据进行调整。

参考资料：

1、Pan-cancer Immunogenomic Analyses Reveal Genotype-Immunophenotype Relationships and Predictors of Response to Checkpoint Blockade. Cell Rep. 2017 Jan 3;18(1):248-262.

2、IOBR： https://mp.weixin.qq.com/s/kb22gW6bOk_WSFt39bbyug https://iobr.github.io/book/

3、TCIA (The cancer immunome atlas)： https://tcia.at/tools/toolsMain

4、生信技能树： https://mp.weixin.qq.com/s/nMeDmyO4M09z7vGnYU2K-A

注：若对内容有疑惑或者有发现明确错误的朋友，请联系后台(欢迎交流)。更多内容可关注公众号：生信方舟

- END -