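CellChat needs two inputs: a gene expression matrix and cell labels. The expression matrix has genes as rows and cells as columns, and must be normalized and log-transformed. The cell labels are a data frame whose row names are the cell names and whose first column holds the cell labels.
To extract the two CellChat inputs from a Seurat v3 object: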
data.input <- GetAssayData(seurat_object, assay = "RNA", slot = "data") # normalized data matrix
labels <- Idents(seurat_object)
meta <- data.frame(group = labels, row.names = names(labels)) # create a dataframe of the cell labels
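Before going further it can help to confirm that the two inputs line up. The sketch below uses toy stand-ins for data.input and meta; the values, the names toy_input and toy_meta, and the helper check_inputs are all hypothetical and not part of CellChat or Seurat. It only illustrates the layout described above: genes in rows, cells in columns, and one label per cell.

```r
# Toy stand-ins for data.input and meta (hypothetical values, for illustration only)
set.seed(1)
cells <- c("cell1", "cell2", "cell3", "cell4")
toy_input <- matrix(rexp(12), nrow = 3,
                    dimnames = list(c("GeneA", "GeneB", "GeneC"), cells))
toy_meta <- data.frame(group = c("FIB", "FIB", "DC", "TC"), row.names = cells)

# check_inputs is a hypothetical helper, not a CellChat function.
# It returns TRUE when every column (cell) of the matrix has a label
# in the first column of meta, in the same order.
check_inputs <- function(data.input, meta) {
  stopifnot(!is.null(colnames(data.input)), !is.null(rownames(meta)))
  identical(colnames(data.input), rownames(meta)) && !anyNA(meta[[1]])
}

check_inputs(toy_input, toy_meta)
```

The same check could be run on the real data.input and meta once they have been extracted.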
library(CellChat)
library(patchwork)
options(stringsAsFactors = FALSE)
- Part I: Data input and preprocessing, initialization of the CellChat object
(1) Load data
# Here we load a scRNA-seq data matrix and its associated cell meta data
#load(url("https://ndownloader.figshare.com/files/25950872"))
# This is a combined dataset from two biological conditions: normal and diseased
load("/Users/suoqinjin/Documents/CellChat/tutorial/data_humanSkin_CellChat.rda")
data.input = data_humanSkin$data # normalized data matrix
meta = data_humanSkin$meta # a data frame of cell metadata, with cell names as row names
cell.use = rownames(meta)[meta$condition == "LS"] # extract the cell names from disease data
# Prepare input data for CellChat analysis
data.input = data.input[, cell.use]
meta = meta[cell.use, ]
# meta = data.frame(labels = meta$labels[cell.use], row.names = colnames(data.input)) # manually create a dataframe consisting of the cell labels
unique(meta$labels) # check the cell labels
#> [1] Inflam. FIB FBN1+ FIB APOE+ FIB COL11A1+ FIB cDC2
#> [6] LC Inflam. DC cDC1 CD40LG+ TC Inflam. TC
#> [11] TC NKT
#> 12 Levels: APOE+ FIB FBN1+ FIB COL11A1+ FIB Inflam. FIB cDC1 cDC2 ... NKT
(2) Create a CellChat object
cellchat <- createCellChat(object = data.input, meta = meta, group.by = "labels")
(3) Set the ligand-receptor interaction database
CellChatDB <- CellChatDB.human # use CellChatDB.mouse if running on mouse data
showDatabaseCategory(CellChatDB)
# Show the structure of the database
dplyr::glimpse(CellChatDB$interaction)
# use a subset of CellChatDB for cell-cell communication analysis
CellChatDB.use <- subsetDB(CellChatDB, search = "Secreted Signaling") # use Secreted Signaling
# use all CellChatDB for cell-cell communication analysis
# CellChatDB.use <- CellChatDB # simply use the default CellChatDB
# set the used database in the object
cellchat@DB <- CellChatDB.use
(4) Preprocess the gene expression data
# subset the expression data of signaling genes for saving computation cost
cellchat <- subsetData(cellchat) # This step is necessary even if using the whole database
future::plan("multisession", workers = 4) # run in parallel; the "multiprocess" strategy is deprecated in recent versions of the future package
cellchat <- identifyOverExpressedGenes(cellchat)
cellchat <- identifyOverExpressedInteractions(cellchat)
# project gene expression data onto the PPI network (optional)
# if the projected data should be used, set raw.use = FALSE in computeCommunProb()
cellchat <- projectData(cellchat, PPI.human)
- Part II: Inference of cell-cell communication network
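The core function computeCommunProb lets you choose how the average expression per cell group is computed, which acts as a threshold. The default is triMean (roughly a 25% truncated mean); you can also use 10%, 5%, and so on. What does this threshold mean? When the fraction of cells in a cell group that express a ligand or receptor is below this value, the group's average expression of that gene is zero, so the pair does not appear in the results. Lowering the threshold detects more communication but also lets in more noise. To adjust it, set type = "truncatedMean", trim = 0.1.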
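To make the threshold concrete, here is a small base-R sketch on a toy group of 20 cells in which only 4 cells (20%) express a gene. It does not call CellChat. The helper tri_mean is a hypothetical reimplementation of Tukey's trimean, (Q1 + 2 * median + Q3) / 4, and the truncated mean is base R's mean(x, trim = ...); treating these as equivalent to what computeCommunProb does internally is an assumption.

```r
# Toy expression values for one gene in one cell group (hypothetical data):
# 16 cells do not express the gene, 4 cells (20%) express it at level 2
expr <- c(rep(0, 16), rep(2, 4))

# Hypothetical helper: Tukey's trimean, (Q1 + 2 * median + Q3) / 4
tri_mean <- function(x) {
  mean(stats::quantile(x, probs = c(0.25, 0.50, 0.50, 0.75), names = FALSE))
}

avg_trimean <- tri_mean(expr)          # 0: fewer than 25% of cells express the gene
avg_trim25  <- mean(expr, trim = 0.25) # 0: same outcome as the trimean on this data
avg_trim10  <- mean(expr, trim = 0.10) # 0.25: the gene survives a 10% threshold

print(c(trimean = avg_trimean, trim25 = avg_trim25, trim10 = avg_trim10))
```

With 20% of cells expressing the gene, the group average is zero under the trimean and the 25% truncated mean but non-zero with trim = 0.1, which is why a lower trim value reports more interactions.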
(1) Compute the communication probability and infer the cellular communication network
cellchat <- computeCommunProb(cellchat)
# Filter out cell-cell communication involving cell groups that contain only a few cells
cellchat <- filterCommunication(cellchat, min.cells = 10)
(2) Extract the inferred cellular communication network as a data frame