1. Reading the data
library(dplyr)
library(Seurat)
library(patchwork)
# Load the PBMC dataset
pbmc.data <- Read10X(data.dir = "../data/pbmc3k/filtered_gene_bc_matrices/hg19/")
# Initialize the Seurat object with the raw (non-normalized) data.
pbmc <- CreateSeuratObject(counts = pbmc.data, project = "pbmc3k", min.cells = 3, min.features = 200)
pbmc
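Read10X reads the data and returns a UMI count matrix. That count matrix is then used to create the Seurat object, and a first round of quality control can already be applied at this step: min.cells = n keeps only genes that are detected in at least n cells, and min.features = m keeps only cells in which at least m genes are detected.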
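The effect of the two thresholds can be sketched in base R on a toy matrix. This is only an illustration, not Seurat's actual code: the matrix values and the names min_features and min_cells are made up, and the order of the two filters (cells first, then genes) is an assumption about how CreateSeuratObject applies them.

```r
# Toy count matrix (hypothetical values): rows are genes, columns are cells.
counts <- matrix(
  c(1, 0, 0, 0,
    2, 3, 0, 1,
    0, 1, 0, 4,
    5, 2, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:4))
)

min_features <- 2  # a cell must express at least this many genes
min_cells    <- 3  # a gene must be detected in at least this many cells

# Step 1 (assumed order): drop cells that express too few genes.
genes_per_cell <- colSums(counts > 0)
counts_cells <- counts[, genes_per_cell >= min_features, drop = FALSE]

# Step 2: among the remaining cells, drop genes seen in too few cells.
cells_per_gene <- rowSums(counts_cells > 0)
filtered <- counts_cells[cells_per_gene >= min_cells, , drop = FALSE]

print(dim(filtered))
print(dimnames(filtered))
```

In this toy case cell3 is removed because it expresses no genes, and gene1 and gene3 are removed because they are detected in fewer than three of the remaining cells.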
2. Standard workflow
This step covers QC, data normalization, identification of highly variable genes, and scaling.
2.1 QC
Indicators of low-quality cells, for example the percentage of mitochondrial counts:
pbmc[["percent.mt"]] <- PercentageFeatureSet(pbmc, pattern = "^MT-")
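PercentageFeatureSet computes, for each cell, the percentage of counts that come from genes matching a given pattern. Here the pattern ^MT- selects the mitochondrial genes, and the result is stored in the metadata column percent.mt.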
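What this percentage means can be shown with a small base R sketch. It is an illustration of the calculation on made-up numbers, not the Seurat implementation; the toy matrix and the variable names are hypothetical.

```r
# Toy count matrix (hypothetical values): two mitochondrial genes, one other gene.
counts <- matrix(
  c(10,  0,
     5,  5,
    85, 15),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("MT-CO1", "MT-ND1", "ACTB"), c("cell1", "cell2"))
)

# Select the genes whose names match the pattern, as pattern = "^MT-" does.
mt_genes <- grepl("^MT-", rownames(counts))

# Per cell: counts from matching genes divided by total counts, times 100.
percent_mt <- colSums(counts[mt_genes, , drop = FALSE]) / colSums(counts) * 100
print(percent_mt)
```

For cell1 this gives 15 (15 of 100 counts are mitochondrial) and for cell2 it gives 25 (5 of 20 counts).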
VlnPlot(pbmc, features = c("nFeature_RNA", "nCount_RNA", "percent.mt"), ncol = 3)
We can also look at the linear relationship between pairs of features:
plot1 <- FeatureScatter(pbmc, feature1 = "nCount_RNA", feature2 = "percent.mt")