May little 武艺晴 be happy every single day.
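The previous chapter mentioned that the projectLSI function can project new cells into an existing LSI space. In this chapter I want to get that workflow running end to end when mapping the MPAL scATAC data onto the healthy scATAC data.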
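Caveats:
"ArchRProject objects cannot be compared across individuals, and ArchRProjects cannot be merged" (from Rcorces of the Greenleaf lab).
Therefore, the Arrow files should be built from the fragments files.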
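Outline of the strategy:
1) Extract the full PeakMatrix or GeneScoreMatrix to get a SummarizedExperiment (SE) object, then take the sparse matrix out of it;
2) the calcLSI function;
3) the projectLSI function.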
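To make steps 2) and 3) more concrete, here is a minimal sketch of the idea as I understand it: TF-IDF followed by SVD on the reference, then projecting new cells with the reference IDF and the left singular vectors. The helpers `toyLSI` and `toyProject` are hypothetical names invented for this illustration; they are not the calcLSI and projectLSI functions used below, and they use a small dense matrix with base R `svd()` as a stand-in for a sparse matrix with irlba.

```r
# Toy sketch only: toyLSI() and toyProject() are hypothetical helpers,
# not the calcLSI()/projectLSI() functions used in this post.
toyLSI <- function(mat, nComponents = 2) {
  mat <- (mat > 0) * 1                        # binarize
  rowSm <- rowSums(mat)                       # number of cells in which each feature is seen
  tf <- t(t(mat) / colSums(mat))              # term frequency within each cell (column)
  idf <- log(1 + ncol(mat) / rowSm)           # inverse document frequency per feature
  tfidf <- tf * idf                           # recycling scales row i by idf[i]
  s <- svd(tfidf, nu = nComponents, nv = nComponents)
  d <- s$d[seq_len(nComponents)]
  list(matSVD = s$v %*% diag(d, nComponents), # cells x components embedding
       u = s$u, d = d, rowSm = rowSm, nCol = ncol(mat))
}

toyProject <- function(mat, lsi) {
  mat <- (mat > 0) * 1
  tf <- t(t(mat) / colSums(mat))
  idf <- log(1 + lsi$nCol / lsi$rowSm)        # reuse the reference IDF, do not recompute it
  t(tf * idf) %*% lsi$u                       # equals V %*% D with V = t(tfidf) %*% U %*% D^-1
}

# Toy data: 6 features x 5 reference cells, no empty rows or columns
ref <- matrix(c(1, 0, 1, 0, 1,
                1, 1, 0, 0, 1,
                0, 1, 1, 1, 0,
                1, 0, 0, 1, 1,
                0, 1, 1, 0, 1,
                1, 1, 0, 1, 0), nrow = 6, byrow = TRUE)
query <- ref[, c(2, 4), drop = FALSE]         # pretend these are the cells to project

lsi <- toyLSI(ref, nComponents = 2)
proj <- toyProject(query, lsi)                # query cells placed in the reference LSI space
```

A quick sanity check for this kind of setup: projecting the reference cells themselves should reproduce the reference embedding, because t(tfidf) %*% U equals V %*% D for the same decomposition.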
## The overall framework and a first draft of the code:
> seDisease <- getMatrixFromProject(ArchRProj = projHeme2MPAL1, useMatrix = "GeneScoreMatrix")
ArchR logging to : ArchRLogs/ArchR-getMatrixFromProject-5b8e1992f907-Date-2024-05-09_Time-12-54-45.log
If there is an issue, please report to github with logFile!
2024-05-09 12:55:19 : Organizing colData, 0.557 mins elapsed.
2024-05-09 12:55:19 : Organizing rowData, 0.559 mins elapsed.
2024-05-09 12:55:19 : Organizing rowRanges, 0.56 mins elapsed.
2024-05-09 12:55:19 : Organizing Assays (1 of 1), 0.56 mins elapsed.
2024-05-09 12:55:20 : Constructing SummarizedExperiment, 0.58 mins elapsed.
2024-05-09 12:55:24 : Finished Matrix Creation, 0.636 mins elapsed.
> rownames(seDisease) <- rowData(seDisease)$name
> seReference <- getMatrixFromProject(ArchRProj = projHeme2_healthy, useMatrix = "GeneScoreMatrix")
> rownames(seReference) <- rowData(seReference)$name
> #Create Matrix
> mat <- cbind(assay(seReference), assay(seDisease)) # the use of assay() implies that seReference and seDisease are SummarizedExperiment objects
> #Set Clustering Parameters
> nPCs1 <- 1:25
> nPCs2 <- 1:25
> resolution <- 0.8 # clustering resolution
> nTop <- 25000 # number of top variable features to keep
> head(mat) # the rows (features) of the reference and disease matrices must match
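Since cbind only makes sense when both matrices have the same features in the same order, it may be worth checking this explicitly rather than only eyeballing head(mat). Below is a small sketch on toy sparse matrices, assuming the Matrix package is available; `refMat`, `disMat`, `shared` and `matToy` are made-up names standing in for assay(seReference), assay(seDisease) and the combined matrix.

```r
library(Matrix)

# Toy stand-ins for assay(seReference) and assay(seDisease); all names are hypothetical
refMat <- sparseMatrix(i = c(1, 2, 3), j = c(1, 2, 1), x = c(1, 2, 3),
                       dims = c(3, 2),
                       dimnames = list(c("geneA", "geneB", "geneC"), c("ref1", "ref2")))
disMat <- sparseMatrix(i = c(1, 2, 3), j = c(1, 1, 2), x = c(4, 5, 6),
                       dims = c(3, 2),
                       dimnames = list(c("geneC", "geneA", "geneD"), c("dis1", "dis2")))

# Duplicated feature names would make name-based subsetting ambiguous
stopifnot(!anyDuplicated(rownames(refMat)), !anyDuplicated(rownames(disMat)))

# Keep only the shared features, in one common order, before combining
shared <- intersect(rownames(refMat), rownames(disMat))
matToy <- cbind(refMat[shared, , drop = FALSE], disMat[shared, , drop = FALSE])

# After this step the row order is guaranteed to agree
stopifnot(identical(rownames(matToy), shared))
```

If the two projects were built with the same gene annotation, `shared` should simply equal the full row names and nothing is lost.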
> ############## Run LSI 1st Iteration ####################
> lsi1 <- calcLSI(mat, nComponents = 50, binarize = TRUE, nFeatures = NULL) # mat must be a sparse count matrix (an S4 object)
Binarizing matrix...
Computing Term Frequency IDF...
Computing SVD using irlba...
> clust1 <- seuratSNN(lsi1[[1]], dims = nPCs1, resolution = resolution) # note: the "arguments are not used: dims" warnings below suggest this argument is ignored; the 2nd iteration passes dims.use instead
Making Seurat Object...
Warning: Keys should be one or more alphanumeric characters followed by an underscore, setting key from PC to PC_
Warning: All keys should be one or more alphanumeric characters followed by an underscore '_', setting key to PC_
Computing nearest neighbor graph
Computing SNN
Warning: The following arguments are not used: dims
Warning: The following arguments are not used: dims
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
Number of nodes: 42501
Number of edges: 1083338
Running Louvain algorithm...
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 0.8663
Number of communities: 25
Elapsed time: 6 seconds
> #Make Pseudo Bulk Library
> message("Making PseudoBulk...")
Making PseudoBulk...
> mat <- mat[,rownames(lsi1[[1]]), drop = FALSE] #sometimes cells are filtered
> mat@x[mat@x > 0] <- 1 #binarize
> clusterSums <- groupSums(mat = mat, groups = clust1, sparse = TRUE) #Group Sums
> logMat <- edgeR::cpm(clusterSums, log = TRUE, prior.count = 3) #log CPM matrix
> varPeaks <- head(order(matrixStats::rowVars(logMat), decreasing = TRUE), nTop) #Top variable peaks
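To see what the pseudo-bulk step is doing, here is a toy sketch, assuming the Matrix package is available. `toyGroupSums` is a hypothetical helper that sums cells per cluster through a cells-by-clusters indicator matrix; it is not the groupSums function used above. The plain log2(CPM + 1) is only a simple stand-in for edgeR::cpm, and `apply(..., 1, var)` stands in for matrixStats::rowVars.

```r
library(Matrix)

# Hypothetical helper: sum the columns (cells) of a sparse matrix per cluster
toyGroupSums <- function(mat, groups) {
  groups <- factor(groups)
  # cells x clusters indicator matrix with a single 1 per cell
  ind <- sparseMatrix(i = seq_along(groups), j = as.integer(groups),
                      x = rep(1, length(groups)),
                      dims = c(length(groups), nlevels(groups)),
                      dimnames = list(colnames(mat), levels(groups)))
  mat %*% ind                                  # features x clusters
}

# Toy binarized matrix: 3 features x 4 cells
counts <- Matrix(c(1, 0, 1, 1,
                   0, 1, 0, 1,
                   1, 1, 1, 1), nrow = 3, byrow = TRUE, sparse = TRUE,
                 dimnames = list(c("f1", "f2", "f3"), paste0("cell", 1:4)))
clusters <- c("a", "b", "a", "b")

sums <- as.matrix(toyGroupSums(counts, clusters))

# Simple log2(CPM + 1), a stand-in for edgeR::cpm(log = TRUE, prior.count = 3)
logCpm <- log2(t(t(sums) / colSums(sums)) * 1e6 + 1)

# Rank features by variance across clusters and keep the top 2
topVar <- head(order(apply(logCpm, 1, var), decreasing = TRUE), 2)
rownames(sums)[topVar]
```

The point of ranking on cluster-level sums rather than single cells is that the variance then reflects differences between clusters instead of per-cell sparsity.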
> #Run LSI 2nd Iteration
> lsi2 <- calcLSI(mat[varPeaks,,drop=FALSE], nComponents = 50, binarize = TRUE, nFeatures = NULL)
Binarizing matrix...
Computing Term Frequency IDF...
Computing SVD using irlba...
> clust2 <- seuratSNN(lsi2[[1]], dims.use = nPCs2, resolution = resolution)
Making Seurat Object...
Warning: Keys should be one or mor