7. Onto integration with ArchR (2)

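This post describes how to perform onto integration in an ArchR project, specifically mapping MPAL-scATAC data onto Healthy-scATAC data. By extracting the PeakMatrix or GeneScoreMatrix and applying the calcLSI and projectLSI functions, the author explores how to identify Healthy-like cells and shares optimization strategies, including tuning the parameters of projectBulkATAC to improve accuracy. The author promises to give the complete implementation steps in a follow-up post.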

Wishing little Wu Yiqing (武艺晴) happiness every single day


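The previous chapter noted that the projectLSI function can perform onto projection. In this chapter I want to get it working end to end for mapping the MPAL-scATAC data onto the Healthy-scATAC data.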

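Notes:

"ArchRProject objects cannot be compared across individuals, and ArchRProjects cannot be merged" (from Rcorces of the Greenleaf team)

The Arrow files should therefore be created from the fragment files.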

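Outline of the strategy:

1) Extract the full PeakMatrix or GeneScoreMatrix to obtain an SE (SummarizedExperiment) object, then pull the sparse matrix out of it;

2) the calcLSI function

3) the projectLSI function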
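
The post does not show the definitions of calcLSI and projectLSI, so the following is only a generic sketch of the idea behind steps 2) and 3): TF-IDF plus SVD on the reference, then reuse of the reference IDF weights and left singular vectors to place query cells in the same space. The names `toy_lsi`, `toy_project`, and `nearest_group`, and the tiny matrices, are hypothetical and made up for illustration. It uses base `svd` rather than irlba and needs only the Matrix package.

```r
library(Matrix)

# Hypothetical helper (NOT the calcLSI used in this post): TF-IDF followed by SVD.
toy_lsi <- function(mat, n_components = 2) {
  mat@x[mat@x > 0] <- 1                                # binarize accessibility
  col_sums <- Matrix::colSums(mat)
  row_sums <- Matrix::rowSums(mat)
  stopifnot(all(col_sums > 0), all(row_sums > 0))      # empty cells/features must be filtered first
  idf <- unname(log(1 + ncol(mat) / row_sums))         # one weight per feature
  tf <- mat %*% Diagonal(x = 1 / col_sums)             # scale each cell (column) to sum to 1
  tfidf <- Diagonal(x = idf) %*% tf                    # scale each feature (row) by its IDF
  sv <- svd(as.matrix(tfidf), nu = n_components, nv = n_components)
  d <- sv$d[seq_len(n_components)]
  embedding <- sv$v %*% diag(d, n_components)          # cells x components
  rownames(embedding) <- colnames(mat)
  list(embedding = embedding, u = sv$u, d = d, idf = idf)
}

# Hypothetical helper (NOT the projectLSI used in this post): the query cells get
# their own term frequencies but the REFERENCE idf and left singular vectors.
toy_project <- function(mat, lsi) {
  mat@x[mat@x > 0] <- 1
  tf <- mat %*% Diagonal(x = 1 / Matrix::colSums(mat))
  tfidf <- Diagonal(x = lsi$idf) %*% tf
  out <- as.matrix(t(tfidf) %*% lsi$u)                 # query cells x components
  rownames(out) <- colnames(mat)
  out
}

# Label each query cell with the group of its nearest reference cell.
nearest_group <- function(query_embed, ref_embed, ref_groups) {
  apply(query_embed, 1, function(q) {
    d <- sqrt(colSums((t(ref_embed) - q)^2))
    ref_groups[which.min(d)]
  })
}

# Made-up reference: features f1-f3 mark group A, f4-f6 mark group B.
ref <- sparseMatrix(
  i = c(1, 2, 3,  1, 2,  1, 3,  4, 5, 6,  4, 5,  4, 6),
  j = c(1, 1, 1,  2, 2,  3, 3,  4, 4, 4,  5, 5,  6, 6),
  x = c(2, 1, 1,  1, 3,  1, 1,  1, 1, 2,  1, 1,  4, 1),
  dims = c(6, 6),
  dimnames = list(paste0("f", 1:6), c("A1", "A2", "A3", "B1", "B2", "B3"))
)
ref_groups <- c("A", "A", "A", "B", "B", "B")

# Made-up query: Q1 opens A features, Q2 opens B features.
query <- sparseMatrix(
  i = c(1, 2, 4, 6), j = c(1, 1, 2, 2), x = 1,
  dims = c(6, 2), dimnames = list(paste0("f", 1:6), c("Q1", "Q2"))
)

lsi <- toy_lsi(ref, n_components = 2)
query_embed <- toy_project(query, lsi)
calls <- nearest_group(query_embed, lsi$embedding, ref_groups)
print(calls)
```

In this toy, Q1 lands among the A cells and Q2 among the B cells. Projecting the reference itself through `toy_project` reproduces `lsi$embedding`, which is a quick sanity check that the projection and the original decomposition share one coordinate system.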

## The overall framework and first draft of the code:

> seDisease <- getMatrixFromProject(ArchRProj = projHeme2MPAL1,useMatrix = "GeneScoreMatrix")
ArchR logging to : ArchRLogs/ArchR-getMatrixFromProject-5b8e1992f907-Date-2024-05-09_Time-12-54-45.log
If there is an issue, please report to github with logFile!
2024-05-09 12:55:19 : Organizing colData, 0.557 mins elapsed.
2024-05-09 12:55:19 : Organizing rowData, 0.559 mins elapsed.
2024-05-09 12:55:19 : Organizing rowRanges, 0.56 mins elapsed.
2024-05-09 12:55:19 : Organizing Assays (1 of 1), 0.56 mins elapsed.
2024-05-09 12:55:20 : Constructing SummarizedExperiment, 0.58 mins elapsed.
2024-05-09 12:55:24 : Finished Matrix Creation, 0.636 mins elapsed.
> rownames(seDisease) <- rowData(seDisease)$name
> seReference <- getMatrixFromProject(ArchRProj = projHeme2_healthy,useMatrix = "GeneScoreMatrix")
> rownames(seReference) <- rowData(seReference)$name
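
Setting the row names does not by itself put the two matrices in the same row order, and `cbind` binds by position, not by name. A minimal sketch of one way to align the rows first, on made-up matrices; `align_rows` is a hypothetical helper and not part of ArchR:

```r
library(Matrix)

# Hypothetical helper: keep only the features present in both matrices,
# in the same order, so that a later cbind() lines the rows up correctly.
align_rows <- function(ref, query) {
  stopifnot(!anyDuplicated(rownames(ref)), !anyDuplicated(rownames(query)))
  shared <- intersect(rownames(ref), rownames(query))
  stopifnot(length(shared) > 0)
  list(ref = ref[shared, , drop = FALSE], query = query[shared, , drop = FALSE])
}

# Made-up matrices with the same number of rows but different genes and order.
ref_mat <- sparseMatrix(
  i = c(1, 2, 3, 1), j = c(1, 1, 2, 2), x = c(5, 1, 2, 3), dims = c(3, 2),
  dimnames = list(c("GeneA", "GeneB", "GeneC"), c("r1", "r2"))
)
query_mat <- sparseMatrix(
  i = c(1, 2, 3), j = c(1, 2, 2), x = c(7, 4, 9), dims = c(3, 2),
  dimnames = list(c("GeneC", "GeneA", "GeneD"), c("q1", "q2"))
)

aligned <- align_rows(ref_mat, query_mat)
combined <- cbind(aligned$ref, aligned$query)
print(combined)
```

A plain `cbind(ref_mat, query_mat)` would also run here, because both toy matrices have three rows, but it would silently pair GeneA with GeneC. With real data, `identical(rownames(a), rownames(b))` is a cheap check before binding.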
 
#Create Matrix
> mat <- cbind(assay(seReference), assay(seDisease)) # this implies that seReference and seDisease are SummarizedExperiment objects
> #Set Clustering Parameters
> nPCs1 <- 1:25
> nPCs2 <- 1:25
> resolution <- 0.8  # clustering resolution
> nTop <- 25000      # number of top variable features to keep
> head(mat)       # the rows (features) of the two matrices must match
> ##############  Run LSI 1st Iteration ####################
> lsi1 <- calcLSI(mat, nComponents = 50, binarize = TRUE, nFeatures = NULL) 
    # here mat must be a sparse count matrix (an S4 Matrix object)
Binarizing matrix...
Computing Term Frequency IDF...
Computing SVD using irlba...

> clust1 <- seuratSNN(lsi1[[1]], dims = nPCs1, resolution = resolution) 
Making Seurat Object...
Warning: Keys should be one or more alphanumeric characters followed by an underscore, setting key from PC to PC_
Warning: All keys should be one or more alphanumeric characters followed by an underscore '_', setting key to PC_
Computing nearest neighbor graph
Computing SNN
Warning: The following arguments are not used: dims
Warning: The following arguments are not used: dims
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 42501
Number of edges: 1083338

Running Louvain algorithm...
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 0.8663
Number of communities: 25
Elapsed time: 6 seconds
> #Make Pseudo Bulk Library
> message("Making PseudoBulk...")
Making PseudoBulk...
> mat <- mat[,rownames(lsi1[[1]]), drop = FALSE] #sometimes cells are filtered
> mat@x[mat@x > 0] <- 1  #binarize
> clusterSums <- groupSums(mat = mat, groups = clust1, sparse = TRUE) #Group Sums
> logMat <- edgeR::cpm(clusterSums, log = TRUE, prior.count = 3) #log CPM matrix
> varPeaks <- head(order(matrixStats::rowVars(logMat), decreasing = TRUE), nTop) #Top variable peaks
> #Run LSI 2nd Iteration
> lsi2 <- calcLSI(mat[varPeaks,,drop=FALSE], nComponents = 50, binarize = TRUE, nFeatures = NULL)
Binarizing matrix...
Computing Term Frequency IDF...
Computing SVD using irlba...
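
The pseudo-bulk and variable-feature steps between the two LSI rounds can be pictured on a toy matrix. This is a sketch under stated assumptions: `toy_group_sums` and `toy_log_cpm` are hypothetical stand-ins for the `groupSums` helper and `edgeR::cpm` used in the transcript, and the log-CPM formula is simplified, so its numbers will not match edgeR exactly.

```r
library(Matrix)

# Hypothetical stand-in for groupSums: sum binarized counts per cluster.
toy_group_sums <- function(mat, groups) {
  stopifnot(length(groups) == ncol(mat))
  lv <- unique(groups)
  out <- sapply(lv, function(g) Matrix::rowSums(mat[, groups == g, drop = FALSE]))
  rownames(out) <- rownames(mat)
  colnames(out) <- lv
  out
}

# Simplified log2 counts-per-million with a flat prior count (an assumption;
# edgeR::cpm scales the prior by library size).
toy_log_cpm <- function(counts, prior = 3) {
  lib <- colSums(counts)
  log2(t((t(counts) + prior) / (lib + 2 * prior)) * 1e6)
}

# Made-up binarized matrix: 4 features x 6 cells, two clusters of 3 cells.
toy_mat <- sparseMatrix(
  i = c(rep(1, 6), rep(2, 3), rep(3, 4), rep(4, 2)),
  j = c(1:6, 1:3, c(1, 4, 5, 6), c(3, 6)),
  x = 1, dims = c(4, 6),
  dimnames = list(c("stable", "c1_only", "c2_mostly", "weak"), paste0("cell", 1:6))
)
toy_clusters <- rep(c("Cluster1", "Cluster2"), each = 3)

cluster_sums <- toy_group_sums(toy_mat, toy_clusters)   # features x clusters
log_mat <- toy_log_cpm(cluster_sums)
row_var <- apply(log_mat, 1, var)                       # variance across clusters
top_idx <- head(order(row_var, decreasing = TRUE), 2)   # keep the 2 most variable
top_features <- rownames(log_mat)[top_idx]
print(top_features)
```

Features that differ between clusters ("c1_only", "c2_mostly") rank ahead of features that are uniformly open or uniformly sparse, which is why the second LSI round is run only on the top-variance rows.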
> clust2 <- seuratSNN(lsi2[[1]], dims.use = nPCs2, resolution = resolution)
Making Seurat Object...
Warning: Keys should be one or mor