Converting between Seurat objects and Scanpy AnnData (h5ad) files

生信小博士

Last modified: 2023-10-13 22:09:12

First published: 2023-10-04 01:43:47

Article link: https://blog.csdn.net/qq_52813185/article/details/133534209


References:

Conversions: h5Seurat and AnnData • SeuratDisk (mojaveazure.github.io): https://mojaveazure.github.io/seurat-disk/articles/convert-anndata.html

Convert an on-disk single-cell dataset to another format (Convert) • SeuratDisk (mojaveazure.github.io): https://mojaveazure.github.io/seurat-disk/reference/Convert.html

The SeuratDisk vignette uses the pbmc3k.final dataset from SeuratData as its Seurat example. To see how this dataset was generated, run ?pbmc3k.final in R.

Converting a Seurat object to an AnnData file is a two-step process. First, save the Seurat object as an h5Seurat file with SaveH5Seurat (the SeuratDisk vignette on saving and loading h5Seurat files has more details). Then convert the h5Seurat file to an AnnData (.h5ad) file with Convert, for use in Scanpy. Full details about the conversion process are listed in the manual page for the Convert function.

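The same vignette also covers the reverse direction, starting from an example AnnData file. To see how that dataset was created, see the script linked from the vignette.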

Converting an AnnData file to a Seurat object is also a two-step process. First, convert the AnnData file to an h5Seurat file using the Convert function; full details about the conversion process are listed in its manual page. Then load the h5Seurat file into a Seurat object with LoadH5Seurat; the SeuratDisk vignette on saving and loading h5Seurat files has more details.


# Alternative without SeuratDisk. In R, export the data in a form Scanpy can read:
# gene info, cell info and counts (slot names below follow Seurat v3/v4)
library(Seurat)

# gene-level metadata
geneinfo <- All.merge@assays$RNA@meta.features
write.csv(geneinfo, file = "/home/data/t040413/silicosis/geneinfo.csv")

# cell-level metadata
cell_info <- All.merge@meta.data
write.csv(cell_info, file = "/home/data/t040413/silicosis/cell_info.csv")

# raw counts as a sparse matrix (genes x cells); no dense as.matrix() step is needed
counts_sparse <- Matrix::Matrix(GetAssayData(All.merge, slot = "counts"), sparse = TRUE)
# save the sparse matrix in Matrix Market (MM) format
Matrix::writeMM(counts_sparse, file = "/home/data/t040413/silicosis/counts_sparse.mtx")
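The R export writes the count matrix as genes x cells, while AnnData expects cells x genes, which is why the Python side has to transpose the matrix after loading it. Here is a minimal sketch of that orientation check, using scipy.io instead of Scanpy so that it stays small. The toy matrix and the gene and cell names are hypothetical.

```python
import os
import tempfile

import numpy as np
from scipy import io, sparse

# Hypothetical toy data: 3 genes (rows) x 2 cells (columns), the layout Seurat uses.
genes = ["Gene1", "Gene2", "Gene3"]
cells = ["cell1", "cell2"]
counts_genes_by_cells = sparse.csc_matrix(np.array([[1, 0], [0, 2], [3, 4]]))

# Write and re-read a Matrix Market file, as the R export / Python import pair does.
path = os.path.join(tempfile.mkdtemp(), "counts_sparse.mtx")
io.mmwrite(path, counts_genes_by_cells)
loaded = io.mmread(path).tocsr()      # still genes x cells

# Transpose so that rows are cells and columns are genes.
cells_by_genes = loaded.T.tocsr()

# The shape must now match (number of cells, number of genes).
print(loaded.shape, "->", cells_by_genes.shape)  # (3, 2) -> (2, 3)
```

If the shape after transposing does not match the number of rows in the cell and gene tables, the CSV files and the matrix most likely come from different objects or were filtered differently.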



import pandas as pd
import scanpy as sc
cellinfo = pd.read_csv("./cell_info.csv",index_col=0)
geneinfo = pd.read_csv("./geneinfo.csv",index_col=0)

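# read the Matrix Market file written by R (genes x cells)
adata_ref = sc.read_mtx("./counts_sparse.mtx")
# very important: transpose to cells x genes, the orientation AnnData expects
adata_ref = adata_ref.T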


adata_ref = sc.AnnData(adata_ref.X,obs=cellinfo,var = geneinfo)

adata_ref.var['SYMBOL'] = adata_ref.var.index

# find mitochondria-encoded (MT) genes
adata_ref.var['MT_gene'] = [gene.startswith('MT-') for gene in adata_ref.var['SYMBOL']]
adata_ref.var['mt_gene'] = [gene.startswith('mt-') for gene in adata_ref.var['SYMBOL']]

adata_ref.var.groupby('MT_gene').count()
adata_ref.var.groupby('mt_gene').count()


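# remove MT genes for spatial mapping (keeping their counts in the object)
# only the lowercase 'mt-' flag (mouse-style symbols) is used here; use 'MT_gene' for human data
adata_ref.obsm['mt'] = adata_ref[:, adata_ref.var['mt_gene'].values].X.toarray()
# .copy() turns the subset view into an independent AnnData object
adata_ref = adata_ref[:, ~adata_ref.var['mt_gene'].values].copy()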

Convert a Seurat object to AnnData with SeuratDisk (example: snATAC-seq gene scores from ArchR)

library(Seurat, quietly = TRUE)
library(SeuratData, quietly = TRUE)
library(SeuratDisk, quietly = TRUE)
library(dplyr, quietly = TRUE)
library(ArchR, quietly = TRUE)

## load the ArchR project
proj <- loadArchRProject(path = "../snATAC/DataIntegration/data/VisiumHeart", showLogo = FALSE)

## extract the gene score matrix, which will serve as the count matrix of the Seurat object
geneMatrix <- getMatrixFromProject(proj, useMatrix = "GeneScoreMatrix")
GeneScoreMatrix <- geneMatrix@assays@data$GeneScoreMatrix
rownames(GeneScoreMatrix) <- geneMatrix@elementMetadata$name

# load Seurat object
obj <- readRDS("../snATAC/DataIntegration/data/VisiumHeart/snATAC.annotated.Rds")

meta.data <- obj@meta.data
head(meta.data)

meta.data <- meta.data[, c("Sample", "cell_type")]

meta.data$cell_type <- as.character(meta.data$cell_type)

counts <- GeneScoreMatrix[, rownames(meta.data)]

dim(counts)

obj.atac <- CreateSeuratObject(counts = counts,
                               meta.data = meta.data,
                               assay = "RNA",
                               names.delim = "-") %>%
    NormalizeData()

head(obj.atac@meta.data)

umap_embedding <- Embeddings(obj, reduction = "umap_harmony_v2")
rownames(umap_embedding) <- colnames(obj.atac)
colnames(umap_embedding) <- c("UMAP_1", "UMAP_2")
head(umap_embedding)

obj.atac[["umap"]] <- CreateDimReducObject(embeddings = umap_embedding, key = "UMAP_", assay = DefaultAssay(obj.atac))

DimPlot(obj.atac, reduction = "umap", pt.size = 0.5, group.by = "cell_type", label = TRUE)

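# step 1: save the Seurat object as an h5Seurat file
SaveH5Seurat(obj.atac, filename = "snATAC-seq.h5Seurat")

# step 2: convert the h5Seurat file to AnnData; this writes snATAC-seq.h5ad
Convert("snATAC-seq.h5Seurat", dest = "h5ad")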