How to understand AnnData?

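anndata is a Python package for handling annotated data matrices in memory and on disk, positioned between pandas and xarray. anndata offers a broad range of computationally efficient features, including sparse data support, lazy operations, and a PyTorch interface.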

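The main parts of an AnnData object:

1. What is obs?

obs holds the annotations of the rows (that is, the cells). For example, Xiao Ming's annotation is A, Xiao Hong's annotation is B, and so on. The whole obs is a pandas DataFrame with one row per cell.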

Given a cell_type column in obs (randomly generated in the tutorial), we can subset the AnnData by it:
bdata = adata[adata.obs.cell_type == "B"]
bdata

2. What is var?

var corresponds to the columns, which are the genes; it holds the annotations of the genes.

3. What is uns?

Unstructured metadata

AnnData has .uns, which allows for any unstructured metadata. This can be anything, like a list or a dictionary with some general information that was useful in the analysis of our data.

4. What is obsm?

Observation/variable-level matrices

We might also have metadata at either level that has many dimensions to it, such as a UMAP embedding of the data. For this type of metadata, AnnData has the .obsm/.varm attributes. We use keys to identify the different matrices we insert. The restriction on .obsm/.varm is that .obsm matrices must have a length equal to the number of observations (.n_obs) and .varm matrices must have a length equal to .n_vars. They can each independently have a different number of dimensions.

Let’s start with a randomly generated matrix that we can interpret as a UMAP embedding of the data we’d like to store, as well as some random gene-level metadata:

adata.obsm["X_umap"] = np.random.normal(0, 1, size=(adata.n_obs, 2))

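After UMAP dimensionality reduction, each cell has a position in the embedding space. This position is 2-dimensional, so X_umap is a piece of per-cell metadata with two columns. adata can also hold another piece of metadata, for example a 3-dimensional embedding, so that each cell gets a 3-dimensional annotation. That would be stored under a new key in obsm (for example "X_umap_3d"), not in obsp.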

adata

5. What is varm?

Multi-dimensional metadata for the genes; it is the gene-level counterpart of obsm.

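6. What is obsp?

obsp is related to obsm in that both hold matrix-valued metadata about the cells, but it is not simply obsm under another name. obsp stores pairwise relationships between observations, so every entry is a square matrix of shape (n_obs, n_obs), for example a cell-cell neighbor graph.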

However, obsm comes with a restriction:

The restriction on .obsm/.varm is that .obsm matrices must have a length equal to the number of observations (.n_obs) and .varm matrices must have a length equal to .n_vars. They can each independently have a different number of dimensions.

As shown in the figure:

7. Layers
Finally, we may have different forms of our original core data, perhaps one that is normalized and one that is not. These can be stored in different layers in AnnData. For example, let’s log transform the original data and store it in a layer:

-------------

Here is the tutorial for learning this data structure:

https://anndata-tutorials.readthedocs.io/en/latest/getting-started.html

-------------------

February 3, 2023, 01:35:25

1. obs holds the annotations of the rows (neurons or cells);
2. var holds the annotations of the columns (time points or genes);
