Single-cell | MEBOCOST: metabolite-based cell-cell communication prediction (Part 1)

import os,sys
import scanpy as sc
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns
from mebocost import mebocost

1. Create the mebocost object

adata = sc.read_h5ad('data/demo/raw_scRNA/demo_HNSC_200cell.h5ad')
## check adata (cells, genes)
print(adata.shape)
## initiate the mebocost object
### import expression data by scanpy adata object
mebo_obj = mebocost.create_obj(
                        adata = adata,
                        group_col = ['celltype'],
                        met_est = 'mebocost',
                        config_path = './mebocost.conf',
                        exp_mat=None,
                        cell_ann=None,
                        species='human',
                        met_pred=None,
                        met_enzyme=None,
                        met_sensor=None,
                        met_ann=None,
                        scFEA_ann=None,
                        compass_met_ann=None,
                        compass_rxn_ann=None,
                        cutoff_exp='auto',
                        cutoff_met='auto',
                        cutoff_prop=0.25,
                        sensor_type=['Receptor', 'Transporter', 'Nuclear Receptor'],
                        thread=8
                        )

2. Infer metabolic communication

## metabolic communication inference, this step takes a while
commu_res = mebo_obj.infer_commu(
                                n_shuffle=1000,
                                seed=12345, 
                                Return=True, 
                                thread=None,
                                save_permuation=False,
                                min_cell_number = 1,
                                pval_method='permutation_test_fdr',
                                pval_cutoff=0.05
                            )
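The setting `pval_method='permutation_test_fdr'` suggests a permutation test followed by FDR correction. As a rough illustration of that idea, and not MEBOCOST's actual implementation, the sketch below computes an empirical p-value against a shuffled null distribution and then applies a Benjamini-Hochberg adjustment. The function names `permutation_pvalue` and `bh_fdr`, the +1 pseudocount, the Gaussian stand-in null, and the choice of Benjamini-Hochberg are all assumptions made for this example.

```python
import random

def permutation_pvalue(observed, null_scores):
    """Empirical one-sided p-value: share of null scores >= observed (+1 pseudocount)."""
    n_extreme = sum(1 for s in null_scores if s >= observed)
    return (n_extreme + 1) / (len(null_scores) + 1)

def bh_fdr(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

rng = random.Random(12345)
# Stand-in for the scores obtained from n_shuffle=1000 label shuffles (toy data).
null = [rng.gauss(0.0, 1.0) for _ in range(1000)]
observed_scores = [4.0, 0.1, 2.5]  # hypothetical communication scores
pvals = [permutation_pvalue(s, null) for s in observed_scores]
fdr = bh_fdr(pvals)
```

In this toy setup a score far out in the tail of the null (4.0) gets a much smaller adjusted p-value than one near its center (0.1), which is the behaviour that `pval_cutoff=0.05` then thresholds.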

3. Visualization

a. Number of sender and receiver events for each cell type

## sender and receiver event number
mebo_obj.eventnum_bar(
                    sender_focus=[],
                    metabolite_focus=[],
                    sensor_focus=[],
                    receiver_focus=[],
                    xorder=[],
                    and_or='and',
                    pval_method='permutation_test_fdr',
                    pval_cutoff=0.05,
                    comm_score_col='Commu_Score',
                    comm_score_cutoff = 0,
                    cutoff_prop = 0.25,
                    figsize='auto',
                    save=None,
                    show_plot=True,
                    show_num = True,
                    include=['sender-receiver'],
                    group_by_cell=True,
                    colorcmap='tab20',
                    return_fig=False
                )

b. Communication between cell types

## circle plot to show communications between cell groups
mebo_obj.commu_network_plot(
                    sender_focus=[],
                    metabolite_focus=[],
                    sensor_focus=[],
                    receiver_focus=[],
                    and_or='and',
                    pval_method='permutation_test_fdr',
                    pval_cutoff=0.05,
                    node_cmap='tab20',
                    figsize='auto',
                    line_cmap='bwr',
                    line_color_vmin=None,
                    line_color_vmax=None,
                    linewidth_norm=(0.2, 1),
                    linewidth_value_range = None,
                    node_size_norm=(50, 200),
                    node_value_range = None,
                    adjust_text_pos_node=True,
                    node_text_hidden = False,
                    node_text_font=10,
                    save=None,
                    show_plot=True,
                    comm_score_col='Commu_Score',
                    comm_score_cutoff=0,
                    text_outline=True,
                    return_fig=False
                )

### the "overall score" represents the sum of -log10(FDR) over the detected metabolite-sensor communications between a pair of cell types
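The definition in the comment above can be reproduced by hand on a small table. The following is a minimal pandas sketch under stated assumptions: the `Sender` and `Receiver` column names are assumed (only `permutation_test_fdr` appears in the code of this post), the cell types other than `Malignant` and all numeric values are invented, and the `floor` argument is a hypothetical guard against FDR values of exactly zero.

```python
import numpy as np
import pandas as pd

# Toy stand-in for mebo_obj.commu_res; all values are made up for illustration.
toy = pd.DataFrame({
    'Sender': ['CD8T', 'CD8T', 'Fibroblast', 'Fibroblast'],
    'Receiver': ['Malignant', 'Malignant', 'Malignant', 'CD8T'],
    'permutation_test_fdr': [0.01, 0.001, 0.2, 0.05],
})

def overall_score(df, fdr_col='permutation_test_fdr', cutoff=0.05, floor=1e-10):
    """Sum of -log10(FDR) over significant events for each sender-receiver pair."""
    sig = df[df[fdr_col] <= cutoff].copy()
    # Clip at a small floor so that an FDR of 0 does not produce infinity.
    sig['neg_log10_fdr'] = -np.log10(sig[fdr_col].clip(lower=floor))
    return sig.groupby(['Sender', 'Receiver'])['neg_log10_fdr'].sum()

scores = overall_score(toy)
```

Here the two significant CD8T to Malignant events contribute 2 and 3, so that pair scores 5, while the Fibroblast to Malignant event (FDR 0.2) is filtered out before summing.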

### dot plot to show the number of communications between cells

mebo_obj.count_dot_plot(
                        pval_method='permutation_test_fdr',
                        pval_cutoff=0.05,
                        cmap='bwr',
                        figsize='auto',
                        save=None,
                        dot_size_norm =(20, 200),
                        dot_value_range = None,
                        dot_color_vmin=None,
                        dot_color_vmax=None,
                        show_plot=True,
                        comm_score_col='Commu_Score',
                        comm_score_cutoff=0,
                        dendrogram_cluster=True,
                        sender_order=[],
                        receiver_order=[],
                        return_fig = False
                    )

c. Detailed communications (sender-receiver vs. metabolite-sensor). Set receiver_focus/sensor_focus to restrict the plot to specific cell types or sensors.

## Focus on Malignant cells as receivers; use receiver_focus=[] to include all cell types
mebo_obj.commu_dotmap(
                sender_focus=[],
                metabolite_focus=[],
                sensor_focus=[],
                receiver_focus=['Malignant'],
                and_or='and',
                pval_method='permutation_test_fdr',
                pval_cutoff=0.05,
                figsize='auto',
                cmap='bwr',
                cmap_vmin = None,
                cmap_vmax = None,
                cellpair_order=[],
                met_sensor_order=[],
                dot_size_norm=(10, 150),
                save=None,
                show_plot=True,
                comm_score_col='Commu_Score',
                comm_score_range = None,
                comm_score_cutoff=0,
                swap_axis = False,
                return_fig = False
                )

 

d. Communication flow visualization

## Focus on Malignant cells as receivers; use receiver_focus=[] to include all cell types
mebo_obj.FlowPlot(
                pval_method='permutation_test_fdr',
                pval_cutoff=0.05,
                sender_focus=[],
                metabolite_focus=[],
                sensor_focus=[],
                receiver_focus=['Malignant'],
                remove_unrelevant = False,
                and_or='and',
                node_label_size=8,
                node_alpha=0.6,
                figsize='auto',
                node_cmap='Set1',
                line_cmap='bwr',
                line_cmap_vmin = None,
                line_cmap_vmax = 15.5,
                node_size_norm=(20, 150),
                node_value_range = None,
                linewidth_norm=(0.5, 5),
                linewidth_value_range = None,
                save='test.pdf',
                show_plot=True,
                comm_score_col='Commu_Score',
                comm_score_cutoff=0,
                text_outline=False,
                return_fig = False
            )

e. Metabolite levels across cell groups

## violin plot to show the estimated abundance of informative metabolites in communication
### here we show five significant metabolites;
### users can pass any metabolites of interest by providing a list
commu_df = mebo_obj.commu_res.copy()
good_met = commu_df[(commu_df['permutation_test_fdr']<=0.05)]['Metabolite_Name'].sort_values().unique()

mebo_obj.violin_plot(
                    sensor_or_met=good_met[:5], ## first 5 names (alphabetical order) as an example
                    cell_focus=[],
                    cell_order = [],
                    row_zscore = False,
                    cmap=None,
                    vmin=None,
                    vmax=None,
                    figsize='auto',
                    cbar_title='',
                    save=None,
                    show_plot=True
                    )
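Note that `sort_values().unique()` orders the names alphabetically, so `good_met[:5]` is simply the first five significant names, not the five strongest. If a ranking by communication strength is preferred, one option is sketched below on a toy table. The helper `top_by_score` is hypothetical and not part of MEBOCOST; the column names `Metabolite_Name`, `Commu_Score` and `permutation_test_fdr` are the ones used elsewhere in this post, and the metabolite names and values are invented.

```python
import pandas as pd

# Toy stand-in for commu_df; all values are made up for illustration.
toy = pd.DataFrame({
    'Metabolite_Name': ['L-Glutamine', 'Adenosine', 'L-Glutamine', 'Cholesterol', 'Adenosine'],
    'Commu_Score': [0.8, 0.3, 1.2, 0.5, 0.9],
    'permutation_test_fdr': [0.01, 0.04, 0.001, 0.30, 0.02],
})

def top_by_score(df, name_col, n=5, fdr_col='permutation_test_fdr',
                 score_col='Commu_Score', cutoff=0.05):
    """Names of the n items with the highest significant communication score."""
    sig = df[df[fdr_col] <= cutoff]
    # Keep the best score observed for each name, then rank names by it.
    best = sig.groupby(name_col)[score_col].max()
    return best.sort_values(ascending=False).head(n).index.tolist()

top_met = top_by_score(toy, 'Metabolite_Name', n=2)
```

The resulting list could then be passed as `sensor_or_met` in place of `good_met[:5]`; the same helper works for sensors by passing `name_col='Sensor'`.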

f. Sensor levels across cell groups

## violin plot to show the expression of informative sensors in communication

good_sensor = commu_df[(commu_df['permutation_test_fdr']<=0.05)]['Sensor'].sort_values().unique()

mebo_obj.violin_plot(
                    sensor_or_met=good_sensor[:5], ## first 5 names (alphabetical order) as an example
                    cell_focus=[],
                    cell_order = [],
                    row_zscore = False,
                    cmap=None,
                    vmin=None,
                    vmax=None,
                    figsize='auto',
                    cbar_title='',
                    save=None,
                    show_plot=True
                    )

Reference: MEBOCOST/Demo_Communication_Prediction.ipynb at master · zhengrongbin/MEBOCOST (github.com)
