Article MSM_metagenomics (7): Grouped Mosaic Plot

Follow the 生信学习者 (Bioinformatics Learner) series on these platforms:

  • WeChat official account: 生信学习者
  • Xiaohongshu: 生信学习者
  • Zhihu: 生信学习者
  • CSDN: 生信学习者2

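Introduction

This tutorial uses a Python script to draw a mosaic plot, which visualizes the joint frequency distribution of two categorical variables.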

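Data

Download the data from the following link:

  • Baidu Netdisk link: https://pan.baidu.com/s/1f1SyyvRfpNVO3sLYEblz1A
  • Extraction code: follow the WeChat official account 生信学习者 and send "复现msm" in the chat to receive the extraction code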

Python packages required

  • pandas
  • scipy
  • matplotlib
  • statsmodels

Drawing a mosaic plot using mosaic_plot.py

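This step uses the Python script mosaic_plot.py together with a table of species associated with MSM and Non-MSM individuals, in which each species is classified as Gram-negative or non-Gram-negative. The table is provided as ./data/two_variable_mosaic.tsv.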
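To make the expected input concrete, here is a minimal sketch of how a three-column table (feature, variable1, variable2) becomes the contingency table that the mosaic plot is built from. The column names and rows below are invented for illustration and are not taken from two_variable_mosaic.tsv; only the three-column, tab-separated layout follows the script.

```python
import io

import pandas as pd

# Hypothetical three-column input: feature, variable1, variable2.
# Column names and rows are invented for illustration only.
example_tsv = (
    "species\tgroup\tgram_stain\n"
    "sp1\tMSM\tGram-negative\n"
    "sp2\tMSM\tGram-negative\n"
    "sp3\tMSM\tNon-Gram-negative\n"
    "sp4\tNon-MSM\tGram-negative\n"
    "sp5\tNon-MSM\tNon-Gram-negative\n"
    "sp6\tNon-MSM\tNon-Gram-negative\n"
)


def build_contingency_table(tsv_text):
    """Cross-tabulate the second column against the third column."""
    df = pd.read_csv(io.StringIO(tsv_text), sep="\t", index_col=False)
    _, variable1, variable2 = df.columns
    return pd.crosstab(df[variable1], df[variable2])


cont_df = build_contingency_table(example_tsv)
print(cont_df)
```

Each cell of the resulting table is the number of features sharing one combination of the two variables, which is the count printed on the corresponding tile of the plot.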

  • mosaic_plot.py code:
#!/usr/bin/env python

"""
NAME: mosaic_plot.py
DESCRIPTION: mosaic_plot.py is a python script for visualizing proportions of data points along two variables.
"""


import pandas as pd
from scipy.stats import fisher_exact
import matplotlib.pyplot as plt
from statsmodels.graphics.mosaicplot import mosaic
import matplotlib
import sys
import argparse
import textwrap



def make_mosaic_plot(two_variable_file, facecolor_dict, output_fig, font_style = "sans-serif,Arial"):
    # font_style is "<font family>,<font name>", for example "sans-serif,Arial".
    font_family, font_type = font_style.split(",")
    matplotlib.rcParams['font.family'] = font_family
    matplotlib.rcParams['font.sans-serif'] = font_type
    # The input table must have exactly three columns: feature, variable1, variable2.
    two_variable_df = pd.read_csv(two_variable_file, sep = "\t", index_col = False)
    features, variable1, variable2 = two_variable_df.columns
    # Contingency table: variable1 levels as rows, variable2 levels as columns.
    cont_df = pd.crosstab(two_variable_df[variable1], two_variable_df[variable2])
    # Two-sided Fisher's exact test on the contingency table; res[1] is the p-value.
    res = fisher_exact(cont_df, alternative = "two-sided")
    # Label each tile with the count of its (variable1, variable2) combination.
    label_dict = {}
    for idx in cont_df.index.to_list():
        for col in cont_df.columns.to_list():
            label_dict[(idx, col)] = str(cont_df.loc[idx, col])
    labelizer = lambda k: label_dict[k]

    # This unpacking assumes variable2 has exactly two levels.
    variable2_0, variable2_1 = sorted(set(two_variable_df[variable2].to_list()))
    # Tiles sharing a variable1 level get the color mapped to that level.
    props = {}
    for variable in facecolor_dict:
        props[(variable, variable2_0)] = {"facecolor": facecolor_dict[variable], "edgecolor": "white"}
        props[(variable, variable2_1)] = {"facecolor": facecolor_dict[variable], "edgecolor": "white"}
    mosaic(two_variable_df, [variable1, variable2], labelizer = labelizer, properties = props, title = " P-value: "+ str(res[1]) + " (Fisher's exact test)")
    plt.savefig(output_fig)

if __name__ == "__main__":
    def read_args(args):
        # This function is to parse arguments

        parser = argparse.ArgumentParser(formatter_class=argparse.RawDescriptionHelpFormatter,
                                         description = textwrap.dedent('''\
                                         This program is to draw a mosaic plot.
                                         '''),
                                         epilog = textwrap.dedent('''\
                                         examples: mosaic_plot.py --input input_file.tsv --facecolor_map facecolor_mapfile.tsv --output mosaic_plot.png   
                                         '''))
        parser.add_argument('--input',
                             nargs = '?',
                             help = 'Input a file containing two variable information regarding each individual subject.',
                             type = str,
                             default = None)

        parser.add_argument('--facecolor_map',
                            nargs = '?',
                            help = 'Specify a tab-delimited file mapping each category of the first variable to a face color.',
                            type = str,
                            default = None)

        parser.add_argument('--font_style',
                            nargs = '?',
                            help = 'Specify the font style, font family and font type is delimited by a comma. default: [sans-serif,Arial]',
                            default = 'sans-serif,Arial')

        parser.add_argument('--output',
                            nargs = '?',
                            help = 'Specify the output figure name.',
                            type = str,
                            default = None)

        return vars(parser.parse_args())
        
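    pars = read_args(sys.argv)
    # Each line of the mapping file is "<variable1 category>\t<color>"; other lines are skipped.
    facecolor_dict = {}
    with open(pars['facecolor_map']) as map_file:
        for line in map_file:
            fields = line.rstrip().split("\t")
            if len(fields) >= 2:
                facecolor_dict[fields[0]] = fields[1]
    make_mosaic_plot(pars["input"], facecolor_dict, pars["output"], font_style = pars["font_style"])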
  • Usage:
mosaic_plot.py [-h] [--input [INPUT]] [--facecolor_map [FACECOLOR_MAP]] [--font_style [FONT_STYLE]] [--output [OUTPUT]]

This program is to draw a mosaic plot.

optional arguments:
  -h, --help            show this help message and exit
  --input [INPUT]       Input a file containing two variable information regarding each individual subject.
  --facecolor_map [FACECOLOR_MAP]
                        Specify a tab-delimited file mapping each category of the first variable to a face color.
  --font_style [FONT_STYLE]
                        Specify the font style, font family and font type is delimited by a comma. default: [sans-serif,Arial]
  --output [OUTPUT]     Specify the output figure name.

examples: 

python mosaic_plot.py --input input_file.tsv --facecolor_map facecolor_mapfile.tsv --output mosaic_plot.png   

Example command:

python mosaic_plot.py \
    --input two_variable_mosaic.tsv \
    --facecolor_map facecolor_map.tsv \
    --output mosaic_plot.png

[Figure: mosaic plot generated by the example command]
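The P-value in the plot title comes from a two-sided Fisher's exact test on the contingency table of the two variables. The following is a minimal sketch of that step alone, using scipy.stats.fisher_exact as the script does. The 2x2 counts are invented for illustration and are not results from the tutorial data.

```python
from scipy.stats import fisher_exact

# Hypothetical 2x2 counts, invented for illustration only.
# Rows: the two levels of variable1; columns: the two levels of variable2.
table = [[8, 2],
         [1, 5]]

# fisher_exact returns the sample odds ratio and the p-value.
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")

# The script builds its plot title from the p-value in the same way.
title = " P-value: " + str(p_value) + " (Fisher's exact test)"
print(odds_ratio)
print(title)
```

The title string is built from the unrounded p-value, so a long decimal in the plot title is expected behavior of the script.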

Note

The face colors of the mosaic plot should be specified with a mapping file that follows the example ./data/facecolor_map.tsv.
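As the script reads it, the mapping file is tab-separated: the first field of each line is a category of the first variable and the second field is its color. The sketch below parses such lines into the dictionary passed to make_mosaic_plot. The helper name parse_facecolor_map, the category names, and the hex colors are assumptions made for illustration and are not the contents of facecolor_map.tsv.

```python
def parse_facecolor_map(lines):
    """Map each variable1 category to a color from tab-separated lines."""
    facecolor_dict = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        category, color = fields[0], fields[1]
        facecolor_dict[category] = color
    return facecolor_dict


# Hypothetical mapping lines: categories and colors are invented here.
example_lines = ["MSM\t#3b6fb6\n", "Non-MSM\t#e08a3c\n", "\n"]
facecolor_dict = parse_facecolor_map(example_lines)
print(facecolor_dict)
```

The keys must match the values in the second column of the input table exactly, because the script uses them to look up the tiles to color.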
