MetaPhlAn 3 and StrainPhlAn 3 run log

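My conda environment is metaphlan 3.0.7, and all my earlier data were processed with the v30 database. However, MetaPhlAn has since been upgraded to v31, so running metaphlan directly would automatically download the v31 database.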

Problem 1, encountered after pinning the database version:

command:

https://forum.biobakery.org/t/install-metaphlan-3/369

(humann) user@d079e601f094:/analysis$ metaphlan P10E0.fastq.gz \
-x mpa_v30_CHOCOPhlAn_201901 \
--input_type fastq -s P10E0.new.sam.bz2 \
--bowtie2out P10E0.new.bowtie2.bz2 \
-o P10E0_new_profiled.tsv

Error: BowTie2 output file detected: /sbidata/projects/lzhang/202012_small_molecule/Analysis/metaphlan_human3/Result/StrainPhlAn/test_project/bowtie2/P10E0.new.bowtie2.bz2

Please provide the size of the metagenome using the --nreads parameter when running MetaPhlAn using SAM files as input
Exiting...

WARNING: The metagenome profile contains clades that represent multiple species merged into a single representant.
An additional column listing the merged species is added to the MetaPhlAn output.
 

My input is a fastq file, so why does the error complain about Bowtie2? After some searching, it turns out to be a bug in the software itself.

https://forum.biobakery.org/t/metaphlan3-bowtie2db-output-files-need-the-size-of-the-metagenome-using-the-nreads-parameter/2006/11

Fix from that thread (see also https://github.com/biobakery/MetaPhlAn/wiki/MetaPhlAn-3.0):

fbeghini (Segata Lab member), May '21:

I merged the PR, it should not take long to be available now

In summary: since there is an error in 3.0.7, I will install 3.0.8 in my own conda environment, that is, set up a fresh 3.0.8 environment.

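Problem 2: installing an older MetaPhlAn 3 environment (the current release is already version 4)

Installing with conda was too slow on my machine (Ubuntu), so I used mamba instead.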

conda create --name mpa_strainphlan3

conda activate mpa_strainphlan3

conda install -c conda-forge mamba

mamba install -c bioconda metaphlan=3.0.8 python=3.10.8

 Updating specs:

   - metaphlan=3.0.8
 

Re-running the command above still fails:

Please provide the size of the metagenome using the --nreads parameter when running MetaPhlAn using SAM files as input
Exiting...

WARNING: The metagenome profile contains clades that represent multiple species merged into a single representant.
An additional column listing the merged species is added to the MetaPhlAn output.
 

Try installing 3.0.9:

mamba install -c bioconda metaphlan=3.0.9
Traceback (most recent call last):
  File "/home/l/miniconda3/envs/mpa_strainphlan3/bin/mamba", line 7, in <module>
    from mamba.mamba import main
  File "/home/l/miniconda3/envs/mpa_strainphlan3/lib/python3.10/site-packages/mamba/mamba.py", line 49, in <module>
    import libmambapy as api
  File "/home/l/miniconda3/envs/mpa_strainphlan3/lib/python3.10/site-packages/libmambapy/__init__.py", line 7, in <module>
    raise e
  File "/home/l/miniconda3/envs/mpa_strainphlan3/lib/python3.10/site-packages/libmambapy/__init__.py", line 4, in <module>
    from libmambapy.bindings import *  # noqa: F401,F403
ImportError: /home/l/miniconda3/envs/mpa_strainphlan3/lib/python3.10/site-packages/libmambapy/../../../libmamba.so.2: undefined symbol: archive_write_add_filter_zstd
 

Fix for this error: Undefined symbol: archive_write_add_filter_zstd · Issue #1775 · mamba-org/mamba · GitHub

conda install libarchive==3.5.2 -c conda-forge

Upgrade to MetaPhlAn 3.0.9:

mamba install -c bioconda metaphlan=3.0.9
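
Since both 3.0.7 and 3.0.8 hit the --nreads error above, it can help to confirm which version an environment actually contains before profiling. The following is a minimal sketch and not part of MetaPhlAn: version_at_least is a hypothetical helper, and it assumes a sort that supports -V (version sort), as GNU coreutils does.

```shell
# Hypothetical helper: succeed when version $1 is at least version $2.
# Relies on `sort -V`, which orders dotted version strings numerically
# (so 3.0.14 sorts after 3.0.9).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For example, `version_at_least "$installed" 3.0.9` can gate the profiling step, where `installed` holds the version number reported by `metaphlan --version` (how that output is parsed is left open here).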

Step 1: once the environment is installed, run MetaPhlAn 3 first

metaphlan P10E0.new.bowtie2.bz2 \
--bowtie2db HUMAnN3_db/stable_201901b/db_v30/ \
--input_type bowtie2out \
-s P10E0.new.sam.bz2 --bowtie2out P10E0.new.bowtie2.bz2 \
-o P10E0_new_profiled.tsv
WARNING: The metagenome profile contains clades that represent multiple species merged into a single representant.
An additional column listing the merged species is added to the MetaPhlAn output.

The warnings are harmless; they only tell you that some of the detected species can also go by other names.

These are from MetaPhlAn, they just inform you that some species found can have “alternative” taxonomies (the list of species in the additional_species column). All the species listed under additional_species are not represented by any markers but they were found to be <5% ANI distant from the “reference” species (clade_name).

Unexpected output (format) - #2 by fbeghini - MetaPhlAn - The bioBakery help forum

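Instead of --bowtie2db, you can also pass --index mpa_v30_CHOCOPhlAn_201901 and the software will download the database automatically. Alternatively, download it yourself from Dropbox, Google Drive, or Zenodo and extract it; MetaPhlAn builds the index itself.

MetaPhlAn 3.0 · biobakery/MetaPhlAn Wiki · GitHub also lists the database download locations.

For example, from Zenodo: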

curl -o mpa_v30_CHOCOPhlAn_201901.tar "https://zenodo.org/record/3957592/files/mpa_v30_CHOCOPhlAn_201901.tar?download=1"
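
The extraction step after the download can be sketched as follows. This is an illustration only: unpack_db is a hypothetical helper, and the destination directory name is whatever you later pass to --bowtie2db.

```shell
# Hypothetical helper: unpack a downloaded database tarball into a directory
# that can then be passed to metaphlan via --bowtie2db.
unpack_db() {
    local tarball="$1" dest="$2"
    mkdir -p "$dest"            # create the target directory if needed
    tar -xf "$tarball" -C "$dest"
}
```

With the file downloaded above, a call such as `unpack_db mpa_v30_CHOCOPhlAn_201901.tar db_v30` would leave the database files in db_v30/ (the directory name here is an assumption).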

Step 2: StrainPhlAn 3

sample2markers.py -i sams/P10E0.new.sam.bz2 \
	-o consensus_markers/P10E0.new.pkl -n 8
Extract the E. coli marker sequences:

extract_markers.py -d /opt/conda/envs/humann/lib/python3.9/site-packages/metaphlan/metaphlan_databases/mpa_v31_CHOCOPhlAn_201901.pkl -c s__Escherichia_coli -o clade_markers


strainphlan -d shared2/HUMAnN3_db/stable_201901b/db_v30/mpa_v30_CHOCOPhlAn_201901.pkl -s consensus_markers/P10E0.new.pkl -m clade_markers/s__Escherichia_coli.fna -o output -n 8 -c s__Escherichia_coli --mutation_rates


[e] The main inputs samples + references are less than 4
Wed Jan 11 15:09:48 2023: Stop StrainPhlAn 3.0 execution.

This error means the run needs at least 4 inputs (samples plus references), so add a few more samples and run again.
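
To catch this before launching StrainPhlAn, the number of consensus-marker files can be counted first. The following is a minimal sketch under stated assumptions: count_markers and enough_samples are hypothetical helpers, the markers are assumed to be .pkl files in a single directory, and only samples are counted (references passed to strainphlan are not included).

```shell
# Hypothetical helper: count consensus-marker .pkl files directly inside a directory.
count_markers() {
    local dir="$1"
    find "$dir" -maxdepth 1 -type f -name '*.pkl' | wc -l | tr -d ' '
}

# Hypothetical helper: succeed only when at least 4 marker files are present,
# matching the minimum named in the StrainPhlAn error message.
enough_samples() {
    [ "$(count_markers "$1")" -ge 4 ]
}
```

A call such as `enough_samples consensus_markers || echo "need more samples"` can then run just before the strainphlan command.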

Script for 5 samples:

# first: concatenate the fastq files into the merge_fq/test directory
# second: run MetaPhlAn to get the Bowtie2 mapping files
cd test_project/
mkdir -p sams/
mkdir -p bowtie2/
mkdir -p profiles/


for f in merge_fq/test/*gz
do
    echo "Running MetaPhlAn on ${f}"
    # strip the directory and the .fastq.gz suffix to get the sample name
    bn=$(basename "${f}" .fastq.gz)
    echo "${bn}"
    metaphlan "${f}" --input_type fastq --bowtie2db /shared2/HUMAnN3_db/stable_201901b/db_v30/ -s "sams/${bn}.sam.bz2" --bowtie2out "bowtie2/${bn}.bowtie2.bz2" -o "profiles/${bn}_profiled.tsv"
done

# third: extract consensus markers
mkdir -p consensus_markers
sample2markers.py -i sams/*.sam.bz2 -o consensus_markers -n 8

# fourth: extract the E. coli marker sequences and run StrainPhlAn

mkdir -p clade_markers
extract_markers.py -d /shared2/HUMAnN3_db/stable_201901b/db_v30/mpa_v30_CHOCOPhlAn_201901.pkl -c s__Escherichia_coli -o clade_markers
mkdir -p output
strainphlan -d /shared2/HUMAnN3_db/stable_201901b/db_v30/mpa_v30_CHOCOPhlAn_201901.pkl -s consensus_markers/*.pkl -m clade_markers/s__Escherichia_coli.fna -o output -n 8 -c s__Escherichia_coli --mutation_rates
