High-throughput sequencing results are usually delivered in FASTQ format.
Running FastQC on a FASTQ file produces two output files:
fastqc -t 2 /root/project/rnaseq/01raw_data/sra_data.fastq
#Analysis complete for sra_data.fastq
sra_data_fastqc.html sra_data_fastqc.zip
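As a small illustration of the naming shown above, the sketch below derives the two report names FastQC is expected to write for a given input. The function name fastqc_outputs is hypothetical, and the suffix handling (dropping an optional .gz, then .fastq or .fq) is an assumption based on the sra_data.fastq example here, not a full statement of FastQC's rules.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the report names FastQC is expected to produce.
fastqc_outputs() {
    local name
    name=$(basename "$1")   # drop the directory part
    name=${name%.gz}        # drop an optional .gz suffix (assumption)
    name=${name%.fastq}     # drop .fastq ...
    name=${name%.fq}        # ... or .fq (assumption)
    echo "${name}_fastqc.html ${name}_fastqc.zip"
}

fastqc_outputs /root/project/rnaseq/01raw_data/sra_data.fastq
# prints: sra_data_fastqc.html sra_data_fastqc.zip
```

Knowing the expected names in advance makes it easy to check that every input file actually got a report before moving on.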
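MultiQC is a reporting tool that parses summary statistics from the results and log files produced by other bioinformatics tools. MultiQC does not run other tools for you; it is designed to sit at the end of an analysis pipeline, or to be run manually after your tools have finished.
When you launch MultiQC, it recursively searches the file paths you provide and finds the files it recognises. It parses the relevant information from them and generates a single standalone HTML report. It also saves a directory containing all the parsed data files for downstream use.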
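The recursive search described above can be approximated by hand to see which files are likely to be picked up. The sketch below lists FastQC archives under a directory by file name only. The helper name find_fastqc_zips is hypothetical, and matching on *_fastqc.zip is a simplification, since MultiQC applies its own detection patterns for each supported tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: recursively list files whose names end in _fastqc.zip.
find_fastqc_zips() {
    find "$1" -type f -name '*_fastqc.zip' | sort   # sorted for stable output
}

# Demo on a throwaway directory tree so no real data is touched.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/run1/sub"
touch "$demo_dir/run1/a_fastqc.zip" \
      "$demo_dir/run1/sub/b_fastqc.zip" \
      "$demo_dir/run1/notes.txt"
find_fastqc_zips "$demo_dir"   # lists the two archives, not notes.txt
rm -rf "$demo_dir"
```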
This is where the problem appeared.
multiqc automatically searches for the relevant files and aggregates them:
multiqc /root/project/rnaseq/01raw_data/
Error output:
/root/miniconda3/envs/rna/lib/python2.7/site-packages/multiqc/utils/config.py:45: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
configs = yaml.load(f)
/root/miniconda3/envs/rna/lib/python2.7/site-packages/multiqc/utils/config.py:51: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
sp = yaml.load(f)
[WARNING] multiqc : MultiQC Version v1.8 now available!
[INFO ] multiqc : This is MultiQC v1.4
[INFO ] multiqc : Template : default
[INFO ] multiqc : Searching '/root/project/rnaseq/01raw_data/'
Traceback (most recent call last):
File "/root/miniconda3/envs/rna/bin/multiqc", line 751, in <module>
multiqc()
File "/root/miniconda3/envs/rna/lib/python2.7/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/root/miniconda3/envs/rna/lib/python2.7/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/root/miniconda3/envs/rna/lib/python2.7/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/root/miniconda3/envs/rna/lib/python2.7/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/root/miniconda3/envs/rna/bin/multiqc", line 398, in multiqc
template_mod = config.avail_templates[config.template].load()
File "/root/miniconda3/envs/rna/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2340, in load
self.require(*args, **kwargs)
File "/root/miniconda3/envs/rna/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2363, in require
items = working_set.resolve(reqs, env, installer, extras=self.extras)
File "/root/miniconda3/envs/rna/lib/python2.7/site-packages/pkg_resources/__init__.py", line 872, in resolve
raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (kiwisolver 0.1.3 (/root/miniconda3/envs/rna/lib/python2.7/site-packages), Requirement.parse('kiwisolver>=1.0.1'), set(['matplotlib']))
Upgrading to MultiQC 1.8 did not help; it still failed:
[INFO ] multiqc : This is MultiQC v1.8
[INFO ] multiqc : Template : default
[WARNING] multiqc : You are running MultiQC with Python 2.7.15
[WARNING] multiqc : Please upgrade! MultiQC will soon drop support for Python < 3.6
[INFO ] multiqc : Searching : /root/project/rnaseq/01raw_data
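The cause is that this multiqc is installed in a Python 2.7 environment. The traceback shows a dependency conflict: matplotlib requires kiwisolver>=1.0.1, but the environment contains kiwisolver 0.1.3. Python 2.7 stopped receiving updates after January 1, 2020, so packages in such an environment are increasingly likely to run into problems of this kind.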
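To make the conflict in the traceback concrete, the sketch below compares the installed kiwisolver version with the minimum that matplotlib asks for. The version numbers are taken from the traceback; the helper name version_ge is hypothetical, and the comparison assumes GNU sort with the -V (version sort) option is available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when version $1 >= version $2.
# Assumes GNU sort, whose -V option orders version strings.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

installed="0.1.3"   # kiwisolver version reported in the traceback
required="1.0.1"    # minimum required by matplotlib, per the traceback

if version_ge "$installed" "$required"; then
    echo "kiwisolver OK"
else
    echo "kiwisolver too old: $installed < $required"
fi
```

In a real environment, running pip check is one way to list conflicts of this kind without reading the traceback by hand.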
Solution:
Install multiqc under Python 3.
First create an environment named rnaseq3.7:
conda create --name rnaseq3.7 python=3.7
Activate the environment:
conda activate rnaseq3.7
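Now comes the test of your network speed.
Install multiqc with one of the two methods below; by default the latest version is installed.
# Install with pip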
pip install multiqc
# Or install with conda
conda install -c bioconda -c conda-forge multiqc
which multiqc
#/root/miniconda3/envs/rnaseq3.7/bin/multiqc
If your path is printed, the installation succeeded, and multiqc can be used directly while the environment is active.
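The same check can be wrapped in a small function so that a script stops early when multiqc is not on the PATH. This is a minimal sketch; the helper name require_tool is hypothetical, and it relies only on the shell built-in command -v.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report where a command lives, or fail if it is missing.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found on PATH; is the conda environment activated?" >&2
        return 1
    fi
}

# Fall back to a hint instead of aborting when multiqc is missing.
require_tool multiqc || echo "try: conda activate rnaseq3.7"
```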
Run multiqc:
multiqc /root/project/rnaseq/01raw_data/
[INFO ] multiqc : This is MultiQC v1.8
[INFO ] multiqc : Template : default
[INFO ] multiqc : Searching : /root/project/rnaseq/01raw_data
[INFO ] fastqc : Found 1 reports
[INFO ] multiqc : Compressing plot data
[INFO ] multiqc : Report : multiqc_report.html
[INFO ] multiqc : Data : multiqc_data
[INFO ] multiqc : MultiQC complete
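Before moving on, it can be worth confirming that the two outputs named in the log above were actually written. The sketch below checks for them in a given directory. The helper name report_ready is hypothetical, and checking the current directory assumes multiqc was run there with its default output location.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when both MultiQC outputs exist in directory $1.
report_ready() {
    [ -f "$1/multiqc_report.html" ] && [ -d "$1/multiqc_data" ]
}

if report_ready .; then
    echo "MultiQC outputs present"
else
    echo "MultiQC outputs missing in $(pwd)"
fi
```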
OK, the analysis can now continue.