Contents
- Abstract
- Issues
- 1. MissingInputException: Missing input files for rule XXX:
- 2. SyntaxError in line 28 of /path/to/snakefile: invalid syntax
- 3. SyntaxError in line 25 of /path/to/snakefile: Expected name or colon after rule or checkpoint keyword.
- 4. RuleException in line 114 of /path/to/snakefile: NameError: The name 'XXX' is unknown in this context. Did you mean 'wildcards.XXX'?
- 5. Wildcards in input files cannot be determined from output files: 'XXXXXX'
- 6. Not all output, log and benchmark files of rule trim_galore contain the same wildcards. This is crucial though, in order to avoid that two or more jobs write to the same file.
- 7. RuleException in line 113 of /path/to/snakefile: 'InputFiles'/'OutputFiles' object has no attribute 'xxx'
- 8. TypeError in line 140 of /path/to/snakefile: shellcmd() takes 2 positional arguments but 3 were given
- 9. NameError in line 156 of /path/to/snakefile: name 'xxx' is not defined
- 10. IndentationError in line 165 of : unindent does not match any outer indentation level
- Summary
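Abstract
I recently built an assembly pipeline and a WGS pipeline with Snakemake. Some problems kept coming back, so I collected the common Snakemake errors here, both as a record for myself and in the hope of saving others a few detours.
Issues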
1. MissingInputException: Missing input files for rule XXX:
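Snakemake cannot find an input file. Check the sample names in the config file (or the ones loaded by your script) and the file naming used throughout the chain of rules involved, for example fastq.gz written as fq.gz, or R1/R2 written without the R. The rule XXX named in the message is not necessarily the broken one; the real problem is a wrong input file name somewhere in the pipeline that rule belongs to. Snakemake resolves jobs backward from the results (that is my own way of thinking about it and naming it, haha): it starts from the targets listed in rule all and infers the whole pipeline from there, so the rule reported is usually the first, most upstream rule of that chain, whose input neither exists on disk nor can be produced by any other rule.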
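A quick way to narrow this down is to check, outside Snakemake, which of the files you expect actually exist. The following is only a minimal sketch: the function name find_missing_inputs, the sample names and the raw/{sample}_R1.fastq.gz pattern are all made up for illustration and need to be replaced with your own.

```python
from pathlib import Path


def find_missing_inputs(samples, pattern, root="."):
    """Return the expected input paths that do not exist under root."""
    missing = []
    for sample in samples:
        # Fill the sample name into the naming pattern, as the workflow would.
        path = Path(root) / pattern.format(sample=sample)
        if not path.exists():
            missing.append(str(path))
    return missing


if __name__ == "__main__":
    # Hypothetical sample names and naming pattern; adapt them to your config.
    for path in find_missing_inputs(["A", "B"], "raw/{sample}_R1.fastq.gz"):
        print("missing:", path)
```

Any path printed here is a name that the most upstream rule would ask for but that is not on disk, which is usually where the typo is.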
2. SyntaxError in line 28 of /path/to/snakefile: invalid syntax
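Invalid syntax. When input or output lists more than one item, the items must be separated by an ASCII comma ",", so check for a missing comma at the end of a line (or a full-width "," typed by mistake). I have also run into it after deleting the second item and leaving the comma behind on the first one; a trailing comma after the last item is normally accepted by Snakemake, so check the missing-comma case first.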
3. SyntaxError in line 25 of /path/to/snakefile: Expected name or colon after rule or checkpoint keyword.
The colon after the rule name is missing; add it.
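4. RuleException in line 114 of /path/to/snakefile: NameError: The name 'XXX' is unknown in this context. Did you mean 'wildcards.XXX'?
A wildcard cannot be dropped into the shell command directly as {XXX}. Either write it as {wildcards.XXX}, as the message itself suggests, or declare the file in input/output and fill the command with {input.xxx} or {output.xxx}.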
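5. Wildcards in input files cannot be determined from output files: 'XXXXXX'
The wildcard in the input was not given concrete values with expand. rule all has no output from which Snakemake could infer the wildcard, so every target listed there has to be a fully specified file name.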
# Wrong:
rule all:
    input:
        "{genome}.fai",
# Correct:
rule all:
    input:
        expand("{genome}.fai", genome=config["reference"]),
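To see why the corrected version works, it helps to picture what expand hands back: a plain list of concrete file names with every wildcard filled in. The function below is a simplified stand-in written for this post, not Snakemake's own implementation, and "ref.fa" is a made-up value standing in for config["reference"].

```python
from itertools import product


def expand_sketch(pattern, **wildcards):
    """Fill pattern with every combination of the given wildcard values."""
    # Accept a single string as well as a list of values.
    values = {k: [v] if isinstance(v, str) else list(v) for k, v in wildcards.items()}
    names = list(values)
    return [
        pattern.format(**dict(zip(names, combo)))
        for combo in product(*(values[n] for n in names))
    ]


# One value gives one concrete target name.
print(expand_sketch("{genome}.fai", genome="ref.fa"))
# Several values give every combination.
print(expand_sketch("{sample}_{read}.fq", sample=["A", "B"], read=["R1", "R2"]))
```

After this step no braces are left in the target names, so there is nothing for Snakemake to guess in rule all.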
6. Not all output, log and benchmark files of rule trim_galore contain the same wildcards. This is crucial though, in order to avoid that two or more jobs write to the same file.
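I got this when the output file or output directory was declared in the wrong form. An output directory has to be wrapped as
directory()
but I had abbreviated it to
dir()
More generally, the message means that the output, log and benchmark paths of the rule do not all carry the same set of wildcards.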
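When the cause is not obvious, the wildcard sets of the paths can be compared mechanically. This is a rough sketch with invented helper names (wildcard_names, mismatched_patterns) and invented example paths; the regular expression handles {name} and simple {name,constraint} forms only, which is an assumption that will not hold for constraints containing braces.

```python
import re


def wildcard_names(pattern):
    """Return the set of wildcard names in a path pattern such as 'trim/{sample}.fq'."""
    # Matches {name} and {name,constraint}; nested braces in constraints are not handled.
    return set(re.findall(r"\{(\w+)(?:,[^{}]*)?\}", pattern))


def mismatched_patterns(patterns):
    """Return the patterns whose wildcard set differs from that of the first pattern."""
    reference = wildcard_names(patterns[0])
    return [p for p in patterns if wildcard_names(p) != reference]


# Hypothetical output and log paths of one rule: the log lacks {sample}.
paths = ["trimmed/{sample}_R1.fq.gz", "logs/trim_galore.log"]
print(mismatched_patterns(paths))
```

Here the log path is reported because it would be shared by the jobs of all samples, which is exactly the situation the error message warns about.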
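7. RuleException in line 113 of /path/to/snakefile: 'InputFiles'/'OutputFiles' object has no attribute 'xxx'
The command at line 113 refers to an item 'xxx' that is not defined in the rule's input/output. In most cases the name declared in the rule and the name written in the command differ, including but not limited to the following (shown for input):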
Added or dropped characters
input:
    genome = config["reference"]
shell:
    "samtools faidx {input.geneome}"   # wrong: extra character
    "samtools faidx {input.genom}"     # wrong: missing character
Wrong notation
input:
    genome = config["reference"]
shell:
    "samtools faidx {input_genome}"    # wrong: "_" instead of "."
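8. TypeError in line 140 of /path/to/snakefile: shellcmd() takes 2 positional arguments but 3 were given
Two commands were written in one shell directive and separated with ",". A rule takes a single quoted shell command. If you really need to run several commands, chain them inside the quotes with ";", or, when one command feeds the next, connect them with the pipe symbol "|" into one long command.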
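The mechanism can be reproduced in plain Python, which is presumably what happens under the hood since a Snakefile is parsed as Python. In this sketch run_shell is a hypothetical stand-in for any function that accepts exactly one command string, not Snakemake's actual shellcmd, and the samtools commands are only example strings that are never executed.

```python
def run_shell(cmd):
    """Hypothetical stand-in for a function that accepts exactly one command string."""
    return cmd


# Adjacent string literals are joined by Python into a single string.
one_command = run_shell(
    "samtools faidx ref.fa; "
    "samtools view -b in.sam > out.bam"
)
print(one_command)

# A comma between the literals passes two separate arguments instead.
try:
    run_shell("samtools faidx ref.fa", "samtools view -b in.sam > out.bam")
except TypeError as err:
    print("TypeError:", err)
```

So a long command can be split over several lines as long as the pieces are not separated by commas, and each piece ends with the ";" or space it needs.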
9. NameError in line 156 of /path/to/snakefile: name 'xxx' is not defined
You used a name that Snakemake has never seen defined, most likely because of a typo.
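Typos like this one, and the mistyped {input.xxx} names from issue 7, can be hunted down by comparing the names used in a command against the names that were declared. The sketch below is only an illustration: check_placeholders and the declared names are invented for this post, and a bare placeholder such as {input} would be flagged unless it is added to the declared set.

```python
import difflib
import re


def check_placeholders(command, declared):
    """Map each undeclared placeholder in command to its closest declared name.

    declared is a set of dotted names such as {"input.genome", "output.fai"}.
    """
    suggestions = {}
    for used in re.findall(r"\{([\w.]+)\}", command):
        if used not in declared:
            # Sort so that ties are resolved the same way on every run.
            close = difflib.get_close_matches(used, sorted(declared), n=1)
            suggestions[used] = close[0] if close else None
    return suggestions


# Hypothetical declarations of one rule.
declared = {"input.genome", "output.fai"}
print(check_placeholders("samtools faidx {input.geneome}", declared))
print(check_placeholders("samtools faidx {input_genome}", declared))
```

Both mistyped placeholders are mapped to input.genome, which is usually enough of a hint to spot the typo.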
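10. IndentationError in line 165 of : unindent does not match any outer indentation level
Indentation error: the indentation of input or output does not line up with any enclosing level, for example because there are fewer than 4 spaces in front of it. Indent every level consistently, 4 spaces per level.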
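Misaligned lines are hard to see by eye, so a small script can point at them. This is a rough sketch under the assumption that the Snakefile uses 4 spaces per level; suspicious_indents is an invented helper, and continuation lines inside multi-line strings or brackets may be flagged even though they are harmless.

```python
def suspicious_indents(text, width=4):
    """Return 1-based line numbers whose indentation has tabs or is not a multiple of width."""
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines cannot cause indentation errors
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent or len(indent) % width != 0:
            flagged.append(number)
    return flagged


# Made-up rule in which "output:" has only 3 leading spaces.
example = "\n".join([
    "rule a:",
    "    input:",
    '        "x.txt"',
    "   output:",
    '        "y.txt"',
])
print(suspicious_indents(example))
```

To run it on a real workflow, read the Snakefile into a string first and pass that string to the function.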
Summary
This post collected ten common problems that show up before a Snakemake workflow even starts running. A follow-up post will cover problems that can appear while the workflow is running.