Problem
The quality of bases can vary with their position in the read because of the nature of the sequencing procedure. This quality distribution can be checked with the "Per Base Sequence Quality" module of the FastQC program.
The accepted quality values are 10 for the lower quartile and 25 for the median. If the lower quartile or the median at any position falls below its limit, the module returns a warning.
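The warning rule above can be sketched for a single read position as follows. This is only an illustration: the function name and parameters are hypothetical, and the quartile is computed with Python's `statistics.quantiles` using the "inclusive" method, which may not match FastQC's own percentile calculation.

```python
import statistics


def position_triggers_warning(scores, q1_limit=10, median_limit=25):
    """Return True if the Phred scores seen at one read position fall
    below either limit (hypothetical helper, not part of FastQC)."""
    # First cut point of the quartiles = lower quartile.
    # Assumption: "inclusive" interpolation; FastQC may compute this differently.
    lower_quartile = statistics.quantiles(scores, n=4, method="inclusive")[0]
    median = statistics.median(scores)
    return lower_quartile < q1_limit or median < median_limit
```

Calling it with the scores of every read at a given position tells whether that position alone would be enough to raise the warning.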
Note that for reads longer than 50 bp FastQC groups the bases into bins. To show data for every base in the read, use the "--nogroup" option.
Given: FASTQ file, quality threshold
Return: Number of positions where the mean base quality falls below the given threshold
Sample Dataset
26
@Rosalind_0029
GCCCCAGGGAACCCTCCGACCGAGGATCGT
+
>?F?@6<C<HF?<85486B;85:8488/2/
@Rosalind_0029
TGTGATGGCTCTCTGAATGGTTCAGGCAGT
+
@J@H@>B9:B;<D==:<;:,<::?463-,,
@Rosalind_0029
CACTCTTACTCCCTAGCCGAACTCCTTTTT
+
=88;99637@5,4664-65)/?4-2+)$)$
@Rosalind_0029
GATTATGATATCAGTTGGCTCCGAGAGCGT
+
<@BGE@8C9=B9:B<>>>7?B>7:02+33.
Sample Output
17
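
One possible way to compute this count is sketched below in Python. It is a minimal sketch, not a reference solution, and it rests on several assumptions: the threshold is on the first line of the input (as in the sample dataset), every FASTQ record spans exactly four lines, and qualities use the Phred+33 encoding. The function name and the `offset` parameter are illustrative.

```python
def count_low_quality_positions(text, offset=33):
    """Count read positions whose mean Phred quality is below the threshold.

    Assumptions: the first line holds the threshold, each FASTQ record is
    exactly four lines, and qualities are Phred+33 encoded.
    """
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    threshold = float(lines[0])
    records = lines[1:]
    # The quality string is the fourth line of every record. Records are
    # sliced by position because a quality line may itself start with '@'.
    quality_lines = records[3::4]

    sums, counts = [], []  # per-position score totals and read counts
    for quality in quality_lines:
        for i, ch in enumerate(quality):
            if i == len(sums):  # first read that reaches this position
                sums.append(0)
                counts.append(0)
            sums[i] += ord(ch) - offset
            counts[i] += 1

    return sum(1 for s, c in zip(sums, counts) if s / c < threshold)


SAMPLE = """26
@Rosalind_0029
GCCCCAGGGAACCCTCCGACCGAGGATCGT
+
>?F?@6<C<HF?<85486B;85:8488/2/
@Rosalind_0029
TGTGATGGCTCTCTGAATGGTTCAGGCAGT
+
@J@H@>B9:B;<D==:<;:,<::?463-,,
@Rosalind_0029
CACTCTTACTCCCTAGCCGAACTCCTTTTT
+
=88;99637@5,4664-65)/?4-2+)$)$
@Rosalind_0029
GATTATGATATCAGTTGGCTCCGAGAGCGT
+
<@BGE@8C9=B9:B<>>>7?B>7:02+33.
"""

print(count_low_quality_positions(SAMPLE))  # → 17
```

The comparison is strict: positions 7 and 22 of the sample have a mean quality of exactly 26 and are not counted, which is what yields 17.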