BAM file processing: converting BAM to FASTQ


TIME_@

Category: Bioinformatics

Article link: https://blog.csdn.net/geekfocus/article/details/117953700


The original BAM file and the sorted BAM file contain the same number of lines (records); sorting changes only their order.
SEQanswers：BAM is compressed. Sorting helps to give a better compression ratio because similar sequences are grouped together.

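Converting the BAM back to FASTQ produced the warnings below. The same problem has been reported on GitHub (search results 1 2 3):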

 *****WARNING: Query 17 is marked as paired, but its mate does not occur next to it in your BAM file.  Skipping.
*****WARNING: Query 13 is marked as paired, but its mate does not occur next to it in your BAM file.  Skipping.
*****WARNING: Query 223 is marked as paired, but its mate does not occur next to it in your BAM file.  Skipping.

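After sorting with samtools sort -n, the number of warning lines dropped from 1,333,109,095 to 11,212,985, although it is not clear why any remain (???). The commands used: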

nohup samtools sort -n S1_T_SRR1273943.bam -o ./S1_T_SRR1273943.sortedByName.bam >log.S1.bam.sortbyname 2>&1 &

nohup bedtools bamtofastq  -i S1_T_SRR1273943.sortedByName.bam -fq S1_T_1.fq -fq2 S1_T_2.fq > log_S1_sortedbyname 2>&1 &

The FASTQ files produced from the name-sorted BAM still caused an error in downstream analysis:

reads: 0 |ERROR: The mate1 read name did not match the mate2 read name. Resynchronization support needs to be implemented.
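One way to narrow down an error like this is to compare the read names of the two FASTQ files record by record. The following is only a minimal sketch: the function name check_mate_names is made up for illustration, and it assumes plain (uncompressed) four-line FASTQ records whose names may carry a trailing /1 or /2 and an optional comment after whitespace.

```shell
# check_mate_names FQ1 FQ2  (hypothetical helper, not part of samtools or bedtools)
# Prints the first record whose names differ and returns non-zero in that case.
check_mate_names() {
    fq1=$1
    fq2=$2
    # paste puts line N of both files side by side, separated by a tab.
    # FASTQ header lines are lines 1, 5, 9, ... of each file.
    paste "$fq1" "$fq2" | awk -F '\t' '
        # Reduce a header to the bare read name: drop the leading "@",
        # any comment after whitespace, and a trailing /1 or /2.
        function norm(h) {
            sub(/^@/, "", h)
            sub(/[ \t].*$/, "", h)
            sub(/\/[12]$/, "", h)
            return h
        }
        NR % 4 == 1 {
            n++
            if (norm($1) != norm($2)) {
                printf "record %d: %s != %s\n", n, $1, $2
                bad = 1
                exit
            }
        }
        END { exit bad }
    '
}
```

For the files above this could be called as check_mate_names S1_T_1.fq S1_T_2.fq. If the two files have different lengths, the shorter one yields empty names, so that case is also reported as a mismatch.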

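Temporary workaround: sort by read name first, then extract the reads. Open question: why does the FASTQ produced by bedtools bamtofastq from the name-sorted BAM fail downstream, while the FASTQ produced by samtools fastq has not failed so far?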

samtools sort -n bam | samtools fastq -1 read_1.fq -2 read_2.fq -s singleton.fq -

samtools fastq

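BAM files are generally sorted by coordinate. To find the paired reads, either sort the file by read name (this requires extra memory and storage), or, during extraction, keep the ID of the current read in memory and release it once the read with the matching ID from the other end is found.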
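The second approach can be sketched in awk on SAM text. This is an illustration of the idea only, not how samtools or bedtools implement it. The function name pair_sam_stream and its three output arguments are made up for this example; it assumes SAM records on standard input, skips unpaired, secondary and supplementary records, restores reverse-strand reads to their original orientation, and appends /1 or /2 to the names.

```shell
# pair_sam_stream OUT1 OUT2 SINGLETONS  (hypothetical helper; all three arguments required)
# Reads SAM text on stdin and writes mates to OUT1/OUT2 without sorting.
pair_sam_stream() {
    awk -F '\t' -v out1="$1" -v out2="$2" -v outs="$3" '
        # Test one FLAG bit with integer arithmetic (portable across awk variants).
        function bit(flag, value) { return int(flag / value) % 2 }
        function reverse(s,    i, r) {
            r = ""
            for (i = length(s); i >= 1; i--) r = r substr(s, i, 1)
            return r
        }
        function revcomp(s,    i, c, r) {
            r = ""
            for (i = length(s); i >= 1; i--) {
                c = toupper(substr(s, i, 1))
                c = (c in comp) ? comp[c] : "N"
                r = r c
            }
            return r
        }
        # Build one FASTQ record; reads aligned to the reverse strand (0x10)
        # are stored reverse-complemented, so undo that here.
        function fastq(name, flag, seq, qual,    suffix) {
            if (bit(flag, 16)) { seq = revcomp(seq); qual = reverse(qual) }
            suffix = bit(flag, 64) ? "/1" : "/2"
            return "@" name suffix "\n" seq "\n+\n" qual
        }
        BEGIN { comp["A"] = "T"; comp["C"] = "G"; comp["G"] = "C"; comp["T"] = "A" }
        /^@/ { next }                                  # SAM header lines
        !bit($2, 1) { next }                           # not a paired read
        bit($2, 256) || bit($2, 2048) { next }         # secondary or supplementary
        {
            rec = fastq($1, $2, $10, $11)
            if ($1 in pending) {
                # The mate was seen earlier: write both ends and free the memory.
                if (bit($2, 64)) { print rec > out1; print pending[$1] > out2 }
                else             { print pending[$1] > out1; print rec > out2 }
                delete pending[$1]
            } else {
                pending[$1] = rec                      # wait for the mate
            }
        }
        END { for (k in pending) print pending[k] > outs }   # reads with no mate
    '
}
```

A possible call would be samtools view -h in.bam | pair_sam_stream r_1.fq r_2.fq single.fq, where in.bam is a placeholder. Memory use grows with the number of reads whose mate has not been seen yet, which is usually small for a coordinate-sorted file because mates tend to map close to each other.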

Open question: how do read names in the BAM differ from read names in the FASTQ (mate1 read name vs. mate2 read name)?
