Lab Notes | Strategies for Improving Run Time (3)

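The author ran into problems using htseq-count for gene expression counting and repeatedly tried switching tools and adjusting parameters, focusing on GTF labels, sorting, and indexing. The post records the error messages and the attempted fixes in detail, including trying different feature types and sorting the files.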


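I have made up my mind to get this finished, so here is a summary of where things stand:
(1) Switch back to htseq-count for the counting step, and look more closely at how these existing tools actually do the computation. I am determined to solve this completely!! (No giving up!)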

(base) [xxzhang@mu02 chr1]$ htseq-count -f bam result_chr1.bam hg38.gtf >counts2.txt
[E::idx_find_and_load] Could not retrieve index file for 'result_chr1.bam'
  [Errno 2] No such file or directory: 'hg38.gtf'
  [Exception type: FileNotFoundError, raised in utils.py:38]
samtools index -b result_chr1.bam
Warning: No features of type 'exon' found.
Warning: Read A00928:207:HYLCHDSXY:2:1442:26793:7654 claims to have an aligned mate which could not be found in an adjacent line.
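The warning `No features of type 'exon' found` means that no row in the annotation carried `exon` in its feature-type column (column 3 of a GTF file). A minimal sketch for checking which feature types a GTF actually contains is shown here; the function name `count_feature_types` and the two demo rows are made up for illustration and are not from the original run.

```python
from collections import Counter
import io


def count_feature_types(lines):
    """Count the values in GTF column 3 (feature type), skipping comments and blank lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts


# Hypothetical two-row GTF, for illustration only.
demo = io.StringIO(
    'chr1\trmsk\ttranscript\t100\t200\t.\t+\t.\tgene_id "L1"; transcript_id "L1_dup1";\n'
    'chr1\trmsk\texon\t100\t200\t.\t+\t.\tgene_id "L1"; transcript_id "L1_dup1";\n'
)
print(count_feature_types(demo))
```

On a real annotation the same function can be given an open file handle, for example `with open("hg38.gtf") as fh: print(count_feature_types(fh))`. If `exon` does not appear in the result, the `-t` option of htseq-count can be pointed at a feature type that does.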

Strangely, I ran into the same problem as before:

4700000 GFF lines processed.
4800000 GFF lines processed.
4900000 GFF lines processed.
5000000 GFF lines processed.
5100000 GFF lines processed.
5200000 GFF lines processed.
5300000 GFF lines processed.
start too small
  [Exception type: IndexError, raised in _HTSeq.pyx:376]
htseq-count -f bam result_chr1.bam repeatfamily_v3.gtf >count4.txt

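This dataset failed at the same place again, and the cause is still unclear. I suspect it is still a label (attribute) problem, or that I did not sort the GTF file. Very frustrating.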
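Besides labels and sorting, one more thing may be worth ruling out: the coordinates themselves. GTF coordinates are 1-based, so a row whose start is 0 (which can happen when an annotation is converted from a 0-based format) is a plausible trigger for a `start too small` IndexError. This is an assumption, not something confirmed in these notes. The sketch below scans a GTF for suspicious rows; the function name `find_bad_coordinates` and its `limit` parameter are hypothetical.

```python
def find_bad_coordinates(lines, limit=10):
    """Return up to `limit` tuples (line_number, reason, text) for suspicious GTF rows."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue
        text = line.rstrip("\n")
        fields = text.split("\t")
        if len(fields) < 9:
            problems.append((number, "fewer than 9 columns", text))
        else:
            try:
                start, end = int(fields[3]), int(fields[4])
            except ValueError:
                problems.append((number, "non-integer coordinate", text))
            else:
                if start < 1:
                    # GTF is 1-based, so 0 or a negative start is out of range.
                    problems.append((number, "start < 1", text))
                elif end < start:
                    problems.append((number, "end < start", text))
        if len(problems) >= limit:
            break
    return problems
```

Usage would look like `with open("repeatfamily_v3.gtf") as fh: print(find_bad_coordinates(fh))`. An empty list means none of these particular problems were found, and the cause lies elsewhere.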

5000000 GFF lines processed.
5100000 GFF lines processed.
5200000 GFF lines processed.
5300000 GFF lines processed.
  start too small
  [Exception type: IndexError, raised in _HTSeq.pyx:376]

What on earth is the cause?
This time the error showed up earlier.

(base) [xxzhang@fat02 hg38]$ htseq-count -f bam result_chr1.bam repeatfamily_v4.gtf >counts3.txt
100000 GFF lines processed.
200000 GFF lines processed.
300000 GFF lines processed.
400000 GFF lines processed.
  start too small
  [Exception type: IndexError, raised in _HTSeq.pyx:376]


Try changing exon to CDS and see what happens.

 htseq-count -f bam -t CDS result_chr1.bam repeatfamily_v5.gtf >counts3.txt
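Before switching the feature type with `-t`, it may help to confirm that the GTF has rows of that type and that those rows carry the ID attribute htseq-count groups by (`gene_id` by default, set with `-i`). The following is a rough sketch under those assumptions; `summarize_id_attribute` is a hypothetical helper, and its regular expression is a simplified reading of the GTF attribute column rather than the parser HTSeq itself uses.

```python
import re


def summarize_id_attribute(lines, feature_type="exon", id_attr="gene_id"):
    """Return (rows with feature_type, how many of those rows lack id_attr)."""
    # Match the attribute name at the start of column 9 or right after a ';'.
    pattern = re.compile(r'(?:^|;)\s*' + re.escape(id_attr) + r'\s+"?[^";]+"?')
    matched = 0
    missing = 0
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != feature_type:
            continue
        matched += 1
        if not pattern.search(fields[8]):
            missing += 1
    return matched, missing
```

For example, `with open("repeatfamily_v5.gtf") as fh: print(summarize_id_attribute(fh, feature_type="CDS"))`. A first value of 0 would mean the file has no `CDS` rows at all, so `-t CDS` would again find no features.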
