II. ChIA-PET2, Part 3: Input Data and Errors

很酷的女超人

Categories: ChIA-PET2, System Errors. Tags: linux

Article link: https://blog.csdn.net/viola_/article/details/117047114

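Contents

I. Build the BWA genome index
II. Locate the hg19 genome file
III. Download the CTCF ChIA-PET data for the K562 cell line from ENCODE
- 1. Download the data
- 2. Decompress the data
IV. Run ChIA-PET
V. Errors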

ChIA-PET2 -g hg19.fa -b human.hg19.genome -d 1 -f ENCFF000KYG.fastq -r ENCFF000KYK.fastq -o OUTdir12 -n index-1

I. Build the BWA genome index

1. Download the hg19 genome files in one step

wget -c ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/hg19/ucsc.hg19*

Place the downloaded files on the server, then build the index:

bwa index [ -p prefix ] [ -a algoType ] <in.db.fasta>  # usage as given in the official documentation

bwa index -a bwtsw hg19.fa  # the command actually run

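This produces five files: hg19.fa.amb, hg19.fa.ann, hg19.fa.bwt, hg19.fa.pac, and hg19.fa.sa.
"amb" is short for "ambiguous", meaning any character other than ATCG/atcg. The .amb file records where such non-ATCG bases occur in the genome, the .ann file holds the reference sequence names and lengths, and the .pac file stores the base sequence in a compact packed form.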
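
Since every later step depends on all five files being present, a quick check can save a failed run. The following is a minimal sketch, not part of BWA or ChIA-PET2: `check_bwa_index` is a hypothetical helper name, and it assumes the index was built without `-p`, so that the prefix is simply the FASTA path given to `bwa index`.

```shell
# Hypothetical helper: succeed only if all five BWA index files
# exist and are non-empty for the given prefix.
# The prefix is the FASTA path that was passed to `bwa index`.
check_bwa_index() (
    prefix=$1
    missing=0
    for ext in amb ann bwt pac sa; do
        if [ ! -s "$prefix.$ext" ]; then
            echo "missing or empty: $prefix.$ext" >&2
            missing=1
        fi
    done
    # Exit status of the function is the status of this test.
    [ "$missing" -eq 0 ]
)
```

For example, `check_bwa_index hg19.fa && echo "index looks complete"` could be run before starting the pipeline.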

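One thing to watch when building the index: bwa offers two index construction algorithms.

-a bwtsw does not work for short reference sequences; the reference must be at least 10 Mb.
-a is does not work for large reference sequences; the reference must be no larger than 2 GB. Older BWA documentation describes it as the default, while the usage message of bwa 0.7.17 lists the default as auto.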
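
The size rule above can be written down as a small helper. This is only a sketch under stated assumptions: `suggest_bwa_algo` and `fasta_bytes` are hypothetical names, the 10 Mb threshold is taken from the text above rather than read from BWA itself, and file size in bytes is used as a rough stand-in for sequence length.

```shell
# Hypothetical helper: suggest a `bwa index -a` value from a reference
# size in bytes, using the 10 Mb threshold quoted in the text above.
suggest_bwa_algo() {
    if [ "$1" -ge 10000000 ]; then
        echo bwtsw
    else
        echo is
    fi
}

# File size in bytes; this slightly overestimates the sequence length
# because it also counts header lines and newline characters.
fasta_bytes() {
    wc -c < "$1" | tr -d ' '
}
```

A possible use would be `bwa index -a "$(suggest_bwa_algo "$(fasta_bytes hg19.fa)")" hg19.fa`. For a whole human genome this selects bwtsw, which matches the command used in this post.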

2. The error

(base) [root@node01 chia-pet1]# ChIA-PET2 -g ucsc.hg19.fasta.pac -b human.hg19.genome -f 
[05-21 11:40:30] Start ChIA-PET2 V0.9.3  ...


[05-21 11:40:30] Running Step 1: Trim Linker ...

trimLinker -t 1 -m 0 -k 0 -e 0 -l 15 -o OUTdir -n index-1 -A GTTGGATAAG -B GTTGGAATGT ENC
thread is 1
mode is 0
keepempty is 0
Error allowed is 0
min length of trimmed read is 15
Output dir is OUTdir
Output name is index-1
linkerA is GTTGGATAAG
linkerB is GTTGGAATGT
Reads 1: ENCFF000KYB.fastq.gz
Reads 2: ENCFF000KYQ.fastq.gz
Processed 2000000 pair reads
(part of the output omitted)
1Empty PETs:	38719182
2Empty PETs:	3761387
Valid PETs:	142457250

[05-21 12:14:00] Running Step 2: BWA ...

bwa_wrap ucsc.hg19.fasta.pac OUTdir/index-1_1.valid.fastq 1 OUTdir/index-1_1.valid.sam 0
Running BWA on trimmed reads ...
bwa mem -t 1 ucsc.hg19.fasta.pac OUTdir/index-1_1.valid.fastq | samtools view -h -F 2048 
[E::bwa_idx_load_from_disk] fail to locate the index files
Exit: Please check input or Rerun step 2
(base) [root@node01 chia-pet1]# bwa

Program: bwa (alignment via Burrows-Wheeler transformation)
Version: 0.7.17-r1188
Contact: Heng Li <lh3@sanger.ac.uk>
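
A likely reading of this log: `bwa mem` treats its first argument as an index prefix and looks for files such as `<prefix>.bwt` next to it. Here `-g` was given `ucsc.hg19.fasta.pac`, which is itself one of the index files, so bwa would be searching for `ucsc.hg19.fasta.pac.bwt` and find nothing. The sketch below uses a hypothetical helper, `bwa_index_prefix`, to turn the path of an index file back into the prefix; it assumes the index files follow the default `<fasta>.<ext>` naming.

```shell
# Hypothetical helper: map the path of a single BWA index file to the
# index prefix that `bwa mem` (and ChIA-PET2's -g option) expects.
# Paths without a known index suffix are returned unchanged.
bwa_index_prefix() {
    case $1 in
        *.amb|*.ann|*.bwt|*.pac|*.sa) echo "${1%.*}" ;;
        *) echo "$1" ;;
    esac
}
```

With this, `bwa_index_prefix ucsc.hg19.fasta.pac` prints `ucsc.hg19.fasta`, while a plain FASTA path such as `hg19.fa` is passed through as-is.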

3. Solution

① Manually merge the 24 per-chromosome reference .fa files into a single file:

cat *.fa > hg19.fa

This yields the file hg19.fa.
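
Note that `cat *.fa` concatenates files in the shell's lexical glob order (chr1, chr10, chr11, ...), and on a re-run the glob would also match an existing hg19.fa in the same directory. If a fixed chromosome order is wanted, something like the following sketch could be used instead. The helper name `merge_chromosomes` and the `chr<N>.fa` file naming are assumptions for illustration, not taken from the original download.

```shell
# Hypothetical helper: concatenate per-chromosome FASTA files in
# chr1..chr22, chrX, chrY order.
# Assumes the inputs are named chr<N>.fa inside the given directory.
merge_chromosomes() (
    dir=$1
    out=$2
    chroms="1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y"

    # Refuse to write anything unless all 24 inputs are present.
    for c in $chroms; do
        if [ ! -f "$dir/chr$c.fa" ]; then
            echo "missing: $dir/chr$c.fa" >&2
            exit 1
        fi
    done

    : > "$out"    # start from an empty output file
    for c in $chroms; do
        cat "$dir/chr$c.fa" >> "$out"
    done
)
```

For example, `merge_chromosomes ./chroms hg19.fa` followed by `grep -c '^>' hg19.fa` should report 24 sequences if each input holds one chromosome.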

② Build the BWA index for hg19.fa:

bwa index -a bwtsw hg19.fa

This produces the five index files (hg19.fa.amb, hg19.fa.ann, hg19.fa.bwt, hg19.fa.pac, and hg19.fa.sa).
