All test data used here come from the article: The transcriptional repressors VAL1 and VAL2 mediate genome-wide recruitment of the CHD3 chromatin remodeler PICKLE in Arabidopsis
https://academic.oup.com/plcell/article/34/10/3915/6648467?login=false
GEO accession: GSE186157 (https://www.ncbi.nlm.nih.gov/Traces/study/?acc=PRJNA772752&o=acc_s%3Aa)
Log in to the remote server:
ssh <name>@compute.biology.uwo.ca
1. Download the data: sra-toolkit
1.1 Install sra-toolkit
- Fetch the tar file from the canonical location at NCBI:
wget --output-document sratoolkit.tar.gz https://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/current/sratoolkit.current-ubuntu64.tar.gz
- Extract the contents of the tar file:
tar -vxzf sratoolkit.tar.gz
- For convenience (and to show you where the binaries are) append the path to the binaries to your PATH environment variable:
export PATH=$PATH:$PWD/sratoolkit.3.0.0-ubuntu64/bin
- Verify that the binaries will be found by the shell:
which fastq-dump
/usr/bin/fastq-dump
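The same check can be extended to every tool used in these notes before starting. This is a minimal sketch; `check_tools` is a hypothetical helper name and is not part of sra-toolkit or any other package.

```shell
# check_tools: hypothetical helper. Prints each command that is not on PATH
# and returns non-zero if at least one of them is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: check_tools fastq-dump fastp bowtie2
```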
1.2 Download the raw data
fastq-dump --gzip --split-files --split-3 SRR16487406 -O ChIP-seq_analysis/rawdata/
fastq-dump --gzip --split-files --split-3 SRR16487408 -O ChIP-seq_analysis/rawdata/
fastq-dump --gzip --split-files --split-3 SRR16487410 -O ChIP-seq_analysis/rawdata/
fastq-dump --gzip --split-files --split-3 SRR16487412 -O ChIP-seq_analysis/rawdata/
fastq-dump --gzip --split-files --split-3 SRR16487415 -O ChIP-seq_analysis/rawdata/
fastq-dump --gzip --split-files --split-3 SRR16487418 -O ChIP-seq_analysis/rawdata/
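The six commands above differ only in the accession, so they can also be generated from a list. The sketch below prints the commands rather than running them, so the list can be reviewed first; `print_download_cmds` is a hypothetical helper name, and the flags and output directory are copied from the commands above.

```shell
# print_download_cmds: hypothetical helper that prints one fastq-dump
# command per accession instead of running it.
print_download_cmds() {
  outdir="ChIP-seq_analysis/rawdata/"
  for acc in SRR16487406 SRR16487408 SRR16487410 \
             SRR16487412 SRR16487415 SRR16487418; do
    echo "fastq-dump --gzip --split-files --split-3 $acc -O $outdir"
  done
}

print_download_cmds          # review the commands
# print_download_cmds | sh   # uncomment to actually run them
```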
# List the downloaded files
cd ChIP-seq_analysis/rawdata
ls -hl
fastp --thread 6 --report_title "fastp report for /mnt/d/IELTS/1/SRR2637925" -i /mnt/d/IELTS/1/SRR2637925.gz -o /mnt/d/IELTS/1/fastp_SRR2637925.gz -q 5 -u 50 -n 15
# Rename the files by sample
mv SRR16487406_1.fastq.gz PKL-GFP-INPUT_1.fastq.gz
mv SRR16487406_2.fastq.gz PKL-GFP-INPUT_2.fastq.gz
mv SRR16487408_1.fastq.gz PKL-GFP-IP-Rep1_1.fastq.gz
mv SRR16487408_2.fastq.gz PKL-GFP-IP-Rep1_2.fastq.gz
mv SRR16487410_1.fastq.gz PKL-GFP-IP-Rep2_1.fastq.gz
mv SRR16487410_2.fastq.gz PKL-GFP-IP-Rep2_2.fastq.gz
mv SRR16487412_1.fastq.gz PKL-GFP-val1_val2-INPUT_1.fastq.gz
mv SRR16487412_2.fastq.gz PKL-GFP-val1_val2-INPUT_2.fastq.gz
mv SRR16487415_1.fastq.gz PKL-GFP-val1_val2-IP-Rep1_1.fastq.gz
mv SRR16487415_2.fastq.gz PKL-GFP-val1_val2-IP-Rep1_2.fastq.gz
mv SRR16487418_1.fastq.gz PKL-GFP-val1_val2-IP-Rep2_1.fastq.gz
mv SRR16487418_2.fastq.gz PKL-GFP-val1_val2-IP-Rep2_2.fastq.gz
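The renaming above can also be driven by a small accession-to-sample table, which keeps the mapping in one place. This is a sketch under the assumption that the files follow the `<accession>_1.fastq.gz` / `<accession>_2.fastq.gz` pattern shown above; `rename_samples` is a hypothetical helper name.

```shell
# rename_samples: hypothetical helper. Renames <accession>_1/2.fastq.gz
# in directory $1 to <sample>_1/2.fastq.gz; files that are absent are skipped.
rename_samples() {
  dir="$1"
  while read -r acc name; do
    for mate in 1 2; do
      src="$dir/${acc}_${mate}.fastq.gz"
      if [ -e "$src" ]; then
        mv "$src" "$dir/${name}_${mate}.fastq.gz"
      fi
    done
  done <<'EOF'
SRR16487406 PKL-GFP-INPUT
SRR16487408 PKL-GFP-IP-Rep1
SRR16487410 PKL-GFP-IP-Rep2
SRR16487412 PKL-GFP-val1_val2-INPUT
SRR16487415 PKL-GFP-val1_val2-IP-Rep1
SRR16487418 PKL-GFP-val1_val2-IP-Rep2
EOF
}

# Example, run from inside the rawdata directory: rename_samples .
```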
2. Mapping to genome: bowtie2
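2.1 Download the index
A prebuilt index can be downloaded directly from the bowtie2 website; if none is available for your genome, you can build one yourself.
https://bowtie-bio.sourceforge.net/bowtie2/index.shtml
Arabidopsis index link: A. thaliana, TAIR10, Ensembl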
https://genome-idx.s3.amazonaws.com/bt/TAIR10.zip
# Download the TAIR10 index
wget https://genome-idx.s3.amazonaws.com/bt/TAIR10.zip
# Unzip it
unzip TAIR10.zip
Building the index yourself
bowtie2-build --threads 10 -f genome.fa Arabidopsis
# genome.fa: the genome FASTA file
# Arabidopsis: the name (prefix) of the index to build
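Whether the index was downloaded or built, it is worth confirming that all of its files are present before mapping. A standard (small) bowtie2 index consists of six files sharing one prefix; the sketch below checks for them. `index_complete` is a hypothetical helper name, and the check assumes a small index (`.bt2`), not a large one (`.bt2l`).

```shell
# index_complete: hypothetical helper. Returns 0 only if all six files of a
# small bowtie2 index with prefix $1 exist; reports any that are missing.
index_complete() {
  prefix="$1"
  ok=0
  for suffix in 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2 rev.2.bt2; do
    if [ ! -e "$prefix.$suffix" ]; then
      echo "missing: $prefix.$suffix" >&2
      ok=1
    fi
  done
  return "$ok"
}

# Example: index_complete Arabidopsis
```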
2.2 Mapping
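Every sample in this step is mapped with the same bowtie2 invocation; only the sample name changes. Generating the commands from the sample names is one way to avoid copy-paste slips in the `-1`/`-2` arguments. The sketch below prints one command per sample, assuming the file names produced by the renaming step and the index and output paths used in this section; `print_mapping_cmds` is a hypothetical helper name.

```shell
# print_mapping_cmds: hypothetical helper that prints one paired-end bowtie2
# command per sample instead of running it.
print_mapping_cmds() {
  index="ChIP-seq_analysis/bowtie2/index/TAIR10/TAIR10"
  raw="ChIP-seq_analysis/rawdata"
  out="ChIP-seq_analysis/bowtie2"
  for s in PKL-GFP-INPUT PKL-GFP-IP-Rep1 PKL-GFP-IP-Rep2 \
           PKL-GFP-val1_val2-INPUT PKL-GFP-val1_val2-IP-Rep1 \
           PKL-GFP-val1_val2-IP-Rep2; do
    # The alignment summary that bowtie2 writes to stderr goes to <sample>.txt
    printf '%s\n' "bowtie2 -p 6 -x $index -1 $raw/${s}_1.fastq.gz -2 $raw/${s}_2.fastq.gz -S $out/$s.sam 2>$out/$s.txt"
  done
}

print_mapping_cmds          # review the commands
# print_mapping_cmds | sh   # uncomment to actually run them
```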
bowtie2 -p 6 -x ChIP-seq_analysis/bowtie2/index/TAIR10/TAIR10 -1 ChIP-seq_analysis/rawdata/PKL-GFP-INPUT_1.fastq.gz -2 ChIP-seq_analysis/rawdata/PKL-GFP-INPUT_2.fastq.gz -S ChIP-seq_analysis/bowtie2/PKL-GFP-INPUT.sam 2>ChIP-seq_analysis/bowtie2/PKL-GFP-INPUT.txt
bowtie2 -p 6 -x ChIP-seq_analysis/bowtie2/index/TAIR10/TAIR10 -1 ChIP-seq_analysis/rawdata/PKL-GFP-IP-Rep1_1.fastq.gz -2 ChIP-seq_analysis/rawdata/PKL-GFP-IP-Rep1_2.fastq.gz -S ChIP-seq_analysis/bowtie2/PKL-GFP-IP-Rep1.sam 2>ChIP-seq_analysis/bowtie2/PKL-GFP-IP-Rep1.txt
bowtie2 -p 6 -x ChIP-seq_analysis/bowtie2/index/TAIR10/TAIR10 -1 ChIP-seq_analysis/rawdata/PKL-GFP-IP-Rep2_1.fastq.gz -2 ChIP-seq_analysis/rawdata/PKL-GFP-IP-Rep2_2.fastq.gz -S ChIP-seq_analysis/bowtie2/PKL-GFP-IP-Rep2.sam 2>ChIP-seq_analysis/bowtie2/PKL-GFP-IP-Rep2.txt
bowtie2 -p 6 -x ChIP-seq_analysis/bowtie2/index/TAIR10/TAIR10 -1 ChIP-seq_analysis/rawdata/PKL-GFP-val1_val2-INPUT_1.fastq.gz -2 ChIP-seq_analysis/rawdata/PKL-GFP-val1_val2-INPUT_2.fastq.gz -S ChIP-seq_analysis/bowtie2/PKL-GFP-val1_val2-INPUT.sam 2>ChIP-seq_analysis/bowtie2/PKL-GFP-val1_val2-INPUT.txt
bowtie2 -p 6 -x ChIP-seq_analysis/bowtie2/index/TAIR10/TAIR10 -1 ChIP-seq_analysis/rawdata/PKL-GFP-val1_val2-IP-Rep1_1.fastq.gz -2 ChIP-seq_analysis/rawdata/PKL-GFP-val1_val2-IP-Rep1_2.fastq.gz -S ChIP-seq_analysis/bowtie2/PKL-GFP-val1_val2-IP-Rep1.sam 2>ChIP-seq_analysis/bowtie2/PKL-GFP-val1_val