Data storage
alevin-fry data:
/nfs_data/lidx/project/rna_velocity/fry_a/data/SRR9201794_1.fastq.gz
/nfs_data/lidx/project/rna_velocity/fry_a/data/SRR9201794_2.fastq.gz
An introduction to RNA-velocity using alevin-fry (combine-lab.github.io)
Data used in this run:
/nfs_data/lidx/project/rna_velocity/NB/data/SRR10156295_R1.fastq.gz
/nfs_data/lidx/project/rna_velocity/NB/data/SRR10156295_R2.fastq.gz
GEO Accession viewer (nih.gov)
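Since the FASTQ paths are typed again later in the mapping step, a quick pre-flight check can catch a misnamed or missing file early. The following is only a minimal sketch: `require_files` is a hypothetical helper name (it is not part of salmon or alevin-fry), and it only tests that each path exists and is non-empty.

```shell
# Hypothetical helper: report every input file that is missing or empty.
# Returns 0 if all files are present and non-empty, 1 otherwise.
require_files() {
    local rc=0
    local f
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

For example, `require_files /nfs_data/lidx/project/rna_velocity/fry_a/data/SRR9201794_1.fastq.gz /nfs_data/lidx/project/rna_velocity/fry_a/data/SRR9201794_2.fastq.gz || exit 1` would stop a script before any long-running step starts.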
# Code
# download data
cd project/rna_velocity/NB/data
prefetch SRR10156295
fastq-dump --split-files --gzip SRR10156295
mv SRR10156295_2.fastq.gz SRR10156295_R1.fastq.gz
mv SRR10156295_3.fastq.gz SRR10156295_R2.fastq.gz
# get ref_genome
mkdir ref_data
cd /nfs_data/database/ref_genomes/human_GRCh38p13/ens107/
cp Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz /nfs_data/lidx/project/rna_velocity/NB/data/ref_data
cp Homo_sapiens.GRCh38.107.gtf.gz /nfs_data/lidx/project/rna_velocity/NB/data/ref_data
cd /nfs_data/lidx/project/rna_velocity/NB/data/ref_data
gunzip Homo_sapiens.GRCh38.107.gtf.gz
gunzip Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
## Release 107 raised an error, so an older release (93) was used instead
wget ftp://ftp.ensembl.org/pub/release-93/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
gunzip Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
wget ftp://ftp.ensembl.org/pub/release-93/gtf/homo_sapiens/Homo_sapiens.GRCh38.93.gtf.gz
gunzip Homo_sapiens.GRCh38.93.gtf.gz
# make the splici reference with pyroe
cd ..
pyroe make-splici \
ref_data/Homo_sapiens.GRCh38.dna.primary_assembly.fa \
ref_data/Homo_sapiens.GRCh38.93.gtf \
151 GRCh38_splici_fl146 \
--flank-trim-length 5 --filename-prefix splici
#### The release 107 GTF raised an error here, so release 93 was used instead
# make gene ID to gene name mapping file
gffread ref_data/Homo_sapiens.GRCh38.93.gtf -o ref_data/Homo_sapiens.GRCh38.93.gff
grep "gene_name" ref_data/Homo_sapiens.GRCh38.93.gff | cut -f9 | cut -d';' -f2,3 | sed 's/=/ /g' | sed 's/;/ /g' | cut -d' ' -f2,4 | sort | uniq > GRCh38_geneid_to_name.txt
# Building the salmon/alevin splici index
cd ..
salmon index -t data/GRCh38_splici_fl146/splici_fl146.fa -i GRCh38_splici_fl146_idx -p 16
# Mapping the data to obtain a RAD file
salmon alevin -i GRCh38_splici_fl146_idx -p 16 -l ISR --chromium --sketch \
-1 data/SRR10156295_R1.fastq.gz \
-2 data/SRR10156295_R2.fastq.gz \
-o NB_map
# Processing the mapped reads
alevin-fry generate-permit-list -d fw -k -i NB_map -o NB_quant
alevin-fry collate -t 16 -i NB_quant -r NB_map
alevin-fry quant -t 16 -i NB_quant -o NB_quant_res --tg-map data/GRCh38_splici_fl146/splici_fl146_t2g_3col.tsv --resolution cr-like --use-mtx
alevin-fry generate-permit-list -d fw -k -i NB_map -o NB_quant1
alevin-fry collate -t 16 -i NB_quant1 -r NB_map
alevin-fry quant -t 16 -i NB_quant1 -o NB_quant1_res --tg-map data/GRCh38_splici_fl146/splici_fl146_t2g_3col.tsv --resolution cr-like --use-mtx
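With `--use-mtx`, the counts are written in Matrix Market format. Because the index was built from a splici reference, the matrix is expected to carry spliced, unspliced and ambiguous counts for each gene, so the column count should be a multiple of three. A minimal sketch for reading the matrix shape follows, assuming the matrix is at `NB_quant_res/alevin/quants_mat.mtx`. `mtx_dims` is a hypothetical helper name, not an alevin-fry command.

```shell
# Hypothetical helper: print "rows cols nonzeros" from a Matrix Market file.
# Lines starting with '%' are the banner and comments; the first
# remaining line holds the three size fields.
mtx_dims() {
    awk '!/^%/ { print $1, $2, $3; exit }' "$1"
}
```

Running `mtx_dims NB_quant_res/alevin/quants_mat.mtx` prints the number of cells (rows), the number of feature columns, and the number of non-zero entries on one line.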
# Explanation
alevin-fry dataset
![](https://i-blog.csdnimg.cn/blog_migrate/7ee1de5c6293623c018f5b83c75bf9a1.png)
![](https://i-blog.csdnimg.cn/blog_migrate/678b87288c7b21aa2c59594bb4dcf7ee.png)
![](https://i-blog.csdnimg.cn/blog_migrate/d0add13e7df6a782f1b30382f7d398e4.png)
![](https://i-blog.csdnimg.cn/blog_migrate/8da9f6413a63a4d092674749cbecd4bc.png)
![](https://i-blog.csdnimg.cn/blog_migrate/4757366607c113813259ce4063fd06c0.png)
Data used
![](https://i-blog.csdnimg.cn/blog_migrate/071f652feb4bca95f59bff4760f4c2bb.png)
![](https://i-blog.csdnimg.cn/blog_migrate/0c024b39f8dd0dc2fb12765bdc454c97.png)
![](https://i-blog.csdnimg.cn/blog_migrate/946977be1e1580d1498cfd7d7f64036a.png)
![](https://i-blog.csdnimg.cn/blog_migrate/f51751dea56d26ebd05eea26f8f3bac7.png)
Results
![](https://i-blog.csdnimg.cn/blog_migrate/837114b187b73de29c2049ed0b608e78.png)
According to cellranger, this number should be 4066.
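To compare against the cellranger figure of 4066, the barcodes retained by alevin-fry can be counted from the row labels written next to the matrix. This is a minimal sketch, assuming `quants_mat_rows.txt` holds one barcode per line; `count_cells` and `compare_cells` are hypothetical helper names.

```shell
# Hypothetical helper: count non-empty lines (one barcode per line).
count_cells() {
    grep -c . "$1"
}

# Hypothetical helper: print the observed count, the expected count,
# and their difference (observed minus expected).
compare_cells() {
    local observed
    observed=$(count_cells "$1")
    echo "observed=$observed expected=$2 diff=$((observed - $2))"
}
```

A call such as `compare_cells NB_quant_res/alevin/quants_mat_rows.txt 4066` shows how far the knee-based filtering is from the cellranger estimate. If the gap is large, `alevin-fry generate-permit-list --help` lists the other filtering modes that can be tried in place of `-k`.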
![](https://i-blog.csdnimg.cn/blog_migrate/072a1d7a57ff7b2ed01ff8fc96ae75ec.png)
![](https://i-blog.csdnimg.cn/blog_migrate/a20c0e0e462e8e490c90c39c70581e86.png)
![](https://i-blog.csdnimg.cn/blog_migrate/00722b466ad59ad8073e6eeb6915df4c.png)