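DGE analysis workflow
- Understand the considerations involved in performing statistical analysis on RNA-seq data
- Start from gene counts (after alignment and counting)
- Perform QC on the count data
- Use DESeq2 to run differential expression analysis on the count data and obtain a list of significantly differentially expressed genes
- Visualize the analysis results
- Perform functional analysis on the list of differentially expressed genes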
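R refresher Q&A
Learning objectives
- Describe the data types and data structures used in R (including tibbles)
- Use functions in R and describe how to get help on their arguments
- Describe how to install and use packages in R
- Use the pipe (%>%) from the dplyr package
- Describe the ggplot2 syntax for plotting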
Setting up
- Let’s create a new project directory for this review:
- Create a new project called R_refresher
- Create a new R script called reviewing_R.R
- Create the following folders in the project directory: data, figures
- Download a counts file to the data folder by right-clicking here
Answer: create a new Project in RStudio and save the R script inside it.
To create the folders, run:
> dir.create('data')
> dir.create('figures')
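Calling dir.create() on a folder that already exists produces a warning, which becomes noticeable when the script is re-run. A minimal sketch of a re-runnable version is shown below; it assumes the working directory is the project root, and the loop variable name folder is only illustrative.

```r
# Create the project sub-folders only if they are not already present,
# so the script can be re-run without warnings.
for (folder in c("data", "figures")) {
  if (!dir.exists(folder)) {
    dir.create(folder)
  }
}

# Confirm that both folders now exist (one TRUE/FALSE value per folder)
dir.exists(c("data", "figures"))
```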
- Now that we have our directory structure set up, let’s load our libraries and read in our data:
- Load the tidyverse library.
- Use read.csv() to read in the downloaded file and save it in the object/variable counts
- What is the syntax for a function?
- How do we get help for using a function?
- What is the data structure of counts?
- What main data structures are available in R?
- What are the data types of the columns?
- What data types are available in R?
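One way to explore the questions above is sketched here. It uses a small made-up data frame, toy_counts (a hypothetical stand-in, not the downloaded counts file), so it runs without any data on disk. The same calls should apply to counts once it has been read in.

```r
# A small made-up data frame standing in for the real counts table
# (gene names and values are illustrative only)
toy_counts <- data.frame(
  KO1 = c(5424.67, 0, 689.51),
  KO2 = c(5957.84, 0, 724.08),
  row.names = c("geneA", "geneB", "geneC")
)

# Function syntax: function_name(argument1, argument2 = value, ...)
head(toy_counts, n = 2)

# Getting help: ?read.csv opens the help page in an interactive session;
# args() lists the arguments a function accepts
args(read.csv)

# Data structure of the object
class(toy_counts)

# Compact overview: dimensions plus the data type of every column
str(toy_counts)

# Data type of each column, returned as a named character vector
sapply(toy_counts, class)
```

For reference, the main data structures in base R are vectors, factors, matrices, data frames, and lists, and the basic data types are numeric, integer, character, logical, complex, and raw.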
Answer:
> library('tidyverse')
> counts <- read.csv('./data/raw_counts_mouseKO.csv')
> head(counts)
KO1
ENSMUSG00000000001 5424.6698
ENSMUSG00000000003 0.0000
ENSMUSG00000000028 689.5080
ENSMUSG00000000031 0.0000
ENSMUSG00000000037 282.2818
ENSMUSG00000000049 0.0000
KO2
ENSMUSG00000000001 5957.837616
ENSMUSG00000000003 0.000000
ENSMUSG00000000028 724.080013
ENSMUSG00000000031 8.581689
ENSMUSG00000000037 217.760359
ENSMUSG00000000049 33.254045
KO3
ENSMUSG00000000001 5451.39192
ENSMUSG00000000003 0.00000
ENSMUSG00000000028 618.82944
ENSMUSG00000000031 0.00000
ENSMUSG00000000037 210.84479
ENSMUSG00000000049 5.27112
KO4
ENSMUSG00000000001 4182.126698
ENSMUSG00000000003 0.00000