R Data Structures
1. Vectors: a single vector cannot mix data of different modes.
a<-c(1,2,3,4,5)
b<-c("a","b","c","d")
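To see what "cannot mix modes" means in practice, here is a minimal sketch (the names `mixed` and `nums` are illustrative and not part of the notes above). When values of different modes are passed to c(), R does not raise an error; it coerces everything to one common mode.

```r
# Number, string, and logical together: everything becomes character
mixed <- c(1, "a", TRUE)
mode(mixed)   # "character"
mixed         # "1" "a" "TRUE"

# Numbers and logicals together: TRUE/FALSE become 1/0
nums <- c(1, TRUE, FALSE)
mode(nums)    # "numeric"
nums          # 1 1 0
```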
2. Matrices: a matrix is a two-dimensional array, and all of its elements must likewise be of the same type.
> y<-matrix(1:20,nrow=5,ncol=4)
> y
     [,1] [,2] [,3] [,4]
[1,]    1    6   11   16
[2,]    2    7   12   17
[3,]    3    8   13   18
[4,]    4    9   14   19
[5,]    5   10   15   20
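The printout shows that matrix() fills column by column by default. As a small sketch of how elements are then addressed, and of the row-wise alternative (`y_row` is an illustrative name, not from the notes):

```r
y <- matrix(1:20, nrow = 5, ncol = 4)   # filled column by column (the default)
dim(y)        # 5 4
y[2, 3]       # row 2, column 3: 12
y[, 4]        # the whole fourth column: 16 17 18 19 20

# byrow = TRUE fills row by row instead
y_row <- matrix(1:20, nrow = 5, ncol = 4, byrow = TRUE)
y_row[2, 3]   # 7
```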
3. Arrays: arrays are similar to matrices but can have more than two dimensions.
> dim1<-c("a1","a2")
> dim2<-c("b1","b2","b3")
> dim3<-c("c1","c2","c3","c4")
> z<-array(1:24,c(2,3,4),dimnames=list(dim1,dim2,dim3))
> z
, , c1

   b1 b2 b3
a1  1  3  5
a2  2  4  6

, , c2

   b1 b2 b3
a1  7  9 11
a2  8 10 12

, , c3

   b1 b2 b3
a1 13 15 17
a2 14 16 18

, , c4

   b1 b2 b3
a1 19 21 23
a2 20 22 24
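Elements of an array can be selected by position or by the dimension names supplied above. A short sketch, rebuilding the same `z` so the snippet runs on its own (`slice` is an illustrative name):

```r
dim1 <- c("a1", "a2")
dim2 <- c("b1", "b2", "b3")
dim3 <- c("c1", "c2", "c3", "c4")
z <- array(1:24, c(2, 3, 4), dimnames = list(dim1, dim2, dim3))

dim(z)                # 2 3 4
z["a1", "b2", "c3"]   # select by name: 15
z[2, 3, 4]            # select by position: 24

# Leaving an index empty keeps that whole dimension;
# one slice of a 3-D array is an ordinary matrix
slice <- z[, , "c1"]
is.matrix(slice)      # TRUE
```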
4. Data frames: different columns of a data frame can hold different types of data.
> patientid<-c(1,2,3,4)
> age<-c(21,22,33,42)
> diabetes<-c("type1","type2","type3","type4")
> status<-c("poor","improved","good","poor")
> patientdata<-data.frame(patientid,age,diabetes,status)
> patientdata
  patientid age diabetes   status
1         1  21    type1     poor
2         2  22    type2 improved
3         3  33    type3     good
4         4  42    type4     poor
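As a quick illustration of how the columns keep their own types and how parts of the data frame are selected, here is a hedged sketch that rebuilds the same `patientdata` inline (`is_num` and `sub` are illustrative names):

```r
patientdata <- data.frame(
  patientid = c(1, 2, 3, 4),
  age       = c(21, 22, 33, 42),
  diabetes  = c("type1", "type2", "type3", "type4"),
  status    = c("poor", "improved", "good", "poor")
)

# Each column has its own type: two numeric, two non-numeric
is_num <- sapply(patientdata, is.numeric)
is_num

patientdata$age                               # one column as a vector
sub <- patientdata[c("diabetes", "status")]   # several columns, still a data frame
patientdata[2, ]                              # second row
```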
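Three R functions are worth mentioning here: attach(), detach(), and with(). When referring to variables inside a data frame, they let you avoid typing the data frame's name each time. For example,
> summary(mtcars$mpg)
is equivalent to
> attach(mtcars)
> summary(mpg)
> detach(mtcars)
and to
> with(mtcars,{summary(mpg)})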
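One practical difference between the two approaches is worth a short sketch. If an object with the same name already exists in the workspace, attach() does not override it, whereas with() looks in the data frame first. The global `mpg` below is a made-up vector used only to show the name clash; `n_attach` and `n_with` are illustrative names.

```r
mpg <- c(25, 36, 47)          # hypothetical global object that shares a column name

attach(mtcars)                # R reports that mpg is masked by the global environment
n_attach <- length(mpg)       # 3: the global vector is found, not mtcars$mpg
detach(mtcars)

n_with <- with(mtcars, length(mpg))   # 32: the column inside mtcars is used

# Without a name clash the two forms agree
same <- identical(summary(mtcars$mpg), with(mtcars, summary(mpg)))
```

For this reason with() is often the safer choice in scripts, since it does not depend on what else happens to be defined in the session.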
5. Factors: nominal (categorical) and ordinal (ordered categorical) variables are called factors in R.
# First, enter the data as vectors
> patientid<-c(1,2,3,4)
> age<-c(25,34,28,52)
> diabetes<-c("type1","type2","type2","type1")
> status<-c("poor","improved","excellent","poor")
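# Since R 4.0.0, data.frame() no longer converts strings to factors by default, so request it explicitly
> patientdata<-data.frame(patientid,age,diabetes,status,stringsAsFactors=TRUE)
# Display the structure of the object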
> str(patientdata)
'data.frame':	4 obs. of  4 variables:
 $ patientid: num  1 2 3 4
 $ age      : num  25 34 28 52
 $ diabetes : Factor w/ 2 levels "type1","type2": 1 2 2 1
 $ status   : Factor w/ 3 levels "excellent","improved",..: 3 2 1 3
# Display a statistical summary of the object
> summary(patientdata)
   patientid         age       
 Min.   :1.00   Min.   :25.00  
 1st Qu.:1.75   1st Qu.:27.25  
 Median :2.50   Median :31.00  
 Mean   :2.50   Mean   :34.75  
 3rd Qu.:3.25   3rd Qu.:38.50  
 Max.   :4.00   Max.   :52.00  
  diabetes       status 
 type1:2    excellent:1  
 type2:2    improved :1  
            poor     :2  
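The str() output shows that factor levels are sorted alphabetically by default, so "excellent" gets code 1 and "poor" gets code 3. For an ordinal variable that order is usually not the intended one. A minimal sketch of setting the order explicitly with factor() (`status_f` and `status_o` are illustrative names):

```r
status <- c("poor", "improved", "excellent", "poor")

# Default: unordered factor, levels sorted alphabetically
status_f <- factor(status)
levels(status_f)        # "excellent" "improved" "poor"
as.integer(status_f)    # 3 2 1 3

# Ordered factor with the levels given from worst to best
status_o <- factor(status, ordered = TRUE,
                   levels = c("poor", "improved", "excellent"))
as.integer(status_o)    # 1 2 3 1
status_o[2] > status_o[1]   # TRUE: "improved" ranks above "poor"
```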
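6. Lists: lists are the most complex of R's data types. A list is an ordered collection of objects, and a single list may combine vectors, matrices, data frames, and even other lists.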
> a<-"this is a story"
> b<-c(1,2,3,5)
> j<-matrix(1:20,nrow=4)
> k<-c("one","two","three")
> mylist<-list(title=a,age=b,j,k)
> mylist
$title
[1] "this is a story"

$age
[1] 1 2 3 5

[[3]]
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    9   13   17
[2,]    2    6   10   14   18
[3,]    3    7   11   15   19
[4,]    4    8   12   16   20

[[4]]
[1] "one"   "two"   "three"
> mylist[[2]]
[1] 1 2 3 5
> mylist['age']
$age
[1] 1 2 3 5
> mylist[2]
$age
[1] 1 2 3 5
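The three calls above look alike but return different things: double brackets extract the component itself, while single brackets return a list containing that component. A brief sketch that rebuilds the same `mylist` and checks the difference (`inner` and `outer` are illustrative names):

```r
mylist <- list(title = "this is a story",
               age   = c(1, 2, 3, 5),
               matrix(1:20, nrow = 4),
               c("one", "two", "three"))

inner <- mylist[[2]]    # the numeric vector itself
outer <- mylist[2]      # a list of length 1 that holds the vector
class(inner)            # "numeric"
class(outer)            # "list"

# Named components can also be reached with $ or [["name"]]
identical(mylist$age, mylist[["age"]])   # TRUE
sum(mylist[["age"]])                     # 11; sum(mylist["age"]) would fail on a list

names(mylist)           # "title" "age" "" "" (unnamed components get empty names)
```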