-
1. Combining variables
① Vectors: the c() function
a<-c(c(1,2,3),c(3,4,5,5),c(34,3,5))
a
##  [1]  1  2  3  3  4  5  5 34  3  5
② Matrices / data frames: rbind() stacks them top to bottom; cbind() joins them side by side
a<-matrix(1:6,3)
b<-matrix(20:25,3)
rbind(a,b)
##      [,1] [,2]
## [1,]    1    4
## [2,]    2    5
## [3,]    3    6
## [4,]   20   23
## [5,]   21   24
## [6,]   22   25
cbind(a,b)
##      [,1] [,2] [,3] [,4]
## [1,]    1    4   20   23
## [2,]    2    5   21   24
## [3,]    3    6   22   25
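The heading above mentions data frames, but the example only uses matrices. Here is a minimal sketch of the same two functions on data frames. The data frames df_top, df_bottom and df_extra are made up for illustration and are not from the original notes. rbind() expects matching column names, and cbind() expects the same number of rows.

```r
# Hypothetical data frames, invented for illustration only
df_top    <- data.frame(name = c("a", "b"), weight = c(45, 56))
df_bottom <- data.frame(name = "c", weight = 64)
df_extra  <- data.frame(height = c(165, 167, 170))

# rbind() stacks rows; both data frames must have the same column names
stacked <- rbind(df_top, df_bottom)

# cbind() places columns side by side; both inputs must have the same number of rows
combined <- cbind(stacked, df_extra)
combined
```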
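③ Data frames: merge() joins horizontally, linking two data frames through one or more shared variables
merge(x,y,by= ,by.x= ,by.y= ), where by.x= and by.y= name the two variables that hold the same information in the two data frames but go by different names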
data1<-data.frame(name=c('a',"b","c"),weight=c(45,56,64),height=c(165,167,170))
data2<-data.frame(name=c('a',"b","c"),age=c(24,25,26))
merge(data1,data2,by="name")
##   name weight height age
## 1    a     45    165  24
## 2    b     56    167  25
## 3    c     64    170  26
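The example above only uses by=. The sketch below shows by.x= and by.y= for the case where the key column has a different name in each data frame. The data frames people and ages are hypothetical. The all = TRUE argument is not covered in these notes; it is a standard merge() argument, added here to show what happens to unmatched rows.

```r
# Hypothetical data: the key column is "name" in one data frame and "id" in the other
people <- data.frame(name = c("a", "b", "c"), weight = c(45, 56, 64))
ages   <- data.frame(id = c("a", "b", "d"), age = c(24, 25, 30))

# by.x / by.y name the key column in the first / second data frame;
# by default only rows whose keys appear in both data frames are kept
inner <- merge(people, ages, by.x = "name", by.y = "id")

# all = TRUE keeps unmatched rows from both sides and fills the gaps with NA
outer <- merge(people, ages, by.x = "name", by.y = "id", all = TRUE)

inner
outer
```

The key column in the result takes its name from by.x, so both results have a "name" column and no "id" column.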
-
2. Contingency tables
For tables of up to two dimensions, use table()
a<-c(2,2,2,4,5)
b<-c(3,2,2,4,7)
d<-c(2,2,2,7,2)
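table(a) # one variable: returns the frequencies
## a
## 2 4 5 
## 3 1 1
table1<-table(a,d) # two variables: a contingency table
table1
##    d
## a   2 7
##   2 3 0
##   4 0 1
##   5 1 0
addmargins(table1) # addmargins() adds marginal totals; its margin argument selects which dimensions get them
##      d
## a     2 7 Sum
##   2   3 0   3
##   4   0 1   1
##   5   1 0   1
##   Sum 4 1   5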
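The margin argument mentioned above can be sketched as follows. The vectors va and vd repeat the values of a and d under new names so that the block runs on its own; the new names are the only invention here.

```r
va <- c(2, 2, 2, 4, 5)   # same values as a above
vd <- c(2, 2, 2, 7, 2)   # same values as d above
tab <- table(va, vd)

# margin = 1 extends the first dimension: a "Sum" row is appended
with_sum_row <- addmargins(tab, margin = 1)

# margin = 2 extends the second dimension: a "Sum" column is appended
with_sum_col <- addmargins(tab, margin = 2)

with_sum_row
with_sum_col
```

Leaving margin out, as in the call above, adds totals along every dimension.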
For three or more dimensions, use ftable(..., row.vars= , col.vars= ); row.vars and col.vars set which variables form the rows and the columns of the table
ftable(a,b,d)
##     d 2 7
## a b      
## 2 2   2 0
##   3   1 0
##   4   0 0
##   7   0 0
## 4 2   0 0
##   3   0 0
##   4   0 1
##   7   0 0
## 5 2   0 0
##   3   0 0
##   4   0 0
##   7   1 0
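The call above relies on the default layout, with the last variable across the columns. Here is a minimal sketch of choosing the layout explicitly with row.vars and col.vars. The vectors va, vb and vd repeat the values of a, b and d under new names, and the three-way table is built with table() first so that the dimension names are unambiguous.

```r
va <- c(2, 2, 2, 4, 5)   # same values as a above
vb <- c(3, 2, 2, 4, 7)   # same values as b above
vd <- c(2, 2, 2, 7, 2)   # same values as d above

tab3 <- table(va, vb, vd)

# Put va alone in the rows and spread every vb/vd combination across the columns
flat <- ftable(tab3, row.vars = "va", col.vars = c("vb", "vd"))
flat
```

The result has one row per level of va (3 rows) and one column per combination of vb and vd (4 x 2 = 8 columns).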
-
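3. The reshape2 package
melt() converts wide data to long data; dcast() converts long data back to wide data
melt(data,id.vars,measure.vars= ,variable.name= ,value.name= )
id.vars selects the columns that identify each record; measure.vars names the columns whose values are to be kept; variable.name and value.name set the names of the attribute-name column and the attribute-value column in the melted result
(When using this function, make sure every record is distinct, i.e. no two records share the same id.vars values)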
library(reshape2)
data1<-data.frame(name=c('a',"b","c"),gender=c('M','F','M'),marriage=c("yes","no","no"),age=c(24,25,27))
data1
##   name gender marriage age
## 1    a      M      yes  24
## 2    b      F       no  25
## 3    c      M       no  27
melt.1<-melt(data1,id.vars="name")
melt.1
##   name variable value
## 1    a   gender     M
## 2    b   gender     F
## 3    c   gender     M
## 4    a marriage   yes
## 5    b marriage    no
## 6    c marriage    no
## 7    a      age    24
## 8    b      age    25
## 9    c      age    27
melt.2<-melt(data1,id.vars = "name",measure.vars ='age')
melt.2
##   name variable value
## 1    a      age    24
## 2    b      age    25
## 3    c      age    27
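The variable.name and value.name arguments described above are not used in either call, so here is a minimal sketch of them. The data frame measures and the column names "attribute" and "reading" are made up for illustration.

```r
library(reshape2)

# Hypothetical wide data: one row per person, one column per measurement
measures <- data.frame(name   = c("a", "b", "c"),
                       weight = c(45, 56, 64),
                       height = c(165, 167, 170))

# variable.name / value.name rename the default "variable" and "value" columns
long <- melt(measures,
             id.vars       = "name",
             measure.vars  = c("weight", "height"),
             variable.name = "attribute",
             value.name    = "reading")
long
```

Each of the 3 people contributes one row per measured column, so long has 6 rows and the columns name, attribute and reading.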
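# first argument: the data frame; second argument: a formula with the identifier column on the left and the attribute-name column on the right; value.var: the column that holds the melted values
dcast(melt.1,name~variable,value.var ="value")
##   name gender marriage age
## 1    a      M      yes  24
## 2    b      F       no  25
## 3    c      M       no  27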
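The earlier note about keeping records distinct matters most at this step. When several long rows share the same identifier and attribute, dcast() cannot place them in a single cell and has to aggregate them. Without an explicit function it falls back to counting the rows. The sketch below uses a made-up data frame, scores, and passes fun.aggregate (a real dcast() argument not covered in these notes) to take the mean instead.

```r
library(reshape2)

# Hypothetical long data in which person "a" has two rows for the same attribute
scores <- data.frame(name     = c("a", "a", "b"),
                     variable = c("score", "score", "score"),
                     value    = c(80, 90, 70))

# fun.aggregate decides how duplicate cells are combined; here, their mean
wide <- dcast(scores, name ~ variable, value.var = "value", fun.aggregate = mean)
wide
```

The two rows for "a" collapse into a single score of 85, while "b" keeps its single value of 70.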