STA3050 Lec7 Notes
Selecting rows
> d1[d1$age<25,]
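A minimal sketch of row selection with a logical condition. The data frame `toy` and its values are made up for illustration; it is not the `d1` used in the lecture.

```r
# Hypothetical toy data (not the lecture's d1)
toy <- data.frame(name = c("Ann", "Bob", "Cal", "Dee"),
                  age  = c(23, 30, 21, NA))

# The logical vector before the comma picks rows;
# leaving the part after the comma empty keeps every column
young <- toy[toy$age < 25, ]

# An NA in the condition yields a row of NAs;
# wrapping the condition in which() drops such rows
young_clean <- toy[which(toy$age < 25), ]
```

If `age` may contain missing values, the `which()` form is usually the safer choice.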
Sorting and ranking
order() returns the indices that would sort a vector (not the sorted values themselves)
> order(d1$age) # order of age in d1
[1] 5 3 4 6 1 8 11 10 7 2 9
> d1[order(d1$age, d1$acc_no),] # age is the primary sort key, acc_no the secondary key
rank() returns the rank of each element; tied values share the average of their ranks by default
> rank(d1$age)
[1] 4.5 10.0 2.0 3.0 1.0 4.5 9.0 6.5 11.0 8.0 6.5
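The difference between order() and rank() is easy to mix up, so here is a small sketch on a made-up vector (the values are assumptions, not taken from d1):

```r
age <- c(23, 30, 21, 23)   # made-up values with one tie

o <- order(age)   # positions of the elements in sorted order: 3 1 4 2
r <- rank(age)    # rank of each element, ties averaged: 2.5 4.0 1.0 2.5

sorted_age <- age[o]                             # same values as sort(age)
desc_age   <- age[order(age, decreasing = TRUE)] # largest first
```

Indexing with `order()` is what makes it useful for sorting whole data frames by one of their columns.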
Matching
x %in% y returns a logical vector of the same length as x, indicating which elements of x also occur in y.
e.g.
> 1:8 %in% 5:10
[1] FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE
!(x %in% y) returns a logical vector that is TRUE for the elements of x that are not in y.
> !(1:8 %in% 5:10)
[1] TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE
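In practice %in% is often used to filter the rows of a data frame against a list of values. A rough sketch, where `accounts` and `vip` are hypothetical names and values:

```r
# Hypothetical data for illustration
accounts <- data.frame(name = c("Ann", "Bob", "Cal", "Dee"),
                       age  = c(23, 30, 21, 40))
vip <- c("Bob", "Dee", "Eve")

in_vip  <- accounts[accounts$name %in% vip, ]     # rows whose name is in vip
not_vip <- accounts[!(accounts$name %in% vip), ]  # rows whose name is not in vip
```

"Eve" appears in `vip` but not in `accounts`, so it simply matches nothing.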
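Merging: keep only the rows common to both data frames
> merge(d1,d2,by="name") # all=FALSE by default, so only names found in both d1 and d2 are kept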
        name acc_no sex age  save check card loan
1 Alice Chan  87441   f  23  5463   436    1    1
2  Boris Lee  96205   m  30 23520  3464    1    0
3 David Wong  41692   m  21 23430   546    1    1
Merging: keep all rows from both data frames
> (d3<-merge(d1,d2,by="name",all=TRUE)) # unmatched rows are kept, with NA in the columns they lack
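The effect of the `all` argument can be sketched with two tiny data frames. `left` and `right` are made-up stand-ins, not the lecture's d1 and d2:

```r
# Hypothetical data for illustration
left  <- data.frame(name = c("Ann", "Bob", "Cal"), age  = c(23, 30, 21))
right <- data.frame(name = c("Bob", "Cal", "Dee"), loan = c(0, 1, 1))

inner     <- merge(left, right, by = "name")               # Bob, Cal only
outer     <- merge(left, right, by = "name", all = TRUE)   # Ann, Bob, Cal, Dee
left_only <- merge(left, right, by = "name", all.x = TRUE) # every row of left

# In outer, Ann has no loan and Dee has no age, so those cells are NA
```

`all.x = TRUE` and `all.y = TRUE` keep the unmatched rows of only the first or only the second data frame.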
Finding duplicates
> d4[duplicated(d4$name),]
Removing duplicates (keeping the first occurrence)
> (d5<-d4[!duplicated(d4$name),])
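duplicated() flags only the second and later occurrences of a value, which is easy to overlook. A small sketch with a made-up data frame `people`:

```r
# Hypothetical data for illustration
people <- data.frame(name = c("Ann", "Bob", "Ann", "Cal", "Bob"),
                     id   = 1:5)

dup_flag <- duplicated(people$name)  # FALSE FALSE TRUE FALSE TRUE

repeats <- people[dup_flag, ]        # rows 3 and 5: the repeated entries
firsts  <- people[!dup_flag, ]       # rows 1, 2, 4: first occurrence of each name

# Every row whose name occurs more than once, first occurrence included
all_dups <- people[people$name %in% people$name[dup_flag], ]
```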
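Splitting strings
name<-strsplit(name," ") # returns a list with one character vector per name
name<-matrix(unlist(name),ncol=2,byrow=TRUE) # to a 2-column matrix; assumes every name has exactly two parts
Converting to upper case
(last<-toupper(name[,2])) # second column holds the last names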
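Because strsplit() returns a list, one way to pull out a given piece of every name is to apply "[" to each list element. A sketch under the assumption that each name is "First Last"; `full`, `parts`, `first_name` and `last_name` are illustrative names:

```r
full <- c("Alice Chan", "Boris Lee", "David Wong")

parts <- strsplit(full, " ")   # list of length 3, each element c(first, last)

first_name <- sapply(parts, "[", 1)           # first piece of each element
last_name  <- toupper(sapply(parts, "[", 2))  # second piece, upper-cased

display <- paste(first_name, last_name)       # e.g. "Alice CHAN"
```

This form also tolerates names with more than two parts, although it then picks only the second piece as the last name.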
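Replacing values
d3<-replace(d,d>2,NA) # d: original values, d>2: condition selecting what to replace, NA: replacement value
# replaces all values > 2 with NA; d itself is unchanged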
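A short sketch of replace() on a made-up numeric vector, next to the equivalent assignment form:

```r
d <- c(-3, 0.5, 2, 2.7, 4)   # made-up values

d3 <- replace(d, d > 2, NA)  # -3.0 0.5 2.0 NA NA; d is left untouched

# Equivalent result using indexed assignment on a copy
d_copy <- d
d_copy[d_copy > 2] <- NA
```

replace() returns a new vector, which is convenient when the original has to be kept.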
Binning values into intervals
f<-cut(d,breaks=c(-Inf,-2:2,Inf),labels=c("A","B","C","D","E","F"))
# bins d into (-∞, -2], (-2, -1], (-1, 0], (0, 1], (1, 2], (2, ∞) and labels them A, B, C, D, E, F
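To see how values on the interval boundaries are treated, here is a sketch with a made-up vector. By default cut() uses right-closed intervals, so -2 falls in A and 2 falls in E:

```r
d <- c(-3, -2, -0.5, 0, 0.3, 1.5, 2, 5)   # made-up values

f <- cut(d, breaks = c(-Inf, -2:2, Inf),
         labels = c("A", "B", "C", "D", "E", "F"))

bins   <- as.character(f)  # "A" "A" "C" "C" "D" "E" "E" "F"
counts <- table(f)         # counts per label; empty bins such as B show 0
```

Passing `right = FALSE` would make the intervals left-closed instead.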