lapply: Loop over a list and evaluate a function on each element
sapply: Same as lapply but try to simplify the result
apply: Apply a function over the margins of an array
tapply: Apply a function over subsets of a vector
mapply: Multivariate version of lapply
An auxiliary function split is also useful, particularly in conjunction with lapply: it divides an object into groups.
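As a minimal sketch of how split and lapply work together, the example below groups a small vector and averages each group. The data and the names values, groups, pieces, and group_means are made up for illustration.

```r
# Hypothetical data: six measurements, each labelled with a group.
values <- c(10, 20, 30, 40, 50, 60)
groups <- c("a", "b", "a", "b", "a", "b")

# split() breaks the vector into a list with one element per group.
pieces <- split(values, groups)

# lapply() evaluates mean() on each piece and returns a list.
group_means <- lapply(pieces, mean)
print(group_means)
```

Here pieces$a holds 10, 30, 50 and pieces$b holds 20, 40, 60, so the group means are 30 and 40.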
> lapply ## print the function definition of lapply
function (X, FUN, ...)
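## X is a list (or vector). Objects that are not vectors, and classed objects, are coerced with as.list(); if the coercion fails, an error is raised.
## FUN is the function to apply (or its name)
## ... passes additional arguments to FUN, the function evaluated on each element of the list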
{
FUN <- match.fun(FUN)
if (!is.vector(X) || is.object(X))
X <- as.list(X)
.Internal(lapply(X, FUN)) ## the rest of lapply is implemented internally in C code
}
<bytecode: 0x101811918>
<environment: namespace:base>
lapply always returns a list, regardless of the class of the input.
> x<-list(a=1:5, b=rnorm(10))
> lapply(x,mean)
$a
[1] 3
$b
[1] 0.1977355
> x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))
> lapply(x, mean)
$a
[1] 2.5
$b
[1] -0.07585927
$c
[1] 1.283958
$d
[1] 5.023476
> x <- 1:4
> lapply(x, runif) ## runif: generates uniformly distributed random numbers
[[1]]
[1] 0.288356
[[2]]
[1] 0.67348216 0.02642399
[[3]]
[1] 0.3308120 0.2115484 0.8426939
[[4]]
[1] 0.3277662 0.7458162 0.3419563 0.2249834
> x <- 1:4
> lapply(x, runif, min = 0, max = 10) ## runif: generates uniformly distributed random numbers, here between 0 and 10
[[1]]
[1] 6.72062
[[2]]
[1] 8.231884 4.144748
[[3]]
[1] 8.6574562 0.7239689 9.9948855
[[4]]
[1] 3.95653661 0.02583705 0.72174981 2.42729541
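The points above (a list comes back whatever the input class, and extra arguments travel through ...) can be checked with a short sketch. The seed, the data frame df, and the names res and col_means are assumptions chosen only for illustration.

```r
set.seed(1)  # arbitrary seed, only so the run is repeatable

# The input is an integer vector, yet the result is a list.
x <- 1:3
res <- lapply(x, runif, min = 0, max = 10)  # min and max are passed via ...
is.list(res)   # TRUE
lengths(res)   # 1 2 3: element i holds i random draws

# A data frame is a classed object, so lapply() treats it as a list of columns.
df <- data.frame(p = 1:4, q = c(2, 4, 6, 8))
col_means <- lapply(df, mean)
print(col_means)
```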
lapply and friends make heavy use of anonymous functions.
> x <- list(a = matrix(1:4, 2, 2), b = matrix(1:6, 3, 2)) ## a: 1 to 4 in a 2x2 matrix; b: 1 to 6 in a 3x2 matrix
> x
$a
[,1] [,2]
[1,] 1 3
[2,] 2 4
$b
[,1] [,2]
[1,] 1 4
[2,] 2 5
[3,] 3 6
> lapply(x, function(elt) elt[,1])
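## The anonymous function takes one argument, elt, and extracts the first column of that matrix. lapply evaluates it on every matrix in the list x and returns a list made of those first columns.
## The anonymous function is never saved; it disappears once the lapply() call finishes.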
$a
[1] 1 2
$b
[1] 1 2 3
sapply will try to simplify the result of lapply if possible.
- If the result is a list where every element is length 1, then a vector is returned
- If the result is a list where every element is a vector of the same length (> 1), a matrix is returned
- If it can’t figure things out, a list is returned.
> x <- list(a = 1:4, b = rnorm(10), c = rnorm(20, 1), d = rnorm(100, 5))
> sapply(x, mean) ## returns a vector
a b c d
2.500000 -0.520970 1.701988 4.964306
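The three simplification cases listed above can be sketched with a small deterministic list. The names v, m, and l are illustrative only.

```r
x <- list(a = 1:4, b = 5:8, c = 9:12)

# Every result has length 1, so sapply() returns a named numeric vector.
v <- sapply(x, mean)

# Every result has length 2, so sapply() returns a 2 x 3 matrix
# with one column per list element.
m <- sapply(x, range)

# The results have different lengths, so sapply() falls back to a list.
l <- sapply(1:3, seq_len)

print(v)
print(m)
print(l)
```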
apply is used to evaluate a function (often an anonymous one) over the margins of an array.
- It is most often used to apply a function to the rows or columns of a matrix
- It can be used with general arrays, e.g. taking the average of an array of matrices
- It is not really faster than writing a loop, but it works in one line!
> str(apply) ## show the function's arguments
function (X, MARGIN, FUN, ...)
- X is an array
- MARGIN is an integer vector indicating which margins should be “retained”.
- FUN is a function to be applied
- ... is for other arguments to be passed to FUN
> x <- matrix(rnorm(200), 20, 10)
## create a 20x10 matrix whose elements are normal random variables
> apply(x, 2, mean)
## compute the mean of each column of the matrix
[1] -0.04187182 0.36853681 -0.08029728 0.09478346 -0.09942189 -0.21843602 -0.11634147 -0.19535512
[9] 0.08166842 0.47675780
> apply(x, 1, sum)
## compute the sum of each row of the matrix
[1] -2.6061289 5.7889316 -1.0950565 -0.5829930 -3.5018040 -0.5822774 5.1054791 -6.8619973 5.0680075
[10] 2.5430657 -0.7463306 -0.1639922 2.2226389 0.4509331 -2.1902933 1.6544674 -0.5925460 -2.7334234
[19] 2.9388101 1.2849675
For sums and means of matrix dimensions, we have some shortcuts.
- rowSums = apply(x, 1, sum)
- rowMeans = apply(x, 1, mean)
- colSums = apply(x, 2, sum)
- colMeans = apply(x, 2, mean)
The shortcut functions are much faster, but you won’t notice unless you’re using a large matrix.
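To see that the shortcuts agree with the corresponding apply calls, here is a small sketch on a matrix whose sums are easy to check by hand. The names row_totals and col_means are illustrative.

```r
x <- matrix(1:6, nrow = 2, ncol = 3)
# x is
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

row_totals <- apply(x, 1, sum)   # one value per row:    9 12
col_means  <- apply(x, 2, mean)  # one value per column: 1.5 3.5 5.5

# The shortcut functions give the same numbers.
all.equal(row_totals, rowSums(x))
all.equal(col_means, colMeans(x))
```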
> x <- matrix(rnorm(200), 20, 10)
> apply(x, 1, quantile, probs = c(0.25, 0.75))
## for each row of the matrix, compute that row's 25th and 75th percentiles
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
25% -0.6078459 -0.78482591 -0.2537456 -0.3311409 -0.7486597 -0.4537818 -0.6678381 -0.9681971 0.4684376
75% 1.1202534 -0.04196072 0.5168813 0.1719272 0.4945426 0.4936025 0.1903076 0.2144089 0.9617262
[,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18]
25% -0.8172259 -0.6653447 -0.971877223 -0.8423170 -0.8755723 -1.0941386 -0.8998137 -0.6200158 -0.6956639
75% 0.9584867 0.7489272 -0.009365288 0.5974648 0.8550097 -0.3612995 0.1499745 0.5341661 0.6265623
[,19] [,20]
25% -1.41017716 -1.0567091
75% 0.03279747 0.2381334
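Note the shape of the output above: when FUN returns a vector of length 2, each row of the input becomes a column of the result. A minimal sketch, assuming an arbitrary seed so the run is repeatable:

```r
set.seed(42)  # arbitrary seed for repeatability
x <- matrix(rnorm(200), 20, 10)

q <- apply(x, 1, quantile, probs = c(0.25, 0.75))
dim(q)        # 2 20: one column per row of x
rownames(q)   # "25%" "75%"

# Transpose if one row of output per row of input is preferred.
dim(t(q))     # 20 2
```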
> a <- array(rnorm(2 * 2 * 10), c(2, 2, 10)) ## create an array of normal random variables: 2x2, with a third dimension of length 10
> apply(a, c(1,2),mean)
[,1] [,2]
[1,] -0.5117031 0.3828905
[2,] 0.3234120 -0.1570178
> rowMeans(a, dims = 2)
[,1] [,2]
[1,] -0.5117031 0.3828905
[2,] 0.3234120 -0.1570178
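The same averaging over the third dimension can be verified by hand on a small array. This is a sketch with made-up values; avg is an illustrative name.

```r
# A 2 x 2 x 3 array: three 2 x 2 matrices stacked along the third dimension.
a <- array(1:12, c(2, 2, 3))

# Keep dimensions 1 and 2 and average over dimension 3.
avg <- apply(a, c(1, 2), mean)
print(avg)

# rowMeans() with dims = 2 treats the first two dimensions as "rows".
all.equal(avg, rowMeans(a, dims = 2))
```

Position [1,1] averages 1, 5, and 9, giving 5; the other three cells give 6, 7, and 8 in column-major order.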