Functions in R

牛哥骑驴看马

Category: R 经验; Tags: Advanced R

Article link: https://blog.csdn.net/u011090052/article/details/39576865

Translated from: http://adv-r.had.co.nz/Functions.html

“To understand computations in R, two slogans are helpful:

Everything that exists is an object.
Everything that happens is a function call.”

— John Chambers

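Everything is an object, including functions.

Functions

Function components

A function has three basic components:

formals, the list of arguments, shown by formals(fun);
body, the code inside the function, shown by body(fun);
environment, which determines how the function finds the values of its variables, shown by environment(fun).

Every function written in R has all three components.
The exception is primitive functions, which call the C function of the same name directly via .Primitive().

For example, sum: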

> sum
function (..., na.rm = FALSE)  .Primitive("sum")

For such functions, all three accessors return NULL:

> formals(sum)
NULL
> body(sum)
NULL
> environment(sum)
NULL
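
To make the three components concrete, here is a small sketch that inspects an ordinary function and compares its type with that of a primitive. The function add_two is a hypothetical example, not part of the original text.

```r
# A hypothetical closure with two formals, one of which has a default value
add_two <- function(x, y = 2) x + y

names(formals(add_two))   # argument names: "x" "y"
deparse(body(add_two))    # the code inside the function: "x + y"
environment(add_two)      # the environment in which add_two was defined

typeof(add_two)           # "closure": a function written in R
typeof(sum)               # "builtin": one of the primitive function types
```

For a function defined at the top level of a session, environment() reports the global environment.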

Exercise answers (my own; for reference only)

objs <- mget(ls("package:base"), inherits = TRUE)
funs <- Filter(is.function, objs)
formal <- sapply(funs, FUN = formals)
len <- sapply(formal, FUN = length)

# Q1: Which base function has the most arguments?
> max(len)
[1] 22
> funs[len == 22]
$scan

# Q2: How many base functions have no arguments? What's special about these functions?
# (The count depends on the R version; most of them are primitives, for which formals() returns NULL.)
> sum(len == 0)
[1] 221

# Q3: How could you adapt the code to find all primitive functions?
# A primitive has no formals, no body, and no environment.
is_primitive <- function(x) {
  is.null(formals(x)) && is.null(body(x)) && is.null(environment(x))
}
primitive_funs <- Filter(is_primitive, funs)
primitive_funs
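
As an alternative sketch for Q3, base R already provides is.primitive(), so the hand-written predicate is not strictly needed. The variable names below mirror the exercise code; the check on sum and mean is only an illustration.

```r
# Collect every object in the base package and keep the functions
objs <- mget(ls("package:base"), inherits = TRUE)
funs <- Filter(is.function, objs)

# Keep only the primitives, using the built-in predicate
primitive_funs <- Filter(is.primitive, funs)

length(primitive_funs)                        # how many there are depends on the R version
c("sum", "mean") %in% names(primitive_funs)   # sum is a primitive, mean is not
```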

Lexical scoping

Four basic principles:

name masking
functions vs. variables
a fresh start
dynamic lookup

name masking

The same rules apply to both variables and functions:

The same rules apply if a function is defined inside another function: look inside the current function, then where that function was defined, and so on, all the way up to the global environment, and then on to other loaded packages. Run the following code in your head, then confirm the output by running the R code.

The same rules apply to closures, functions created by other functions. Closures will be described in more detail in functional programming; here we’ll just look at how they interact with scoping.

<pre name="code" class="plain"># An ordinary local variable

x <- 2
g <- function() {
  y <- 1
  c(x, y)
}
g()
rm(x, g)

</pre><pre name="code" class="plain"># A function defined inside another function

x <- 1
h <- function() {
  y <- 2
  i <- function() {
    z <- 3
    c(x, y, z)
  }
  i()
}
h()
rm(x, h)

</pre><pre name="code" class="plain"># A locally defined function masks the one defined outside

l <- function(x) x + 1
m <- function() {
  l <- function(x) x * 2
  l(10)
}
m()
#> [1] 20
rm(l, m)

</pre><pre name="code" class="plain"># Closure

j <- function(x) {
  y <- 2
  function() {
    c(x, y)
  }
}
k <- j(1) # j() returns a function
k() # call that function
rm(j, k)

Note: this may look a little odd. After j() has returned, how does the returned function k() still know the value of the local variable y? It works because k preserves the environment in which it was defined, and that environment includes the value of y. The Environments chapter (http://adv-r.had.co.nz/Environments.html#environments) gives some pointers on how you can dive in and figure out what values are stored in the environment associated with each function.
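
The preserved environment can be inspected directly. The following is a minimal sketch that uses only base R; the variable name env is introduced here for illustration.

```r
j <- function(x) {
  y <- 2
  function() {
    c(x, y)
  }
}
k <- j(1)

# The environment attached to k is the execution environment of the call j(1)
env <- environment(k)
ls(env)                  # the names stored there: "x" "y"
get("y", envir = env)    # 2, the local variable that outlived the call to j()
k()                      # 1 2
```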

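Interestingly, the name-masking rules also let us override all sorts of operations, even `(`, which is why it is best to start from a fresh R session each time: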

`(` <- function(e1) {
  if (is.numeric(e1) && runif(1) < 0.1) {
    e1 + 1
  } else {
    e1
  }
}
replicate(50, (1 + 2))
#>  [1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 4 3 3 3 3 3 4 3 3 3 3 3 3 3 4 3 3 3
#> [36] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
rm("(")

a fresh start
Every function call starts with a brand-new environment. Consider the following example:

j <- function() {
  if (!exists("a")) {
    a <- 1
  } else {
    a <- a + 1
  }
  print(a)
}
j()
rm(j)

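This function returns 1 every time it is called (assuming no variable named a exists in the global environment). Each call gets a new execution environment that is discarded when the call finishes, so a call cannot see the value of a from a previous call.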
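
The fresh-start rule can be checked with a short sketch. The names counter_attempt and new_env are hypothetical, and inherits = FALSE is added so that a variable a in an enclosing environment cannot affect the result.

```r
# Looks for `a` only in the current execution environment
counter_attempt <- function() {
  if (!exists("a", inherits = FALSE)) {
    a <- 1
  } else {
    a <- a + 1
  }
  a
}
results <- c(counter_attempt(), counter_attempt(), counter_attempt())
results                            # 1 1 1: no call remembers the previous one

# Each call really does get its own environment
new_env <- function() environment()
identical(new_env(), new_env())    # FALSE
```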

dynamic lookup

f <- function() x
x <- 15
f()
#> [1] 15

x <- 20
f()
#> [1] 20

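The reason: R looks for values when the function is run, not when it’s created.

The behavior of f above depends on the value of x in the global environment at every call, so the function is not self-contained.

One way to detect this kind of problem is codetools::findGlobals(), which lists a function's external dependencies: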

f <- function() x + 1
codetools::findGlobals(f)

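An extreme way to remove such dependencies is to set the environment of f to the empty environment. The call then fails, because f can no longer find even the + function: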

environment(f) <- emptyenv()
f()
#> Error: could not find function "+"
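
A less drastic way to make a function independent of later changes to a global variable is to capture the value when the function is created. The factory make_f below is a hypothetical illustration, not part of the original text.

```r
# make_f() fixes the value of x at creation time
make_f <- function(x) {
  force(x)        # evaluate x now rather than on first use
  function() x
}

x <- 15
f <- make_f(x)
x <- 20           # changing the global x afterwards has no effect on f
f()               # 15
```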

Every operation is a function call

Note that the backtick, ` (the key below Esc, the same character the shell uses for command substitution), lets you refer to functions or variables that have otherwise reserved or illegal names:

x <- 10; y <- 5
x + y        # is equivalent to
`+`(x, y)

</pre><pre name="code" class="plain">for (i in 1:2) print(i)        # is equivalent to
`for`(i, 1:2, print(i))

</pre><pre name="code" class="plain">if (i == 1) print("yes") else print("no")        # is equivalent to
`if`(i == 1, print("yes"), print("no"))

</pre><pre name="code" class="plain">x[3]        # is equivalent to
`[`(x, 3)

</pre>This is most often useful with sapply()/lapply() and similar functions:
<pre name="code" class="plain">add <- function(x, y) x + y
sapply(1:10, add, 3)

# is equivalent to

sapply(1:10, `+`, 3)

# is equivalent to

sapply(1:10, "+", 3)

The last form works because sapply() and lapply() call match.fun() to find a function given its name; you can see this in the first line of the lapply() source code.
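
The lookup that match.fun() performs can be tried on its own. This is a small sketch; the variable names plus and shifted are introduced here for illustration.

```r
# match.fun() turns a function name given as a string into the function itself
plus <- match.fun("+")
plus(1, 2)                           # 3

# This is why a string works as the FUN argument
shifted <- sapply(1:3, "+", 3)
shifted                              # 4 5 6
```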

</pre>
Calling a function given a list of arguments

When the arguments are numerous or already stored in a list, we can call the function with do.call():

<pre name="code" class="plain">args <- list(1:10, na.rm = TRUE)

do.call(mean, args) # same as mean(1:10, na.rm = TRUE)
#> [1] 5.5
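
As a further sketch, do.call() also accepts the function name as a string, and named list elements are passed as named arguments. The variable pieces is hypothetical.

```r
# Named elements of the list become named arguments of the call
pieces <- list("a", "b", "c", sep = "-")
do.call(paste, pieces)      # "a-b-c", same as paste("a", "b", "c", sep = "-")

# The function can also be given by name
do.call("paste", pieces)    # "a-b-c"
```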

Default and missing arguments

A) missing() tells you whether an argument was supplied

i <- function(a, b) {
  c(missing(a), missing(b))
}
i()
#> [1] TRUE TRUE
i(a = 1)
#> [1] FALSE  TRUE
i(b = 2)
#> [1]  TRUE FALSE
i(1, 2)
#> [1] FALSE FALSE

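B) A default value can be provided in one of two ways:

1) computed inside the function body, using missing() to detect that the argument was not supplied;

2) set to NULL in the argument list and checked with is.null() (recommended).

C) An unknown number of extra arguments can be captured with

...

Use it with care: a misspelled argument name is silently absorbed by ... instead of raising an error.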
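
Points B and C can be illustrated with a short sketch. The functions scale_by and count_args are hypothetical examples, not from the original text.

```r
# B) A NULL default that is filled in inside the body
scale_by <- function(x, factor = NULL) {
  if (is.null(factor)) factor <- 2   # the real default is decided here
  x * factor
}
scale_by(1:3)        # 2 4 6
scale_by(1:3, 10)    # 10 20 30

# C) ... collects any number of extra arguments
count_args <- function(...) length(list(...))
count_args(1, "a", TRUE)    # 3

# The danger of ...: the misspelled na.rm is treated as one more value to add
sum(1, 2, na.mr = TRUE)     # 4, because TRUE counts as 1
```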

Lazy evaluation

"Lazy" means that a function argument is evaluated only when it is actually used. For example:

f <- function(x) {
  10
}
f(stop("This is an error!"))
#> [1] 10

To make sure that x is evaluated, use force():

f <- function(x)
{
  force(x);
  10;
}

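This matters especially when creating multiple closures with lapply() or a loop:

add <- function(x) {
  function(y) x + y
}
adders <- lapply(1:10, add)
adders[[1]](10)
#> [1] 20    (R < 3.2.0; newer versions return 11, see below)
adders[[10]](10)
#> [1] 20

This happens because x is not evaluated when lapply() creates the closures. It is evaluated only when an adder is first called, and by then the loop variable has reached 10. Since R 3.2.0, lapply() and the other apply functions force the arguments of the functions they call, so the surprise no longer occurs with lapply(), although it can still occur with a hand-written for loop.

The fix:

add <- function(x) {
  force(x) # evaluate x each time a closure is created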
  function(y) x + y
}
adders <- lapply(1:10, add)
adders[[1]](10)
#> [1] 11
adders[[10]](10)
#> [1] 20
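
Because recent R versions force arguments inside lapply(), the effect is easiest to reproduce with an explicit for loop. The sketch below uses hypothetical names (add_lazy, add_forced, lazy, forced).

```r
add_lazy <- function(x) function(y) x + y
add_forced <- function(x) {
  force(x)             # evaluate x while the loop variable still has this value
  function(y) x + y
}

lazy <- list()
forced <- list()
for (i in 1:3) {
  lazy[[i]] <- add_lazy(i)
  forced[[i]] <- add_forced(i)
}

lazy[[1]](10)      # 13: x is evaluated only now, when i is already 3
forced[[1]](10)    # 11: x was fixed at 1 when the closure was created
```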

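Lazy evaluation is useful mainly for short-circuit tricks:

# 1
# Conceptually, && behaves like the function below: because of lazy
# evaluation, y is never evaluated when x is FALSE.
`&&` <- function(x, y)
{
  if (!x) return(FALSE)
  if (!y) return(FALSE)
  return(TRUE)
}
x <- NULL
if (!is.null(x) && x > 0) # x > 0 is evaluated only if x is not NULL
{print("yes")}
rm("&&") # remove the override so the built-in && is used again

# 2
if (is.null(x)) stop("x is null")
# is equivalent to
!is.null(x) || stop("x is null")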

Special calls

infix operator

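Most functions are prefix functions: the function name comes before the arguments. Functions whose name goes between two arguments, like binary operators, are called infix functions.

In R, every user-defined infix function must start and end with %, in the same style as the predefined %*% (matrix multiplication).

On evaluation order: R’s default precedence rules mean that infix operators are composed from left to right.

# Predefined infix functions that use %: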
%%, %*%, %/%, %in%, %o%, %x%. 
# The complete list of built-in infix operators that don’t need % is:
 ::, :::, $, @, ^, *, /, +, -, >, >=, <, <=, ==, !=, !, &, &&, |, ||, ~, <-, <<-
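
A short sketch of a user-defined infix function and of left-to-right composition follows; the operators %+% and %-% are defined here purely for illustration.

```r
# A user-defined infix operator must be wrapped in % ... %
`%+%` <- function(a, b) paste0(a, b)
"new" %+% " string"       # "new string"

# Chained infix calls are grouped from left to right
`%-%` <- function(a, b) paste0("(", a, " %-% ", b, ")")
"a" %-% "b" %-% "c"       # "((a %-% b) %-% c)"
```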

replacement functions

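Replacement functions appear to modify their argument in place, and their names have the form func<-. They do not work like C pointers, however: the function receives a copy of the argument, modifies it, and returns the result, which is then assigned back to the original name. For example: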

`second<-` <- function(x,value)
{
  x[2] <- value;
  return(x);
}

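Usage:

x <- 1:10
second(x) <- 3
# is equivalent to
x <- `second<-`(x, 3)

pryr::address() (available after library(pryr)) shows an object's memory address. Comparing the address before and after the call shows that a replacement function really works by creating a modified copy, not by changing x in place.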

> address(x)
[1] "0x15b70688"
> second(x)<-3
> address(x)
[1] "0x144ac9c8"

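R also has built-in replacement functions, such as `[<-` (the function behind x[i] <- value), which can appear on the left-hand side of an assignment.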
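
A replacement function can also take additional arguments, which go between x and value. The function `modify<-` below is a sketch introduced for illustration.

```r
# Extra arguments (here: position) sit between x and value
`modify<-` <- function(x, position, value) {
  x[position] <- value
  x
}

x <- 1:10
modify(x, 1) <- 10L
x                          # the first element is now 10

# The call above is shorthand for this explicit form
same <- `modify<-`(1:10, 1, 10L)
identical(x, same)         # TRUE
```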

Return values

A) A return value can be made invisible, so that it is not printed at the top level

f1 <- function() return(invisible(1));

> f1()
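
An invisible value is still returned; only the automatic printing is suppressed. The sketch below uses base R's withVisible() to check this, with a hypothetical variable y.

```r
f1 <- function() invisible(1)

f1()                          # prints nothing at the top level
withVisible(f1())$visible     # FALSE: the result is flagged as invisible
(f1())                        # wrapping the call in parentheses prints the value

y <- f1()                     # the value can be assigned as usual
y                             # 1
```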

B) on.exit() guarantees that an action runs when the function exits, whether it exits normally or with an error

in_dir <- function(dir, code) {
  old <- setwd(dir)
  on.exit(setwd(old)) # runs only when the function exits

  force(code)
}
getwd()
#> [1] "/home/travis/build/hadley/adv-r"
in_dir("~", getwd())
#> [1] "/home/travis"
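
To see that the on.exit() handler also runs when the function fails, consider the following sketch. The function risky and the variable events are hypothetical.

```r
events <- character()

risky <- function(fail) {
  # Registered first, executed last: runs on normal exit and on error
  on.exit(events <<- c(events, "cleanup"))
  if (fail) stop("something went wrong")
  "ok"
}

risky(FALSE)                        # returns "ok", then the handler runs
try(risky(TRUE), silent = TRUE)     # the error is caught, the handler still runs
events                              # "cleanup" "cleanup"
```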
