R for Data Science Summary: Iteration

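This post summarizes the iteration chapter of R for Data Science: using the map functions, the apply family in Base R, error handling, map functions with multiple inputs, invoke, walk and its use for side effects, and predicate functions such as keep and discard. It uses examples to show how to do data processing and iteration elegantly in R.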

Without further ado, straight to the code.

library(tidyverse)
df <- tibble(
  a = rnorm(10),
  b = rnorm(10),
  c = rnorm(10),
  d = rnorm(10)
)
median(df$a)
#> [1] -0.246
median(df$b)
#> [1] -0.287
median(df$c)
#> [1] -0.0567
median(df$d)
#> [1] 0.144

The same computation written as a for loop:

output <- vector("double", ncol(df))  # 1. output
for (i in seq_along(df)) {            # 2. sequence
  output[[i]] <- median(df[[i]])      # 3. body
}
output
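The loop above iterates over seq_along(df) rather than 1:length(df). A minimal sketch of why that matters, assuming a zero-length input (the vector y and the counter n_iter are made up for illustration and are not from the original post):

```r
y <- vector("double", 0)   # hypothetical zero-length input

bad_seq  <- 1:length(y)    # 1:0 counts down, giving the two values 1 and 0
good_seq <- seq_along(y)   # an empty integer vector

n_iter <- 0
for (i in good_seq) {
  n_iter <- n_iter + 1     # never runs, because good_seq is empty
}
n_iter
```

With 1:length(y) the loop body would run twice on an empty input, which is usually a bug.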

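The most general pattern loops over the numeric indices, because the position gives access to both the name and the value: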

for (i in seq_along(x)) {
  name <- names(x)[[i]]
  value <- x[[i]]
}
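The pattern above leaves x undefined. A small runnable sketch, assuming a named numeric vector (the values and the labels output are hypothetical):

```r
x <- c(a = 1.5, b = 2.5, c = 3.5)          # hypothetical named vector

labels <- vector("character", length(x))   # preallocate the output
for (i in seq_along(x)) {
  name  <- names(x)[[i]]                   # name at position i
  value <- x[[i]]                          # value at position i
  labels[[i]] <- paste0(name, " = ", value)
}
labels
```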

If you do not know the length of the output vector in advance, one option is to grow it step by step:

means <- c(0, 1, 2)

output <- double()
for (i in seq_along(means)) {
  n <- sample(100, 1)
  output <- c(output, rnorm(n, means[[i]]))
}
str(output)
#>  num [1:202] 0.912 0.205 2.584 -0.789 0.588 ...

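A better solution is to store the results in a list and combine them after the loop, because growing a vector with c() copies all earlier data on every iteration: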

out <- vector("list", length(means))
for (i in seq_along(means)) {
  n <- sample(100, 1)
  out[[i]] <- rnorm(n, means[[i]])
}
str(out)
#> List of 3
#>  $ : num [1:83] 0.367 1.13 -0.941 0.218 1.415 ...
#>  $ : num [1:21] -0.485 -0.425 2.937 1.688 1.324 ...
#>  $ : num [1:40] 2.34 1.59 2.93 3.84 1.3 ...
str(unlist(out))
#>  num [1:144] 0.367 1.13 -0.941 0.218 1.415 ...
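As a sanity check, the two approaches can be compared directly. This is only a sketch: the seed value and the name grown are assumptions added here so that both loops draw the same random numbers.

```r
means <- c(0, 1, 2)

# Approach 1: grow the vector inside the loop
set.seed(1)                      # arbitrary seed, only for reproducibility
grown <- double()
for (i in seq_along(means)) {
  n <- sample(100, 1)
  grown <- c(grown, rnorm(n, means[[i]]))
}

# Approach 2: fill a preallocated list, then flatten once
set.seed(1)                      # same seed, so the draws match approach 1
out <- vector("list", length(means))
for (i in seq_along(means)) {
  n <- sample(100, 1)
  out[[i]] <- rnorm(n, means[[i]])
}

identical(grown, unlist(out))    # both approaches give the same vector
sum(lengths(out))                # total length equals length(grown)
```

The result is the same; the difference is that the list version combines the pieces once, at the end.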

The map functions

map_dbl(df, mean)
#>       a       b       c       d 
#>  0.2026 -0.2068  0.1275 -0.0917
map_dbl(df, median)
#>      a      b      c      d 
#>  0.237 -0.218  0.254 -0.133
map_dbl(df, sd)
#>     a     b     c     d 
#> 0.796 0.759 1.164 1.062
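map_dbl() forwards extra arguments to the function it calls, and Base R's vapply() can express the same computation. A minimal sketch, assuming a small made-up data frame (the name small_df and its values are hypothetical):

```r
library(purrr)

small_df <- data.frame(
  a = c(1, 2, 3, 100),   # hypothetical column with one outlier
  b = c(4, 5, 6, 7)
)

# trim = 0.25 is passed through map_dbl() to mean()
trimmed <- map_dbl(small_df, mean, trim = 0.25)

# Base R equivalent: numeric(1) declares one double per column
trimmed_base <- vapply(small_df, mean, numeric(1), trim = 0.25)

trimmed
trimmed_base
```

Both calls return a named double vector with one element per column.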

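You can also pass an anonymous function to map, or even pass a string to extract the element with that name, or an integer to extract the element at that position: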

models <- mtcars %>% 
  split(.$cyl) %>% 
  map(function(df) lm(mpg ~ wt, data = df))
models <- mtcars %>% 
  split(.$cyl) %>% 
  map(~lm(mpg ~ wt, data = .))
models %>% 
  map(summary) %>% 
  map_dbl(~.$r.squared)
#>     4     6     8 
#> 0.509 0.465 0.423

models %>% 
  map(summary) %>% 
  map_dbl("r.squared")
#>     4     6     8 
#> 0.509 0.465 0.423

x <- list(list(1, 2, 3), list(4, 5, 6), list(7, 8, 9))
x %>% map_dbl(2)
#> [1] 2 5 8
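The string and integer shortcuts behave like an explicit extractor function. A small sketch, assuming a made-up list of records (the people data is hypothetical):

```r
library(purrr)

people <- list(                      # hypothetical nested list
  list(name = "Ann", age = 31),
  list(name = "Bob", age = 45)
)

by_name     <- map_chr(people, "name")   # extract the element called "name"
by_age      <- map_dbl(people, "age")    # extract the element called "age"
by_position <- map_dbl(people, 2)        # extract the second element
by_function <- map_dbl(people, function(p) p[["age"]])  # explicit form

by_name
by_age
```

Here by_age, by_position and by_function all hold the same values, because "age" is the second element of each record.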