The dplyr Package in R

Introduction

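dplyr is a widely used R package for data cleaning and manipulation. Its main functions are:

  • select() selects columns from the data
  • filter() keeps the subset of rows that match a condition
  • group_by() groups the data by one or more variables
  • summarise() summarises the data (computes summary statistics)
  • arrange() sorts the rows
  • mutate() creates new variables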
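
As a rough illustration of how these verbs chain together, here is a minimal sketch using the built-in mtcars data set. The choice of columns, the weight threshold, and the approximate conversion factor 0.425 (miles per gallon to kilometres per litre) are assumptions made for this example only.

```r
library(dplyr)

# mtcars ships with base R; the columns and thresholds below are illustrative.
result <- mtcars %>%
  select(mpg, cyl, wt) %>%                     # keep three columns
  filter(wt < 4) %>%                           # keep cars lighter than 4000 lbs (wt is in 1000 lbs)
  mutate(kpl = mpg * 0.425) %>%                # new variable: approximate km per litre
  group_by(cyl) %>%                            # group rows by number of cylinders
  summarise(mean_kpl = mean(kpl), n = n()) %>% # one summary row per group
  arrange(desc(mean_kpl))                      # sort groups by mean efficiency, highest first

result
```

Each step receives the result of the previous one, so the pipeline reads top to bottom in the order the operations are applied.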

How to use mutate()

mutate(df, new_variable = <expression of existing variables>, .keep = c("all", "used", "unused", "none"), .before = NULL, .after = NULL)

Arguments:
df: the data frame to modify
new_variable: the name of the new variable
.keep: an experimental argument that controls which columns of df are retained in the output:

  • "all" (the default) retains all columns.
  • "used" retains only the columns used to create the new variables; this is useful for checking your work, because inputs and outputs appear side by side.
  • "unused" retains only the existing columns that were not used to create the new variables.
  • "none" retains only the grouping keys and the new variables (like transmute()).

Grouping variables are always kept, regardless of .keep.
.before, .after: optionally control where the new columns appear (by default they are added on the right-hand side).
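
The rule that grouping variables survive any .keep setting can be checked with a small sketch. The data frame and the column names g, x, y and x_centered are made up for this example.

```r
library(dplyr)

# Hypothetical data: g is the grouping column, x and y are ordinary columns.
df <- tibble(g = c("a", "a", "b"), x = c(1, 2, 3), y = c(10, 20, 30))

out <- df %>%
  group_by(g) %>%
  # Centre x within each group; .keep = "none" drops x and y
  # but cannot drop the grouping column g.
  mutate(x_centered = x - mean(x), .keep = "none")

names(out)
```

The result should contain only g and x_centered: x and y are dropped by .keep = "none", while g is retained because it is a grouping key.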

Examples


library(dplyr)

# By default, new columns are placed on the far right.
# Experimental: you can override this with `.before` or `.after`.
df <- tibble(x = 1, y = 2)
df %>% mutate(z = x + y)
# # A tibble: 1 x 3
#       x     y     z
#   <dbl> <dbl> <dbl>
# 1     1     2     3

# Place the new column before the first column.
df %>% mutate(z = x + y, .before = 1)
# # A tibble: 1 x 3
#       z     x     y
#   <dbl> <dbl> <dbl>
# 1     3     1     2

# Place the new column immediately after x.
df %>% mutate(z = x + y, .after = x)
# # A tibble: 1 x 3
#       x     z     y
#   <dbl> <dbl> <dbl>
# 1     1     3     2

# By default, mutate() keeps all columns from the input data.
# Experimental: you can override this with `.keep`.
df <- tibble(x = 1, y = 2, a = "a", b = "b")
df %>% mutate(z = x + y, .keep = "all") # the default
# # A tibble: 1 x 5
#       x     y a     b         z
#   <dbl> <dbl> <chr> <chr> <dbl>
# 1     1     2 a     b         3

df %>% mutate(z = x + y, .keep = "used")
# # A tibble: 1 x 3
#       x     y     z
#   <dbl> <dbl> <dbl>
# 1     1     2     3

df %>% mutate(z = x + y, .keep = "unused")
# # A tibble: 1 x 3
#   a     b         z
#   <chr> <chr> <dbl>
# 1 a     b         3

df %>% mutate(z = x + y, .keep = "none") # like transmute()
# # A tibble: 1 x 1
#       z
#   <dbl>
# 1     3
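
The expressions passed to mutate() are evaluated in order, so a later expression can refer to a column created earlier in the same call. The sketch below combines this with .keep = "used" to inspect inputs and outputs together; the column names w and a are made up for illustration.

```r
library(dplyr)

df <- tibble(x = 1, y = 2, a = "a")

out <- df %>%
  mutate(
    z = x + y,      # uses the existing columns x and y
    w = z * 2,      # uses z, created on the previous line
    .keep = "used"  # drop columns that no expression referred to
  )

out
```

Here the unused column a should be dropped, leaving x, y, z and w.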