R
小林书店副编集
R: reshaping tables with pivot_longer
relig_income
#> # A tibble: 18 x 11
#>   religion `<$10k` `$10-20k` `$20-30k` `$30-40k` `$40-50k` `$50-75k` `$75-100k`
#>   <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <...
Original · 2020-07-01 00:17:44 · 6812 views · 0 comments
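A minimal sketch of the wide-to-long reshape that this post is about. The tibble `wide` and its counts are made up for illustration (they are not the real relig_income data), and the column names `income` and `count` are assumptions chosen for the example.

```r
library(tidyr)
library(tibble)

# Hypothetical miniature of a wide income table; the counts are invented
wide <- tibble(
  religion  = c("A", "B"),
  `<$10k`   = c(27, 12),
  `$10-20k` = c(34, 27)
)

# Every column except religion becomes a (name, value) pair
long <- pivot_longer(
  wide,
  cols      = !religion,
  names_to  = "income",
  values_to = "count"
)
long
```

Each input row yields one output row per income column, so two rows with two income columns become four rows.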
R: bar charts with ggplot2
# library
library(ggplot2)
# create a dataset
specie <- c(rep("sorgho", 3), rep("poacee", 3), rep("banana", 3), rep("triticum", 3))
condition <- rep(c("normal", "stress", "Nitrogen"), 4)
value <- abs(rnorm(12, 0, 15))
data <- da...
Original · 2020-07-01 00:12:39 · 926 views · 0 comments
R: ggplot2 notes
legend
Setting the legend position and title
ggplot(df, aes(x, y, colour = g)) +
  geom_line(stat = "identity") +
  theme(legend.position = "bottom") +
  theme(legend.title = element_blank())
Original · 2020-07-01 00:07:44 · 215 views · 0 comments
R: dplyr select helper functions
Tidyverse selections implement a dialect of R where operators make it easy to select variables:
: for selecting a range of consecutive variables.
! for taking the complement of a set of variables.
& and | for selecting the intersection or the union of...
Original · 2020-06-30 03:07:15 · 665 views · 0 comments
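The operators listed above can be tried on a toy table. This is only a sketch: the tibble `df` and its column names are hypothetical, and `starts_with()` / `ends_with()` are used here simply to have two sets to intersect and unite.

```r
library(dplyr)

# Hypothetical table used only to exercise the selection operators
df <- tibble(id = 1:2, a_x = c(1, 2), a_y = c(3, 4), b_x = c("u", "v"))

range_cols <- names(select(df, a_x:b_x))                              # consecutive range
not_id     <- names(select(df, !id))                                  # complement
both       <- names(select(df, starts_with("a") & ends_with("x")))    # intersection
either     <- names(select(df, starts_with("a") | ends_with("x")))    # union

range_cols
not_id
both
either
```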
R: updating all packages at once
install.packages(
  lib = lib <- .libPaths()[1],
  pkgs = as.data.frame(installed.packages(lib), stringsAsFactors = FALSE)$Package,
  type = 'source'
)
https://www.r-bloggers.com/update-all-user-installed-r-packages-again/
Original · 2020-06-29 20:19:07 · 3793 views · 0 comments
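R: glue version conflict
namespace ‘glue’ is imported by ‘tidyselect’, ‘dplyr’ so cannot be unloaded
The error arises because dplyr and tidyselect require different versions of glue. Updating glue to the latest version resolves it; this may require compiling from source.
Original · 2020-06-29 20:10:31 · 572 views · 0 comments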
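R: high-performance random forest libraries
The implementation most faithful to Leo Breiman's algorithm is randomForest, but that library does not support parallelism and its performance is relatively poor. There are two optimized alternatives, both of which support parallel computation:
ranger
Rborist
Original · 2020-06-29 19:45:32 · 173 views · 0 comments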
R: normalizing/rescaling all columns
The official example is as follows:
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}
df <- tibble(x = 1:4, y = rnorm(4))
df %>% mutate(across(where(is.numeric), rescale01))
#> # A tibble: 4 x 2
#>   x y
#> ...
Original · 2020-06-29 19:18:02 · 1662 views · 0 comments
R: random forests out-of-bag prediction
out-of-bag prediction
Created: Jun 29, 2020 12:22 PM
Updated: Jun 29, 2020 12:28 PM
https://stackoverflow.com/questions/25153276/difference-of-prediction-results-in-random-forest-model
https://stats.stackexchange.com/questions/412479/difference-between-the
Original · 2020-06-29 19:11:55 · 422 views · 0 comments
R: the coalesce function
Two main uses.
Replacing NA
x <- c(2, 1, NA, 5, 3, NA)   # Create example vector
coalesce(x, 999)             # Apply coalesce function
# 2 1 999 5 3 999
The NA values in x have been replaced with 999.
Comparing and filling
This use is the more useful one.
y <- c(1, NA, 7, NA, 7, NA)  # Create second example vector
co...
Original · 2020-06-29 03:11:39 · 1200 views · 0 comments
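The excerpt is cut off before the second use is shown, so the following is a sketch of how the "compare and fill" case is commonly written, not necessarily the post's own code. The vectors `x` and `y` are the ones defined above; `filled` and `filled_default` are names introduced here.

```r
library(dplyr)

x <- c(2, 1, NA, 5, 3, NA)
y <- c(1, NA, 7, NA, 7, NA)

# Position by position, take the first value that is not NA
filled <- coalesce(x, y)

# A scalar as the last argument acts as a final fallback
filled_default <- coalesce(x, y, 999)

filled
filled_default
```

Only the last position stays NA in `filled`, because both vectors are missing there; `filled_default` replaces it with 999.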
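R: the case_when function
case_when
There are two key points:
When no condition matches, it returns NA instead of leaving the value unchanged.
Conditions are evaluated in order, so the order matters.
Take the following code:
x <- 1:50
case_when(
  x %% 35 == 0 ~ "fizz buzz",
  x %% 5 == 0 ~ "fizz",
  x %% 7 == 0 ~ "buzz",
  TRUE ~ as.character(x)
)
Without TRUE ~ as.character(x), it returns
[1] NA NA NA...
Original · 2020-06-29 02:15:12 · 10109 views · 0 comments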
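Both points can be checked on a shorter vector. This is a small sketch using the same conditions as the code above; the four-element `x` and the variable names `no_fallback` and `wrong_order` are chosen here for illustration.

```r
library(dplyr)

x <- c(1, 5, 7, 35)

# Point 1: without a catch-all branch, unmatched values become NA
no_fallback <- case_when(
  x %% 35 == 0 ~ "fizz buzz",
  x %% 5 == 0 ~ "fizz",
  x %% 7 == 0 ~ "buzz"
)

# Point 2: the first matching branch wins, so putting the most specific
# test last means 35 is captured by the x %% 5 branch
wrong_order <- case_when(
  x %% 5 == 0 ~ "fizz",
  x %% 7 == 0 ~ "buzz",
  x %% 35 == 0 ~ "fizz buzz",
  TRUE ~ as.character(x)
)

no_fallback
wrong_order
```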
R: semi_join() and anti_join()
Filtering joins filter rows from x based on the presence or absence of matches in y:
semi_join() return all rows from x with a match in y.
anti_join() return all rows from x without a match in y.
# "Filtering" joins keep cases from the LHS
band_members %...
Original · 2020-06-28 22:38:15 · 2333 views · 0 comments
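A short sketch of both filtering joins on the band_members and band_instruments datasets that ship with dplyr. The explicit `by = "name"` and the variable names `with_match` and `without_match` are choices made here, not taken from the post.

```r
library(dplyr)

# Rows of band_members whose name also appears in band_instruments
with_match <- semi_join(band_members, band_instruments, by = "name")

# Rows of band_members whose name does not appear in band_instruments
without_match <- anti_join(band_members, band_instruments, by = "name")

with_match
without_match
```

Neither result gains columns from band_instruments; the second table is used only to decide which rows to keep.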
R: the nest_join() function
nest_join() returns all rows and columns in x with a new nested-df column that contains all matches from y. When there is no match, the list column is a 0-row tibble.
nest_join() is similar to left_join(), but the result has a different shape.
band_members %>% nest_join(band_instruments)
#>...
Original · 2020-06-28 22:33:44 · 667 views · 0 comments
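To see the nested shape, one can count the rows inside each element of the list column. This is a sketch on dplyr's bundled band_members and band_instruments data; the column name `instruments` (set through the `name` argument) and the helper variable `matches` are introduced here for illustration.

```r
library(dplyr)

nested <- nest_join(
  band_members, band_instruments,
  by = "name", name = "instruments"
)

# One entry per row of band_members: how many instrument rows matched
matches <- vapply(nested$instruments, nrow, integer(1))

nested
matches
```

A member with no match keeps its row and gets a 0-row tibble, which is where this differs from an inner join.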
R: combining rows/columns
bind_rows & bind_cols
These two functions are replacements for do.call(rbind, dfs) and do.call(cbind, dfs), and are more efficient to use.
one <- starwars[1:4, ]
two <- starwars[9:12, ]
bind_rows(one, two)
bind_rows(list(one, two)) # a list of dataframes
bind_rows(list(one, two), list(two, one))
bind_r...
Original · 2020-06-28 22:28:34 · 4290 views · 0 comments
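A compact sketch of both functions on hypothetical tibbles (`one`, `two` and their columns are made up here and are not the starwars slices used above). It also shows that bind_rows() matches columns by name and fills missing ones with NA.

```r
library(dplyr)

# Hypothetical inputs with partly different columns
one <- tibble(id = 1:2, name = c("a", "b"))
two <- tibble(id = 3L, score = 9.5)

# Columns are matched by name; gaps are filled with NA
stacked <- bind_rows(list(one, two))

# Named arguments plus .id record which input each row came from
labeled <- bind_rows(first = one, second = two, .id = "source")

# bind_cols() pastes tables side by side; row counts must agree
side <- bind_cols(one, tibble(flag = c(TRUE, FALSE)))

stacked
labeled
side
```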
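The summarise() regrouping output warning
The summarise() regrouping output warning
This concerns an attribute of the returned value, not group_by itself. Taking the post's code as an example, the message shown after summarise means that the grouping attribute of the returned tibble contains only homeworld, not species. In other words, the result is a tibble grouped at the homeworld level, and species has been dropped. With .groups = "drop", the result is a tibble without any grouping level...
Original · 2020-06-15 03:26:16 · 1751 views · 0 comments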
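The grouping left on the result can be inspected with group_vars(). This is a sketch on a hypothetical three-row table that mimics the homeworld/species grouping; the data, the `height` column and the variable names are assumptions, since the post's own code is not part of the excerpt.

```r
library(dplyr)

# Hypothetical stand-in for the starwars columns discussed above
df <- tibble(
  homeworld = c("A", "A", "B"),
  species   = c("x", "y", "x"),
  height    = c(1, 2, 3)
)

by_both <- df %>% group_by(homeworld, species)

# "drop_last" peels off the last grouping variable (species)
kept <- summarise(by_both, mean_height = mean(height), .groups = "drop_last")

# "drop" returns an ungrouped tibble
dropped <- summarise(by_both, mean_height = mean(height), .groups = "drop")

group_vars(kept)
group_vars(dropped)
```

Setting `.groups` explicitly also silences the message, because the grouping of the output is then no longer left to the default.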