Practicing string tricks, with warning messages as the example

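I was actually preparing a class on downstream transcriptome analysis. Installing one package produced a pile of errors, followed by a neat stack of warnings.

non-zero exit status is a very common problem. The fix is to go to the installation path, where each package is a folder, find and delete the folder of every package named in the messages, then rerun the installation code.
I wrote an earlier post about this problem:
Mac hides the path from you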
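
The fix described above can be sketched in a few lines of R. This is only a minimal sketch: remove_pkg_dir is a hypothetical helper name, and it assumes the broken package lives in the first entry of .libPaths(). The demo runs against a throwaway folder so that no real package is touched.

```r
# Hypothetical helper: delete one package's folder from a library directory.
remove_pkg_dir <- function(pkg, lib = .libPaths()[1]) {
  pkg_dir <- file.path(lib, pkg)
  if (dir.exists(pkg_dir)) {
    unlink(pkg_dir, recursive = TRUE)
  }
  !dir.exists(pkg_dir)  # TRUE when the folder is gone
}

# Demo on a fake library inside tempdir(), not on the real one.
fake_lib <- file.path(tempdir(), "fake_lib")
dir.create(file.path(fake_lib, "zip"), recursive = TRUE)
removed <- remove_pkg_dir("zip", lib = fake_lib)

# On a real library you would then rerun the installation, e.g.:
# remove_pkg_dir("zip"); install.packages("zip")
```

Check .libPaths() first if you are unsure which library the package was installed into.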

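However, I recently reviewed my R basics lecture notes and found that the stringr package from tidyverse has no practice exercises, so today's warnings come in handy.

Exercise

It's simple: extract all the package names from the warning messages in the screenshot above.

Solution

step1: assignment

We take the warning text, copied as-is, as one long string; that works fine. For strings, single and double quotes are essentially interchangeable, but not when quotes are nested. That's the first pitfall.

With double quotes you get:
a flood of errors, and the syntax highlighting looks off too.

That's because the long string itself contains double quotes. Fortunately it contains no straight single quotes (the ‘ and ’ around the package names are curly quotes, which R does not treat as string delimiters), so we can wrap the whole string in single quotes and the assignment succeeds.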

x='3: In install.packages(update[instlib == l, "Package"], l, repos = repos,  :
  installation of package ‘BiocParallel’ had non-zero exit status
4: In install.packages(update[instlib == l, "Package"], l, repos = repos,  :
  installation of package ‘feather’ had non-zero exit status
5: In install.packages(update[instlib == l, "Package"], l, repos = repos,  :
  installation of package ‘geometry’ had non-zero exit status
6: In install.packages(update[instlib == l, "Package"], l, repos = repos,  :
  installation of package ‘ggraph’ had non-zero exit status
7: In install.packages(update[instlib == l, "Package"], l, repos = repos,  :
  installation of package ‘RcppArmadillo’ had non-zero exit status
8: In install.packages(update[instlib == l, "Package"], l, repos = repos,  :
  installation of package ‘vegan’ had non-zero exit status
9: In install.packages(update[instlib == l, "Package"], l, repos = repos,  :
  installation of package ‘zip’ had non-zero exit status'
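
As a small aside on the quoting pitfall, here is a sketch of three equivalent ways to write a string that contains double quotes. The variable names are made up for illustration, and the raw-string form assumes R 4.0.0 or later.

```r
# Single quotes outside, double quotes inside: no escaping needed.
a <- 'update[instlib == l, "Package"]'

# Double quotes outside: the inner double quotes must be escaped.
b <- "update[instlib == l, \"Package\"]"

# Raw string syntax (R >= 4.0.0): nothing needs escaping.
r_raw <- r"(update[instlib == l, "Package"])"

same_ab  <- identical(a, b)
same_raw <- identical(a, r_raw)
```

All three produce the same character vector; only the source-code notation differs.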
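step2: split the long string

Split the whole long string into words on spaces.

# install stringr if it is missing (the package name must be quoted)
if(!require(stringr)) install.packages("stringr")
library(stringr)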
str_split(x," ") %>% 
  unlist() %>% 
  head(50)
# [1] "3:"                              "In"                             
# [3] "install.packages(update[instlib" "=="                             
# [5] "l,"                              "\"Package\"],"                  
# [7] "l,"                              "repos"                          
# [9] "="                               "repos,"                         
# [11] ""                                ":\n"                            
# [13] ""                                "installation"                   
# [15] "of"                              "package"                        
# [17] "‘BiocParallel’"                  "had"                            
# [19] "non-zero"                        "exit"                           
# [21] "status\n4:"                      "In"                             
# [23] "install.packages(update[instlib" "=="                             
# [25] "l,"                              "\"Package\"],"                  
# [27] "l,"                              "repos"                          
# [29] "="                               "repos,"                         
# [31] ""                                ":\n"                            
# [33] ""                                "installation"                   
# [35] "of"                              "package"                        
# [37] "‘feather’"                       "had"                            
# [39] "non-zero"                        "exit"                           
# [41] "status\n5:"                      "In"                             
# [43] "install.packages(update[instlib" "=="                             
# [45] "l,"                              "\"Package\"],"                  
# [47] "l,"                              "repos"                          
# [49] "="                               "repos," 
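
Splitting on a single space leaves empty strings and tokens with a fused line break such as "status\n4:". One possible variation, shown here as a sketch rather than the approach this post uses, is to split on runs of whitespace instead. sample_x is a shortened stand-in for x, and the curly quotes are written as \u2018 and \u2019 escapes only to keep the snippet independent of file encoding.

```r
library(stringr)

lq <- "\u2018"  # left curly single quote
rq <- "\u2019"  # right curly single quote

# Shortened stand-in for the long warning string x
sample_x <- paste0(
  "3: In install.packages(update[instlib == l, \"Package\"], l, repos = repos,  :\n",
  "  installation of package ", lq, "zip", rq, " had non-zero exit status"
)

# "\\s+" matches one or more spaces, tabs or line breaks
words <- str_split(sample_x, "\\s+")[[1]]
words
```

With this pattern the double spaces and line breaks collapse into single separators, so no empty elements appear.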
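step3: extract the package names

All the package names have one thing in common: each is wrapped in a pair of curly single quotes (‘ and ’). The pattern can be written as ^‘, meaning the element starts with a left single quote.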

str_split(x," ") %>% 
  unlist() %>% 
  str_subset("^‘")
#[1] "‘BiocParallel’"  "‘feather’"       "‘geometry’"      "‘ggraph’"       
#[5] "‘RcppArmadillo’" "‘vegan’"         "‘zip’"     
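
To make clear what str_subset does here, the sketch below compares it with the equivalent str_detect plus indexing. The words vector is a made-up miniature of the split result, and the quotes are again written as Unicode escapes.

```r
library(stringr)

lq <- "\u2018"  # left curly single quote
rq <- "\u2019"  # right curly single quote

# Made-up miniature of the split result
words <- c("package", paste0(lq, "vegan", rq), "had", paste0(lq, "zip", rq))
pattern <- paste0("^", lq)  # "starts with a left curly quote"

kept <- str_subset(words, pattern)          # returns the matching elements
same <- words[str_detect(words, pattern)]   # logical test, then indexing

identical(kept, same)
```

str_subset is simply the shorter way to write the detect-then-index idiom.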

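If you're a perfectionist, there's one more thing to do: strip the single quotes.

step4: remove the single quotes

The function to use is str_replace_all, because str_replace replaces only the first match in each string. The pattern [’‘] matches either the left or the right quote, and replacing with the empty string "" deletes it.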

str_split(x," ") %>% 
  unlist() %>% 
  str_subset("^‘") %>% 
  str_replace_all("[’‘]","")
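
The difference between the two functions is easy to see on a single element. This is a small sketch with illustrative variable names; str_remove_all is stringr's shorthand for replacing with an empty string.

```r
library(stringr)

lq <- "\u2018"  # left curly single quote
rq <- "\u2019"  # right curly single quote

quoted <- paste0(lq, "zip", rq)
quote_class <- paste0("[", lq, rq, "]")  # either curly quote

first_only <- str_replace(quoted, quote_class, "")      # removes only the left quote
all_gone   <- str_replace_all(quoted, quote_class, "")  # removes both quotes
shorthand  <- str_remove_all(quoted, quote_class)       # same as the line above
```

str_replace leaves the trailing right quote in place, which is why the solution needs the _all variant.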

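Second solution

# turn every right quote into a left quote, then split on the left quote
y = x %>%
  str_replace_all("’","‘") %>% 
  str_split("‘") %>% 
  unlist()
# the package names are the only pieces shorter than 20 characters
y[str_length(y)<20]  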
#[1] "BiocParallel"  "feather"       "geometry"      "ggraph"        "RcppArmadillo"
#[6] "vegan"         "zip"   

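str_split takes a single pattern, so this solution first replaces the right quote with the left one and then splits on it. Since the pattern is a regular expression, a character class such as [‘’] would also cover both quotes in one step.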
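
A possible variation on this second solution, sketched under the assumption that every package name sits between exactly one left and one right curly quote: split on both quotes at once and take every second piece, which avoids the length threshold. sample_x is a shortened stand-in for x.

```r
library(stringr)

lq <- "\u2018"  # left curly single quote
rq <- "\u2019"  # right curly single quote

# Shortened stand-in for the long warning string x
sample_x <- paste0(
  "installation of package ", lq, "vegan", rq, " had non-zero exit status\n",
  "installation of package ", lq, "zip", rq, " had non-zero exit status"
)

# Split on either quote; pieces alternate between outside and inside the quotes
pieces <- str_split(sample_x, paste0("[", lq, rq, "]"))[[1]]

# The even-numbered pieces are the quoted parts
# (assumes at least one quoted name is present)
pkgs <- pieces[seq(2, length(pieces), by = 2)]
pkgs
```

This relies on position rather than string length, so a very long package name would still be picked up.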

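Is there a third solution? My regular expression skills are limited and I don't know how to express "the characters between two quotes", but I feel it must be doable. If you happen to know how, please tell me.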
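
One possible answer to the question above, offered as a sketch and not as the definitive way: stringr's regular expressions support lookarounds, so "the characters between two quotes" can be written as a lookbehind for the left quote, a run of non-quote characters, and a lookahead for the right quote. A capture group with str_match_all is an alternative. sample_x is a shortened stand-in for x.

```r
library(stringr)

lq <- "\u2018"  # left curly single quote
rq <- "\u2019"  # right curly single quote

# Shortened stand-in for the long warning string x
sample_x <- paste0(
  "installation of package ", lq, "vegan", rq, " had non-zero exit status\n",
  "installation of package ", lq, "zip", rq, " had non-zero exit status"
)

# (?<=‘) preceded by a left quote; [^’]+ anything but a right quote; (?=’) followed by one
lookaround <- paste0("(?<=", lq, ")[^", rq, "]+(?=", rq, ")")
pkgs <- str_extract_all(sample_x, lookaround)[[1]]

# Alternative: match the quotes too, and keep only the capture group (column 2)
grouped <- paste0(lq, "([^", rq, "]+)", rq)
pkgs2 <- str_match_all(sample_x, grouped)[[1]][, 2]
```

Both versions skip the split step entirely and return the names without quotes.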

Author: 小洁忘了怎么分身
