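Data Processing with stringr
Preface:
In the data processing stage, the dplyr package does most of the work. As data becomes more varied and complex, however, string handling grows increasingly important, and base R's string facilities are limited and inconvenient to use. The stringr package covers most common string processing needs: it simplifies converting, searching, detecting, locating, matching, replacing, extracting, and splitting strings in R, and it also wraps some more complex string processing functions.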
I. String concatenation functions
1. word() function: extract words from a sentence
- Usage:
word(string, start = 1L, end = start, sep = fixed(" "))
# sep: the separator between words; defaults to a single space
- Simple example:
library(stringr)
data<-'Using R programming to work for data science'
# Extract the last two words (negative positions count from the end)
word(data,-2:-1)
## [1] "data" "science"
# Starting from the first word, extract the first three words
word(data,start=1,end=3)
## [1] "Using R programming"
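The sep argument described above can be illustrated with a separator other than a space. The following is a minimal sketch; the date string and the variable names are hypothetical sample values, not taken from the examples above.

```r
library(stringr)

# Hypothetical sample value used only for illustration
date_string <- "2024-03-15"

# Split on "-" instead of a space and take the first piece
year <- word(date_string, 1, sep = fixed("-"))

# A negative position counts from the end: -1 is the last piece
day <- word(date_string, -1, sep = fixed("-"))

# start and end together select a range of pieces, still joined by the separator
year_month <- word(date_string, 1, 2, sep = fixed("-"))

print(c(year, day, year_month))
```

Wrapping the separator in fixed() treats it as a literal string rather than a regular expression, which matters for separators such as "." or "|".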
2. str_wrap() function: wrap paragraphs to a given line width
- Usage:
str_wrap(string,width=80,indent=0,exdent=0)
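# width: the target width of each line
# indent: indentation of the first line of each paragraph; defaults to none
# exdent: indentation of every line after the first in each paragraph; defaults to none
- Simple example: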
string<-"New York is 3 hours ahead of California, but it does not make California slow. Someone graduated at age of 22, but waited 5 years before securing a good job!"
str_wrap(string,width=80,indent=4)
## [1] " New York is 3 hours ahead of California, but it does not make California slow.\nSomeone graduated at age of 22, but waited 5 years before securing a good job!"
# \n is the newline character
# cat() prints the string and renders each \n escape as an actual line break
cat(str_wrap(string,indent=4),sep="\n")
## New York is 3 hours ahead of California, but it does not make California slow.
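To make the effect of exdent visible, here is a small sketch that wraps a sentence to a narrow width and then inspects the resulting lines. The width of 30 and the variable names are arbitrary choices for illustration.

```r
library(stringr)

text <- "New York is 3 hours ahead of California, but it does not make California slow."

# A narrow width forces several line breaks;
# exdent = 2 indents every line after the first by two spaces
wrapped <- str_wrap(text, width = 30, indent = 0, exdent = 2)

# cat() renders the embedded \n characters as real line breaks
cat(wrapped, "\n")

# Split on the inserted newlines to look at each line separately
lines <- str_split(wrapped, fixed("\n"))[[1]]

# Number of characters per line, including the indentation
print(nchar(lines))
```

Because str_wrap() returns a single string with embedded newlines, splitting on "\n" is a convenient way to check how the text was actually broken up.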