Learning R (4): Operating on Data

EverestRs

Category: Introduction to Big Data. Tags: R, big data, matrix

Article link: https://blog.csdn.net/everestrs/article/details/83217245

Checking the type of a variable

is.character(x)       # is x of character type?
is.numeric(x)         # is x of numeric type?
is.vector(x)          # is x a vector?
is.matrix(x)          # is x a matrix?
is.array(x)           # is x an array?
is.data.frame(x)      # is x a data frame?
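
As a small sketch of how these predicates behave, the sample values below are illustrative and not taken from the original post:

```r
# Sample values chosen purely for illustration
s  <- "China"
n  <- c(1, 3, 6)
m  <- matrix(1:6, nrow = 2)
df <- data.frame(id = 1:2)

is.character(s)    # TRUE
is.numeric(n)      # TRUE
is.vector(n)       # TRUE
is.vector(m)       # FALSE: a matrix carries a dim attribute
is.matrix(m)       # TRUE
is.array(m)        # TRUE: a matrix is a two-dimensional array
is.data.frame(df)  # TRUE
is.matrix(df)      # FALSE
```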

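Converting between classes

as.numeric()          # convert to numeric
as.logical()          # convert to logical
as.character()        # convert to character string
as.matrix()           # convert to a matrix
as.data.frame()       # convert to a data frame
as.factor()           # convert to a factor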
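
A few conversions on illustrative inputs (the values are assumptions made for this sketch) show what happens when a value cannot be converted, and why converting a factor to numeric can surprise:

```r
as.numeric("3.14")                          # 3.14
suppressWarnings(as.numeric("abc"))         # NA (a warning is raised without suppressWarnings)
as.logical(c("TRUE", "false", "T", "yes"))  # TRUE FALSE TRUE NA
as.character(123)                           # "123"

f <- as.factor(c("b", "a", "b"))
levels(f)                                   # "a" "b"
as.numeric(f)                               # 2 1 2: the integer codes, not the labels
as.character(f)                             # "b" "a" "b": the labels
```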

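Creating a vector

# Character:
character <- c("China", "UK", "USA", "France", "Russia")
# Numeric:
numeric <- c(1, 3, 6, 7, 3, 8, 6, 4)
# Logical (TRUE/FALSE are spelled out because T and F are ordinary variables that can be reassigned):
logical <- c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE)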

Functions for generating vectors

c(), rep(), seq(), and the ":" operator

c()

> x<-c(1,2,3,4)
> x
[1] 1 2 3 4

rep()

rep(x, times = )
# Arguments:
x       a vector or factor
times   number of repetitions

> x<-rep(1,times=3)
> x
[1] 1 1 1

seq()

> x<-seq(from=3, to=21, by=3 )
> x
[1]  3  6  9 12 15 18 21

":"

> x<-1:10
> x
 [1]  1  2  3  4  5  6  7  8  9 10
 
> x<-10:1
> x
 [1] 10  9  8  7  6  5  4  3  2  1

Combining these functions with vector arguments produces more complex vectors.
For example:

> x<-rep(1:2,c(10,15))
> x
 [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
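
A few further combinations, given here as an illustrative sketch (the each and length.out arguments are standard base R but are not used in the original post):

```r
rep(c("a", "b"), times = 2)   # "a" "b" "a" "b": the whole vector is repeated
rep(c("a", "b"), each = 2)    # "a" "a" "b" "b": each element is repeated
rep(1:2, times = c(3, 1))     # 1 1 1 2: one repeat count per element
seq(0, 1, length.out = 5)     # 0.00 0.25 0.50 0.75 1.00
seq_len(4)                    # 1 2 3 4
rev(1:3)                      # 3 2 1, the same as 3:1
```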

Creating a matrix

matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE, dimnames = NULL)

> x <- matrix(1:20, nrow = 5, ncol = 4, byrow = TRUE)
> x
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
[4,]   13   14   15   16
[5,]   17   18   19   20

> is.matrix(x) 
[1] TRUE

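> dim(x)  # get or set the dimension vector of an array
[1] 5 4
Note: dim(x) can only be assigned a vector whose product equals the length of the object. For example, dim(x) <- c(4,4) fails with:
Error in dim(x) <- c(4, 4) : dims [product 16] do not match the length of object [20]
When the product matches the length, the assignment works: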
> x <- 1:20
> dim(x) <- c(5, 4)
> x
     [,1] [,2] [,3] [,4]
[1,]    1    6   11   16
[2,]    2    7   12   17
[3,]    3    8   13   18
[4,]    4    9   14   19
[5,]    5   10   15   20

> attributes(x)  # attributes() returns a list; here its first element is dim, which holds the vector (5, 4)
$dim
[1] 5 4
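
The rule for assigning to dim() can be checked with a short sketch. tryCatch() is used here only so that the failing assignment does not stop the script; the variable res is a hypothetical name for this example:

```r
x <- 1:20
dim(x) <- c(5, 4)   # 5 * 4 equals length(x), so this works
dim(x) <- c(4, 5)   # also fine: the product is still 20
dim(x)              # 4 5

# A product that differs from length(x) raises an error
res <- tryCatch({
  dim(x) <- c(4, 4)
  "ok"
}, error = function(e) "error")
res                 # "error"
dim(x)              # still 4 5: the failed assignment changed nothing
```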

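Extracting the diagonal elements

> diag(x)

Turning a matrix into an upper triangular matrix (the elements below the diagonal are set to NA)

> x[lower.tri(x)] <- NA

Turning a matrix into a lower triangular matrix (the elements above the diagonal are set to NA)

> x[upper.tri(x)] <- NA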
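
A minimal sketch on a 3x3 matrix makes the effect visible. The matrix m and the choice of 0 as the fill value in the second case are assumptions made for illustration:

```r
m <- matrix(1:9, nrow = 3)      # filled by column: 1:3, 4:6, 7:9
diag(m)                         # 1 5 9

upper <- m
upper[lower.tri(upper)] <- NA   # keep the diagonal and everything above it
upper

lower <- m
lower[upper.tri(lower)] <- 0    # use 0 instead of NA when zeros are preferred
lower
```

Working on copies (upper, lower) leaves the original m intact, which is useful when both triangles are needed.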

Converting a matrix to a data frame

as.data.frame(x)

Creating a list

A list is a generic vector whose elements can be objects of different types.

> x<-list(1,"A",FALSE,5+6i)
> x
[[1]]
[1] 1

[[2]]
[1] "A"

[[3]]
[1] FALSE

[[4]]
[1] 5+6i

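# List elements can also be accessed with single brackets, which return a one-element list (use x[[1]] to get the element itself)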
> x[1]
[[1]]
[1] 1

> x[2]
[[1]]
[1] "A"

> x[3]
[[1]]
[1] FALSE

> x[4]
[[1]]
[1] 5+6i
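
The difference between single and double brackets is easiest to see by checking the class of the result. In this sketch the named list p and its field names are hypothetical:

```r
x <- list(1, "A", FALSE, 5+6i)
class(x[2])       # "list": single brackets return a sub-list
class(x[[2]])     # "character": double brackets return the element itself

# Named elements can also be reached with $
p <- list(name = "Devin", scores = c(90, 85))
p$name            # "Devin"
p[["scores"]][2]  # 85
```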

Creating an array

array(data = NA, dim = length(data), dimnames = NULL)

> x<-array(2:6,c(2,4))  # build a 2x4 array from the values 2 to 6; the five values are recycled to fill all eight cells
> x
     [,1] [,2] [,3] [,4]
[1,]    2    4    6    3
[2,]    3    5    2    4
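
Arrays are not limited to two dimensions. The following sketch, with an illustrative array a, shows a three-dimensional array and how its elements are indexed:

```r
a <- array(1:24, dim = c(2, 3, 4))  # 2 rows, 3 columns, 4 layers
dim(a)         # 2 3 4
a[2, 3, 1]     # 6: row 2, column 3 of the first layer
a[1, 1, 2]     # 7: the second layer starts after the first six values
dim(a[, , 1])  # 2 3: a single layer is an ordinary matrix
```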

Creating a data frame

data.frame():

data.frame(..., row.names = NULL, check.rows = FALSE,
           check.names = TRUE, fix.empty.names = TRUE,
           stringsAsFactors = default.stringsAsFactors())

For example:

> student<-data.frame(ID=c(11,12,13),Name=c("Devin","Edward","Wenli"),Gender=c("M","M","F"),Birthdate=c("1984-12-29","1983-5-6","1986-8-8"))
> student
  ID   Name Gender  Birthdate
1 11  Devin      M 1984-12-29
2 12 Edward      M   1983-5-6
3 13  Wenli      F   1986-8-8
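
Once the data frame exists, columns and rows can be selected and converted. This is a sketch built on the student example; stringsAsFactors = FALSE is passed explicitly as an assumption so that text columns stay character in older R versions as well:

```r
student <- data.frame(
  ID = c(11, 12, 13),
  Name = c("Devin", "Edward", "Wenli"),
  Gender = c("M", "M", "F"),
  Birthdate = c("1984-12-29", "1983-5-6", "1986-8-8"),
  stringsAsFactors = FALSE
)

student$Name                                      # one column, as a vector
males <- student[student$Gender == "M", ]         # rows where Gender is "M"
student$Birthdate <- as.Date(student$Birthdate)   # parse the date strings
str(student)                                      # compact overview of column types
```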

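Combining vectors by column or by row with cbind() and rbind() (on plain vectors the result is a matrix; wrap it in as.data.frame() to get a data frame)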

> x<-1:3
> x
[1] 1 2 3
> y<-10:12
> y
[1] 10 11 12
> cbind(x,y)
     x  y
[1,] 1 10
[2,] 2 11
[3,] 3 12
> rbind(x,y)
  [,1] [,2] [,3]
x    1    2    3
y   10   11   12
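
To end up with an actual data frame, the combined matrix can be converted, or the vectors can be passed to data.frame() directly. A minimal sketch, where the names m, df, df2 and more are hypothetical:

```r
x <- 1:3
y <- 10:12

m <- cbind(x, y)
is.matrix(m)              # TRUE: cbind() on vectors gives a matrix

df  <- as.data.frame(m)   # columns x and y
df2 <- data.frame(x, y)   # built directly from the vectors

# rbind() on data frames appends rows when the column names match
more <- rbind(df, data.frame(x = 4L, y = 13L))
nrow(more)                # 4
```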

Viewing the first six rows of the data

> head(x)

Dropping a row

> x[-1,]

Dropping a column

> x[,-1]

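Getting or setting the row names

rownames(x)
rownames(x) <- c('a','b','c','d','e')   # the length must equal nrow(x)

Getting or setting the column names

colnames(x)
colnames(x) <- c('a','b','c','d','e')   # the length must equal ncol(x)

If every column of a data frame is numeric, the row sums can be computed with

rowSums(x)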
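
A short sketch with a hypothetical scores data frame shows rowSums() together with its relatives and the effect of missing values:

```r
scores <- data.frame(math = c(90, 80), physics = c(70, NA))

rowSums(scores)                  # 160 NA: an NA in a row makes the sum NA
rowSums(scores, na.rm = TRUE)    # 160 80
colSums(scores, na.rm = TRUE)    # math 170, physics 70
rowMeans(scores, na.rm = TRUE)   # 80 80
```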

Reading data

The most commonly used functions are read.table() and read.csv().

read.table(file, header = FALSE, sep = "", quote = "\"'",
           dec = ".", numerals = c("allow.loss", "warn.loss", "no.loss"),
           row.names, col.names, as.is = !stringsAsFactors,
           na.strings = "NA", colClasses = NA, nrows = -1,
           skip = 0, check.names = TRUE, fill = !blank.lines.skip,
           strip.white = FALSE, blank.lines.skip = TRUE,
           comment.char = "#",
           allowEscapes = FALSE, flush = FALSE,
           stringsAsFactors = default.stringsAsFactors(),
           fileEncoding = "", encoding = "unknown", text, skipNul = FALSE)

read.csv(file, header = TRUE, sep = ",", quote = "\"",
         dec = ".", fill = TRUE, comment.char = "", ...)

read.csv(file.choose())  # pick the file interactively from a file dialog
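
For a non-interactive example, the sketch below writes a small CSV file to a temporary path and reads it back. The file contents and the names path, d and d2 are assumptions made for illustration:

```r
path <- tempfile(fileext = ".csv")
writeLines(c("id,name,score", "1,Devin,90", "2,Wenli,NA"), path)

d <- read.csv(path, stringsAsFactors = FALSE)
str(d)   # id and score are read as integers, name as character

# read.table() needs header and sep spelled out for the same file
d2 <- read.table(path, header = TRUE, sep = ",", stringsAsFactors = FALSE)
same <- isTRUE(all.equal(d, d2))   # compare the two results
```

The string "NA" in the file is read as a missing value because na.strings defaults to "NA".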

Writing data

write.table(x, file = "", append = FALSE, quote = TRUE, sep = " ",
            eol = "\n", na = "NA", dec = ".", row.names = TRUE,
            col.names = TRUE, qmethod = c("escape", "double"),
            fileEncoding = "")

write.csv() and write.csv2() are used in much the same way as write.table().

write.csv(data, file = "data.csv")

To drop the row names, pass row.names = FALSE (or the shorthand row.names = F).

write.csv(data, file = "data.csv", row.names = FALSE)

If the data contains NA values that should be written as empty fields, also pass na = "".

write.csv(data, file = "data.csv", row.names = FALSE, na = "")

To omit the column names, use write.table() with col.names = FALSE and sep = ",".

write.table(data, file = "data.csv", row.names = FALSE, na = "", col.names = FALSE, sep = ",")
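
The effect of these options can be checked by writing to a temporary file and reading the raw lines back. This is a sketch in which the sample data frame and the names path, with_header and no_header are hypothetical:

```r
data <- data.frame(id = 1:2, name = c("Devin", NA))
path <- tempfile(fileext = ".csv")

# Header row plus two data rows; the NA becomes an empty field
write.csv(data, file = path, row.names = FALSE, na = "")
with_header <- readLines(path)
length(with_header)   # 3

# Same data without the header row
write.table(data, file = path, row.names = FALSE, na = "",
            col.names = FALSE, sep = ",")
no_header <- readLines(path)
length(no_header)     # 2
```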
