- Checking the type of a variable
is.character(x) # is x of character type?
is.numeric(x) # is x numeric?
is.vector(x) # is x a vector?
is.matrix(x) # is x a matrix?
is.array(x) # is x an array?
is.data.frame(x) # is x a data frame?
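A minimal sketch of how these checks behave on a few sample objects. The objects v, m, and df are made up for illustration and are not part of the notes above.

```r
# Hypothetical sample objects, chosen only for illustration
v  <- c(1.5, 2, 3)
m  <- matrix(1:6, nrow = 2)
df <- data.frame(a = 1:2, b = c("x", "y"), stringsAsFactors = FALSE)

is.numeric(v)       # TRUE
is.vector(v)        # TRUE
is.matrix(m)        # TRUE
is.array(m)         # TRUE: a matrix is a two-dimensional array
is.vector(m)        # FALSE: the dim attribute means it is no longer a plain vector
is.data.frame(df)   # TRUE
is.character(df$b)  # TRUE, because stringsAsFactors = FALSE was passed explicitly
```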
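- Converting between types
as.numeric() # convert to numeric
as.logical() # convert to logical
as.character() # convert to character string
as.matrix() # convert to a matrix
as.data.frame() # convert to a data frame
as.factor() # convert to a factor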
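A short sketch of the conversions on made-up values. The factor case is worth a look: as.numeric() on a factor returns the internal level codes, so converting through as.character() first is the usual way to recover the original numbers.

```r
# Illustrative values only
num <- as.numeric("3.14")            # 3.14
lgl <- as.logical(c("TRUE", "no"))   # TRUE NA ("no" is not a recognized logical string)
chr <- as.character(42)              # "42"

f <- as.factor(c("10", "20", "10"))
codes  <- as.numeric(f)                 # 1 2 1  (level codes, not the values)
values <- as.numeric(as.character(f))   # 10 20 10
```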
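- Creating a vector
# Character:
character<-c("China", "UK", "USA", "France", "Russia")
# Numeric:
numeric<-c(1, 3, 6, 7, 3, 8, 6, 4)
# Logical (TRUE/FALSE is safer than T/F, which can be reassigned):
logical<-c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE)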
Functions for generating vectors
c(), rep(), seq(), ":"
c()
> x<-c(1,2,3,4)
> x
[1] 1 2 3 4
rep()
rep(x, times = )
# Arguments:
x a vector or factor
times number of repetitions
> x<-rep(1,times=3)
> x
[1] 1 1 1
seq()
> x<-seq(from=3, to=21, by=3 )
> x
[1] 3 6 9 12 15 18 21
":"
> x<-1:10
> x
[1] 1 2 3 4 5 6 7 8 9 10
> x<-10:1
> x
[1] 10 9 8 7 6 5 4 3 2 1
Combining these functions with vector arguments produces more complex vectors.
For example:
> x<-rep(1:2,c(10,15))
> x
[1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
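A few more variations, as a sketch: the each argument of rep(), the length.out argument of seq(), and seq_len() are standard base R, but the particular values are picked here only for illustration.

```r
a <- rep(1:2, times = 3)          # 1 2 1 2 1 2  (whole vector repeated)
b <- rep(1:2, each = 3)           # 1 1 1 2 2 2  (each element repeated)
s <- seq(0, 1, length.out = 5)    # 0.00 0.25 0.50 0.75 1.00
n <- seq_len(4)                   # 1 2 3 4
```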
- Creating a matrix
matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE, dimnames = NULL)
> x <- matrix(1:20,nrow=5,ncol=4,byrow=T)
> x
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 5 6 7 8
[3,] 9 10 11 12
[4,] 13 14 15 16
[5,] 17 18 19 20
> is.matrix(x)
[1] TRUE
> dim(x) # view or set the dimension vector of an array
[1] 5 4
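Note: dim(x) can only be assigned a dimension vector whose product equals length(x). For example, dim(x) <- c(4,4) fails with:
Error in dim(x) <- c(4, 4) : dims [product 16] do not match the length of object [20]
Setting dim on a plain vector of matching length works, and the values are filled in column by column: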
> x <- 1:20
> dim(x) <- c(5,4)
> x
[,1] [,2] [,3] [,4]
[1,] 1 6 11 16
[2,] 2 7 12 17
[3,] 3 8 13 18
[4,] 4 9 14 19
[5,] 5 10 15 20
> attributes(x) # attributes() returns a list; its first element is dim, which holds the vector (5,4)
$`dim`
[1] 5 4
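As a sketch of the rule behind dim assignment: any dimension vector is accepted as long as its product equals the length of the object. The tryCatch() wrapper below is added only so the failing case can be run without stopping the script.

```r
x <- 1:20
dim(x) <- c(4, 5)   # valid: 4 * 5 == 20, so x becomes a 4 x 5 matrix

# A mismatched product raises an error and leaves x unchanged
ok <- tryCatch({ dim(x) <- c(4, 4); TRUE }, error = function(e) FALSE)
ok       # FALSE
dim(x)   # still 4 5
```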
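Extract the diagonal elements
> diag(x)
Turn a matrix into an upper-triangular matrix (entries below the diagonal are set to NA)
> x[lower.tri(x)]<-NA
Turn a matrix into a lower-triangular matrix (entries above the diagonal are set to NA)
> x[upper.tri(x)]<-NA
Convert a matrix to a data frame
as.data.frame(x)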
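The triangle operations are easiest to follow on a small square matrix. The sketch below uses a made-up 3 x 3 matrix m and works on copies so the original stays intact.

```r
m <- matrix(1:9, nrow = 3)   # filled column by column

d <- diag(m)                 # 1 5 9

upper <- m
upper[lower.tri(upper)] <- NA   # blank out entries below the diagonal

lower <- m
lower[upper.tri(lower)] <- NA   # blank out entries above the diagonal

upper
lower
```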
- Creating a list
A list is a vector whose elements can be objects of different types.
> x<-list(1,"A",FALSE,5+6i)
> x
[[1]]
[1] 1
[[2]]
[1] "A"
[[3]]
[1] FALSE
[[4]]
[1] 5+6i
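# Elements can also be accessed with single brackets, which return a one-element list (use x[[1]] to get the element itself)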
> x[1]
[[1]]
[1] 1
> x[2]
[[1]]
[1] "A"
> x[3]
[[1]]
[1] FALSE
> x[4]
[[1]]
[1] 5+6i
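A sketch of the difference between single and double brackets on the same list. The named list at the end is a hypothetical addition to show access with $ and by name.

```r
x <- list(1, "A", FALSE, 5+6i)

sub  <- x[2]     # a list of length 1 that contains "A"
elem <- x[[2]]   # the element itself: "A"
class(sub)       # "list"
class(elem)      # "character"

# Hypothetical named list, for illustration only
named <- list(id = 1, label = "A")
named$label      # "A"
named[["id"]]    # 1
```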
- Creating an array
array(data = NA, dim = length(data), dimnames = NULL)
> x<-array(2:6,c(2,4)) # a 2-row, 4-column array filled column by column with 2:6, recycled as needed
> x
[,1] [,2] [,3] [,4]
[1,] 2 4 6 3
[2,] 3 5 2 4
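Arrays are not limited to two dimensions. A minimal sketch with a made-up 2 x 3 x 4 array, indexed as [row, column, slice]:

```r
a <- array(1:24, dim = c(2, 3, 4))

dim(a)              # 2 3 4
first <- a[, , 1]   # the first 2 x 3 slice, itself a matrix
a[1, 1, 2]          # 7: the first element of the second slice
a[2, 3, 4]          # 24: the last element
```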
- Creating a data frame
data.frame():
data.frame(..., row.names = NULL, check.rows = FALSE,
check.names = TRUE, fix.empty.names = TRUE,
stringsAsFactors = default.stringsAsFactors())
For example:
> student<-data.frame(ID=c(11,12,13),Name=c("Devin","Edward","Wenli"),Gender=c("M","M","F"),Birthdate=c("1984-12-29","1983-5-6","1986-8-8"))
> student
ID Name Gender Birthdate
1 11 Devin M 1984-12-29
2 12 Edward M 1983-5-6
3 13 Wenli F 1986-8-8
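Combining vectors with cbind() and rbind() (on plain vectors the result is a matrix; wrap it in as.data.frame() to get a data frame)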
> x<-1:3
> x
[1] 1 2 3
> y<-10:12
> y
[1] 10 11 12
> cbind(x,y)
x y
[1,] 1 10
[2,] 2 11
[3,] 3 12
> rbind(x,y)
[,1] [,2] [,3]
x 1 2 3
y 10 11 12
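Since cbind() on plain vectors yields a matrix, one more step is needed to get a data frame. A sketch, reusing x and y from above; the extra row appended at the end is made up for illustration.

```r
x <- 1:3
y <- 10:12

m  <- cbind(x, y)         # a matrix with columns x and y
df <- as.data.frame(m)    # now a data frame
is.matrix(m)              # TRUE
is.data.frame(df)         # TRUE

# rbind() on data frames appends rows; the column names must match
df2 <- rbind(df, data.frame(x = 4L, y = 13L))
df2
```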
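View the first six rows
> head(x)
Drop a row
> x[-1,]
Drop a column
> x[,-1]
View or set row names (the name vector must have one entry per row)
rownames(x)
rownames(x) <- c('a','b','c','d','e')
View or set column names (the name vector must have one entry per column)
colnames(x)
colnames(x) <- c('a','b','c','d','e')
If every column of a data frame is numeric, compute the row sums
rowSums(x)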
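A sketch that runs these operations on a small made-up data frame. Note that dropping columns until only one is left returns a plain vector unless drop = FALSE is given.

```r
# Hypothetical data frame, for illustration only
df <- data.frame(a = 1:3, b = 4:6)
rownames(df) <- c("r1", "r2", "r3")

no_first_row <- df[-1, ]                # rows r2 and r3
one_col_vec  <- df[, -1]                # plain integer vector 4 5 6
one_col_df   <- df[, -1, drop = FALSE]  # still a data frame
sums         <- rowSums(df)             # r1 = 5, r2 = 7, r3 = 9
```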
- Reading data
The most commonly used functions are read.table() and read.csv()
read.table(file, header = FALSE, sep = "", quote = "\"'",
dec = ".", numerals = c("allow.loss", "warn.loss", "no.loss"),
row.names, col.names, as.is = !stringsAsFactors,
na.strings = "NA", colClasses = NA, nrows = -1,
skip = 0, check.names = TRUE, fill = !blank.lines.skip,
strip.white = FALSE, blank.lines.skip = TRUE,
comment.char = "#",
allowEscapes = FALSE, flush = FALSE,
stringsAsFactors = default.stringsAsFactors(),
fileEncoding = "", encoding = "unknown", text, skipNul = FALSE)
read.csv(file, header = TRUE, sep = ",", quote = "\"",
dec = ".", fill = TRUE, comment.char = "", ...)
read.csv(file.choose()) # pick the file interactively in a file-chooser dialog
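To try read.csv() without a file on disk, the text argument (passed through to read.table()) accepts the CSV content as a string. The data below are made up for illustration.

```r
# Inline CSV content, hypothetical
csv_text <- "id,name,score\n1,Ann,90\n2,Bob,NA\n"

df <- read.csv(text = csv_text, stringsAsFactors = FALSE)
str(df)    # id: int, name: chr, score: int with one NA
```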
- Writing data
write.table(x, file = "", append = FALSE, quote = TRUE, sep = " ",
eol = "\n", na = "NA", dec = ".", row.names = TRUE,
col.names = TRUE, qmethod = c("escape", "double"),
fileEncoding = "")
write.csv() and write.csv2() are used in much the same way as write.table()
write.csv(data, file = "data.csv")
To omit the row names, pass row.names = FALSE
(row.names = F also works, but F can be reassigned, so FALSE is safer)
write.csv(data, file = "data.csv", row.names = FALSE)
If the data contain NA values that should be written as empty fields, also add na = ""
write.csv(data, file = "data.csv", row.names = FALSE, na = "")
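To omit the column names, use write.table() instead (write.csv() ignores attempts to change col.names, with a warning)
and pass col.names = FALSE, sep = ","
write.table(data, file = "data.csv", row.names = FALSE, na = "", col.names = FALSE, sep = ",")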
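A round-trip sketch that ties the write options together. The data frame and the temporary file path are made up for illustration; tempfile() is used so nothing is left behind in the working directory.

```r
# Hypothetical data with one missing value
data <- data.frame(id = 1:3, value = c(1.5, NA, 3))
path <- tempfile(fileext = ".csv")

write.csv(data, file = path, row.names = FALSE, na = "")
lines_written <- readLines(path)   # the NA is written as an empty field
lines_written

back <- read.csv(path)             # the empty numeric field is read back as NA
unlink(path)                       # remove the temporary file
```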