Getting Started with R: Data Types

77和11

Published 2024-10-09 09:25:26


Tags: R language

Article link: https://blog.csdn.net/2403_87131305/article/details/142777208


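Matrices

The matrix() function creates a matrix.

For example, y <- matrix(1:20, nrow = 5, ncol = 4) creates a matrix holding the values 1 to 20 in five rows and four columns, filled by column by default. (The two fill orders are compared further below.)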

> y
     [,1] [,2] [,3] [,4]
[1,]    1    6   11   16
[2,]    2    7   12   17
[3,]    3    8   13   18
[4,]    4    9   14   19
[5,]    5   10   15   20

cell<- c(1,3,5,7)

row.names<- c("R1","R2")

col.names<- c("c1", "c2")

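mymatrix <- matrix(cell, nrow = 2, ncol = 2, byrow = TRUE,
                   dimnames = list(row.names, col.names))

(The dimnames argument takes a list() holding the row names and the column names. byrow = TRUE fills the matrix by row; byrow = FALSE, the default, fills it by column. Note that TRUE and FALSE must be written in uppercase.)

> mymatrix
   c1 c2
R1  1  3
R2  5  7

Above, the values 1, 3, 5, 7 are filled in by row.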

mymatrix <- matrix(cell, nrow = 2, ncol = 2, byrow = FALSE,
                   dimnames = list(row.names, col.names))

> mymatrix
   c1 c2
R1  1  5
R2  3  7

Here the same values are filled in by column.
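To see how the two fill orders relate, here is a minimal sketch. The names by_row and by_col are only illustrative and do not appear in the original post.

```r
# Same data, two fill orders (by_row and by_col are illustrative names)
cell <- c(1, 3, 5, 7)
by_row <- matrix(cell, nrow = 2, ncol = 2, byrow = TRUE)
by_col <- matrix(cell, nrow = 2, ncol = 2)  # byrow defaults to FALSE

# For this square case, filling by row gives the transpose of filling by column
identical(by_row, t(by_col))  # TRUE

# dim() reports the number of rows and columns
dim(by_row)
```

In practice, byrow = TRUE is handy when the data is typed in row by row, as you would read a table.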

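Selecting matrix elements

> y
     [,1] [,2] [,3] [,4]
[1,]    1    6   11   16
[2,]    2    7   12   17
[3,]    3    8   13   18
[4,]    4    9   14   19
[5,]    5   10   15   20

Using the matrix y created earlier, y[i, j] selects the value in row i and column j. For example, y[2,3] selects row 2, column 3, which should be 12.

> y[2,3]
[1] 12

You can also select several values at once. For example, y[2,] selects every value in row 2:

> y[2,]
[1]  2  7 12 17

Likewise, y[,2] selects every value in column 2:

> y[,2]
[1]  6  7  8  9 10

y[1,c(2,3)] selects the second and third values of row 1:

> y[1,c(2,3)]
[1]  6 11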
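A few more indexing forms work on the same matrix. The sketch below goes beyond the original post and uses only base R; the variable names are illustrative.

```r
y <- matrix(1:20, nrow = 5, ncol = 4)

# Selecting a single row normally returns a plain vector
row2 <- y[2, ]
is.matrix(row2)            # FALSE

# drop = FALSE keeps the result as a 1 x 4 matrix
row2_m <- y[2, , drop = FALSE]
dim(row2_m)                # 1 4

# Negative indices exclude rows or columns
y_without_col1 <- y[, -1]
dim(y_without_col1)        # 5 3

# A logical condition selects the matching values
big <- y[y > 15]
big                        # 16 17 18 19 20
```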

Arrays

Use array() to create an array.

dim1<- c("a1", "a2")

dim2<- c("b1", "b2", "b3")

dim3<- c("c1", "c2", "c3", "c4")

z<- array(1:24, c(2,3,4), dimnames = list(dim1,dim2,dim3))

> z
, , c1

   b1 b2 b3
a1  1  3  5
a2  2  4  6

, , c2

   b1 b2 b3
a1  7  9 11
a2  8 10 12

, , c3

   b1 b2 b3
a1 13 15 17
a2 14 16 18

, , c4

   b1 b2 b3
a1 19 21 23
a2 20 22 24

Selecting values from an array works much like it does for a matrix. For example, z[1,2,2] selects row 1, column 2 of the second layer (c2):

> z[1,2,2]
[1] 9
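Because z was created with dimnames, its elements can also be selected by name. This is a small sketch that rebuilds the same array; slice_c2 is an illustrative name.

```r
dim1 <- c("a1", "a2")
dim2 <- c("b1", "b2", "b3")
dim3 <- c("c1", "c2", "c3", "c4")
z <- array(1:24, c(2, 3, 4), dimnames = list(dim1, dim2, dim3))

dim(z)                       # 2 3 4

# Index by name instead of by position
z["a1", "b2", "c2"]          # 9

# Leave an index empty to take everything along that dimension
slice_c2 <- z[, , "c2"]      # a 2 x 3 matrix
slice_c2
```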

Data frames

patientID<- c(1, 2, 3, 4)

age<- c(25, 34, 28, 52)

diabetes<- c("type1", "type2", "type1", "type2")

status<- c("poor", "improved", "excellent", "poor")

patientdata<- data.frame(patientID, age, diabetes, status)

patientdata

  patientID age diabetes    status
1         1  25    type1      poor
2         2  34    type2  improved
3         3  28    type1 excellent
4         4  52    type2      poor

Values in a data frame can be selected much as before: [1,2] selects the value in row 1, column 2, and [1:2] selects columns 1 and 2.

> patientdata[1,2]
[1] 25

> patientdata[1:2]
  patientID age
1         1  25
2         2  34
3         3  28
4         4  52

Selecting specific columns by name

> patientdata[c("diabetes", "status")]
  diabetes    status
1    type1      poor
2    type2  improved
3    type1 excellent
4    type2      poor

Or use the $ operator:

> patientdata$age
[1] 25 34 28 52
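The difference between these selection forms is easiest to see by checking what each one returns. The sketch below rebuilds the same data frame; the row filter and the name older are additions for illustration.

```r
patientdata <- data.frame(
  patientID = c(1, 2, 3, 4),
  age       = c(25, 34, 28, 52),
  diabetes  = c("type1", "type2", "type1", "type2"),
  status    = c("poor", "improved", "excellent", "poor")
)

# Single brackets with one index return a data frame
class(patientdata["age"])      # "data.frame"

# Double brackets (like $) return the column itself
class(patientdata[["age"]])    # "numeric"

# Rows can be filtered with a logical condition
older <- patientdata[patientdata$age > 30, ]
older$patientID                # 2 4

# str() gives a compact overview of the column types
str(patientdata)
```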

Use table() to build a contingency table from selected columns:

> table(patientdata$diabetes, patientdata$status)

        excellent improved poor
  type1         1        0    1
  type2         0        1    1

> table(patientdata$age, patientdata$diabetes)

     type1 type2
  25     1     0
  28     1     0
  34     0     1
  52     0     1
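A contingency table is an ordinary R object that can be processed further. As a hedged sketch (addmargins() and prop.table() are base R functions not used in the original post, and tab is an illustrative name):

```r
diabetes <- c("type1", "type2", "type1", "type2")
status   <- c("poor", "improved", "excellent", "poor")

tab <- table(diabetes, status)

# Add row and column totals
addmargins(tab)

# Convert counts to proportions of the grand total
prop.table(tab)

# Cells can be read by label
tab["type1", "poor"]   # 1
```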

summary(mtcars$mpg)

plot(mtcars$mpg,mtcars$disp)

plot(mtcars$mpg,mtcars$wt)

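The code above works on the mtcars dataset; it can be simplified with with():

with(mtcars, {
  print(summary(mpg))
  plot(mpg, disp)
  plot(mpg, wt)
})

The statements inside {} are evaluated with mtcars as the data context, so its columns can be referenced by name without the mtcars$ prefix. Only the value of the last expression is returned, so intermediate results need print() to be shown.

Assignments made inside with() are local to that call and disappear when it returns. To create an object that persists outside with(), use the special assignment operator <<-.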

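with(mtcars, {
  stats <- summary(mpg)    # creates a local variable
  stats1 <<- summary(mpg)  # creates a global variable
})

> stats
Error: object 'stats' not found

> stats1
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  10.40   15.43   19.20   20.09   22.80   33.90

(Note that statements inside the braces of with() are separated by ; or by newlines, not by commas; otherwise R reports an error.)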
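An alternative to <<- is to rely on the value that with() returns. This sketch is not from the original post, and the names mpg_stats and local_stats are illustrative.

```r
# with() returns the value of its last expression,
# so an ordinary assignment outside the call is enough
mpg_stats <- with(mtcars, summary(mpg))
mpg_stats

# Assignments made inside the braces stay local to the call
with(mtcars, {
  local_stats <- summary(mpg)
})
exists("local_stats")   # FALSE, assuming no such object existed beforehand
```

Capturing the return value keeps the assignment visible at the place where the object is used, which is often easier to follow than a global assignment made inside a block.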

Factors

# Create a character vector
diabetes <- c("type1", "type2", "type1", "type2")
# Convert the character vector to a factor
diabetes <- factor(diabetes)
# Print the factor
print(diabetes)
# Convert the factor to its numeric codes
diabetes_numeric <- as.numeric(diabetes)
# Print the numeric representation
print(diabetes_numeric)

> print(diabetes_numeric)
[1] 1 2 1 2

Here 1 = type1 and 2 = type2.
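The numeric codes are positions in the factor's level set, which levels() shows directly. The sketch below also illustrates a common pitfall with number-like labels; the factor f is a made-up example.

```r
diabetes <- factor(c("type1", "type2", "type1", "type2"))

levels(diabetes)        # "type1" "type2"
as.integer(diabetes)    # 1 2 1 2
table(diabetes)

# Pitfall: for a factor whose labels look like numbers,
# as.numeric() returns the codes, not the labels
f <- factor(c("10", "20", "10"))
as.numeric(f)                  # 1 2 1
as.numeric(as.character(f))    # 10 20 10
```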

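Next, convert status to an ordered factor:

status <- factor(status, ordered = TRUE)
status
status.numeric <- as.numeric(status)
status.numeric

> status
[1] poor      improved  excellent poor
Levels: excellent < improved < poor

> status.numeric
[1] 3 2 1 3

By default the levels are sorted alphabetically, which is often not the order we want (here it ranks poor above excellent). To define the order ourselves, pass the levels argument to factor(). (Make sure levels lists every value that actually occurs in the data; any value that is not listed is turned into a missing value, NA.)

> status <- factor(status, ordered = TRUE, levels = c("poor", "improved", "excellent"))
> status.numeric <- as.numeric(status)
> status.numeric
[1] 1 2 3 1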
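Once the order is defined, the factor supports comparisons, and the NA behaviour mentioned above can be checked directly. This is a minimal sketch; status_f and partial are illustrative names.

```r
status <- c("poor", "improved", "excellent", "poor")
status_f <- factor(status, ordered = TRUE,
                   levels = c("poor", "improved", "excellent"))

# Ordered factors support comparisons against a level
status_f < "excellent"      # TRUE TRUE FALSE TRUE
max(status_f)               # excellent

# A value that is missing from levels becomes NA
partial <- factor(status, levels = c("poor", "improved"))
is.na(partial)              # FALSE FALSE TRUE FALSE
```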

Lists

A list can combine all of the data types covered so far in a single object.

t<- c("my first list")

u<- c(12, 23, 32, 78)

w<- c("one", "two", "three")

x<- matrix(1:10, nrow = 5, ncol = 2)

mylist<- list(title=t, age=u, w, x, z)

mylist

$title
[1] "my first list"

$age
[1] 12 23 32 78

[[3]]
[1] "one"   "two"   "three"

[[4]]
     [,1] [,2]
[1,]    1    6
[2,]    2    7
[3,]    3    8
[4,]    4    9
[5,]    5   10

[[5]]
, , c1

   b1 b2 b3
a1  1  3  5
a2  2  4  6

, , c2

   b1 b2 b3
a1  7  9 11
a2  8 10 12

, , c3

   b1 b2 b3
a1 13 15 17
a2 14 16 18

, , c4

   b1 b2 b3
a1 19 21 23
a2 20 22 24

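Retrieving data from a list works a little differently. A list is not indexed by coordinates such as [1,2,3]; instead you retrieve one component at a time. For example, mylist[[2]] retrieves the second component:

> mylist[[2]]

[1] 12 23 32 78

A component can also be retrieved by the name it was given, as in mylist["age"]. Note that single brackets return a one-element list, whereas mylist[["age"]] returns the vector itself.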

> mylist["age"]

$age

[1] 12 23 32 78
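The three access forms can be compared by checking what each returns. The sketch below uses a shortened version of the list above, without the matrix and the array.

```r
mylist <- list(title = "my first list",
               age = c(12, 23, 32, 78),
               c("one", "two", "three"))

# Single brackets return a (shorter) list
class(mylist["age"])       # "list"

# Double brackets and $ return the component itself
class(mylist[["age"]])     # "numeric"
mylist$age[2]              # 23

# Elements inside a component are reached by chaining indexes
mylist[[3]][1]             # "one"

# Unnamed components have an empty name
names(mylist)              # "title" "age" ""
```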
