Comparing R and MATLAB for data analysis: R data analysis

卓钥

Published 2021-03-19 06:05:02

Contents:

Running R from the Windows command line

Data frames

Basic statistical functions

Common functions and variables

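1. Running R from the Windows command line

Prerequisite: R's command directory has already been added to the system PATH.

On Windows, an R script can be run from the command line in two ways:

(1) RCMD BATCH xxx.r

This can also be written as "r cmd BATCH", "rcmd BATCH", or "R CMD BATCH". On Windows these are all the same command, so use whichever you prefer.

With this method the output is not shown on the command line. Instead, a text file named xxx.r.Rout is created automatically in the same path as the .r file, and the output goes into that file.

However, arguments appended after the script name cannot be retrieved with commandArgs() when using this method. R CMD BATCH treats the first of them as the name of the output file, so a text file with that name is created in place of xxx.r.Rout. (To pass arguments with this method, they have to be given as a quoted option before the script name, for example RCMD BATCH "--args 4 5" xxx.r.)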

举个例子，有以下test.r程序：

args = commandArgs(trailingOnly=TRUE)
print(args[2])
print('do a test')

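For example, entering "RCMD BATCH test.r 4 5" on the command line creates a text file named 4. In that file, the program prints NA for the second argument, although the expected value is 5.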

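(2) Rscript xxx.r

With this method the output is shown directly on the command line, and no additional output file is created.

The arguments passed on the command line can be retrieved with commandArgs().

The index at which they start depends on the function's trailingOnly parameter. With trailingOnly=TRUE, the arguments start at index 1.

With trailingOnly=FALSE, the arguments start at index 6 in this example, because the vector begins with the interpreter path and R's own options:

args[1] = "C:\\Program Files\\R\\R-3.4.4\\bin\\x64\\Rterm.exe"
args[2] = "--slave"
args[3] = "--no-restore"
args[4] = "--file=test.r"
args[5] = "--args"
args[6] = "4"
args[7] = "do a test"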
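
As a rough illustration of what trailingOnly=TRUE does, the sketch below reproduces the behaviour by hand on a vector shaped like the listing above. The helper name trailing_args and the sample vector are made up for this example. In a real script you would simply call commandArgs(trailingOnly=TRUE).

```r
# Hypothetical stand-in for the full vector returned by commandArgs()
full_args <- c("Rterm.exe", "--slave", "--no-restore",
               "--file=test.r", "--args", "4", "5")

# Keep only what follows "--args" (an illustrative re-implementation,
# not R's actual source code)
trailing_args <- function(a) {
  i <- match("--args", a)
  if (is.na(i)) character(0) else a[-seq_len(i)]
}

user_args <- trailing_args(full_args)

# Command-line arguments always arrive as character strings,
# so convert them explicitly when a number is needed
second_value <- as.numeric(user_args[2])
```

Because the arguments are strings, a conversion such as as.numeric() is usually needed before doing arithmetic with them.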

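2. Data frames

Creating an empty data frame

# create a data frame with 0 rows and 0 columns

df_empty = data.frame()

# create a data frame with 0 columns and as many rows as df (df is the 4-row data frame created in the next subsection)
> df_r = df[, FALSE]
> df_r
data frame with 0 columns and 4 rows

# create a data frame with 0 rows and the same number of columns and column names as df
> df_c = df[FALSE, ]
> df_c
[1] one   two   three
<0 rows> (or 0-length row.names)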
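
When no existing data frame is available as a template, an empty data frame with known column types can also be declared directly. The following is a minimal sketch. The column names one and label are only illustrative.

```r
# Empty data frame with predefined column names and types
# (zero-length vectors fix the type of each column)
df_typed <- data.frame(one = numeric(0), label = character(0),
                       stringsAsFactors = FALSE)
empty_dims <- dim(df_typed)   # 0 rows, 2 columns

# Rows can then be appended with rbind()
df_typed <- rbind(df_typed,
                  data.frame(one = 1, label = "a", stringsAsFactors = FALSE))
```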

Creating a non-empty data frame

# with column names
> df = data.frame(one=c(1,2,3,4),two=c(4,5,6,0),three=c(32,21,34,32))
> df
  one two three
1   1   4    32
2   2   5    21
3   3   6    34
4   4   0    32

# without column names
> df = data.frame(c(1,2,3,4),c(4,5,6,0),c(32,21,34,32))
> df
  c.1..2..3..4. c.4..5..6..0. c.32..21..34..32.
1             1             4                32
2             2             5                21
3             3             6                34
4             4             0                32

# shorter columns are recycled to match the longest column
> data.frame(one_t=c(5,2),two=c(12),three=c(9))
  one_t two three
1     5  12     9
2     2  12     9

# with row names and column names
> data.frame(one=c(1,2,3,4),two=c(4,5,6,0),three=c(32,21,34,32),row.names = c('a','b','c','d'))
  one two three
a   1   4    32
b   2   5    21
c   3   6    34
d   4   0    32

Accessing elements

# A single index without a comma selects columns. Add a comma to state explicitly whether rows or columns are meant.

## access by index

# select columns
> df[1:2]
  one two
1   1   4
2   2   5
3   3   6
4   4   0

> df[,1:2]
  one two
1   1   4
2   2   5
3   3   6
4   4   0

# select rows
> df[c(1,3),]
  one two three
1   1   4    32
3   3   6    34

> df[1:2,]
  one two three
1   1   4    32
2   2   5    21

# negative indices exclude rows
> df[-c(1,3),]
  one two three
2   2   5    21
4   4   0    32

## access by column name and row name

# column name
> df['one']
  one
1   1
2   2
3   3
4   4

> df[,'one']
[1] 1 2 3 4

> df['one',]
   one two three
NA  NA  NA    NA

# row name
> df['1',]
  one two three
1   1   4    32

> df['1']
Error in `[.data.frame`(df, "1") : undefined columns selected
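
The transcript above shows that df['one'] returns a data frame while df[,'one'] returns a plain vector. The sketch below, which rebuilds the same df so that it runs on its own, shows how drop = FALSE keeps the data frame form, along with two other common ways to extract a column.

```r
df <- data.frame(one = c(1, 2, 3, 4), two = c(4, 5, 6, 0),
                 three = c(32, 21, 34, 32))

as_vector <- df[, "one"]                 # plain numeric vector
as_frame  <- df[, "one", drop = FALSE]   # still a one-column data frame

# [[ ]] and $ both extract a single column as a vector
same_col <- identical(df[["one"]], df$one)

# A single cell: row 2, column "three"
single <- df[2, "three"]
```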

Filtering data

# select rows by condition
> df[which(df$one>2),]
  one two three
3   3   6    34
4   4   0    32

# negate the selection
> df[-which(df$one>2),]
  one two three
1   1   4    32
2   2   5    21

# logical operators are supported: & (and), | (or)
> df[which(df$one>1 & df$two>0),]
  one two three
2   2   5    21
3   3   6    34
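
One caveat with the -which(...) pattern: when no row matches the condition, which() returns an empty vector, and a negated empty index selects nothing instead of everything. The sketch below rebuilds the same df and contrasts this with logical negation. subset() is shown as an alternative from base R.

```r
df <- data.frame(one = c(1, 2, 3, 4), two = c(4, 5, 6, 0),
                 three = c(32, 21, 34, 32))

none_match <- df$one > 10                # FALSE for every row

with_which <- df[-which(none_match), ]   # -integer(0) selects nothing: 0 rows
with_not   <- df[!none_match, ]          # keeps all 4 rows, as intended

# subset() evaluates the condition inside the data frame
with_subset <- subset(df, one > 1 & two > 0)
```

Negating the logical vector with ! is therefore the safer way to invert a filter.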

Checking whether an object is a data frame

> is.data.frame(df)

[1] TRUE

Changing row names and column names

> names(df)
[1] "one"   "two"   "three"

> names(df)[1]='one_m'
> names(df)
[1] "one_m" "two"   "three"

> colnames(df)
[1] "one_m" "two"   "three"

> colnames(df)[1]='one_t'
> colnames(df)
[1] "one_t" "two"   "three"

> rownames(df)
[1] "1" "2" "3" "4"

> rownames(df)[1]='9'
> rownames(df)
[1] "9" "2" "3" "4"
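
The examples above rename by position. When the position of a column may change, renaming by the current name is often more robust. The following is a minimal sketch on a freshly built df.

```r
df <- data.frame(one = c(1, 2, 3, 4), two = c(4, 5, 6, 0),
                 three = c(32, 21, 34, 32))

# Rename the column currently called "one", wherever it is
names(df)[names(df) == "one"] <- "one_m"

# The same idea works for row names
rownames(df)[rownames(df) == "1"] <- "9"
```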

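cbind: column binding

In the comments below, df_n stands for the data frame that is bound to df. The examples in this subsection use df with its original column names (one, two, three) and row names (1 to 4).

# when df_n has the same number of rows as df
> df3 = data.frame(one=c(9,8,7,6))
> df3
  one
1   9
2   8
3   7
4   6

> cbind(df,df3)
  one two three one
1   1   4    32   9
2   2   5    21   8
3   3   6    34   7
4   4   0    32   6

# when df_n has fewer rows than df, but the row count of df is an integer multiple of that of df_n
> df2 = data.frame(one=c(5),two=c(12),three=c(9))
> df2
  one two three
1   5  12     9

> cbind(df,df2)
  one two three one two three
1   1   4    32   5  12     9
2   2   5    21   5  12     9
3   3   6    34   5  12     9
4   4   0    32   5  12     9

# this works even in a case like the following
> cbind(df,data.frame(one=c(5,2),two=c(12),three=c(9)))
  one two three one two three
1   1   4    32   5  12     9
2   2   5    21   2  12     9
3   3   6    34   5  12     9
4   4   0    32   2  12     9

When df_n has more rows than df (here 5 versus 4), the call fails:

> df4=data.frame(one=c(9,8,7,6,12))
> cbind(df,df4)
Error in data.frame(..., check.names = FALSE) :
  arguments imply differing number of rows: 4, 5

When df_n has fewer rows than df but the row count of df is not an integer multiple of that of df_n, the call also fails:

> cbind(df,data.frame(one=c(5,2,3),two=c(12),three=c(9)))
Error in data.frame(..., check.names = FALSE) :
  arguments imply differing number of rows: 4, 3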
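
In a script, an incompatible row count would stop execution. The sketch below wraps cbind() in tryCatch() so that the failure can be handled. The helper name try_cbind and the column name x are invented for this example.

```r
df <- data.frame(one = c(1, 2, 3, 4), two = c(4, 5, 6, 0),
                 three = c(32, 21, 34, 32))

# Hypothetical helper: return the combined data frame, or NULL on error
try_cbind <- function(a, b) {
  tryCatch(cbind(a, b), error = function(e) NULL)
}

ok  <- try_cbind(df, data.frame(x = c(5, 2)))     # 2 divides 4: recycled
bad <- try_cbind(df, data.frame(x = c(5, 2, 3)))  # 3 does not divide 4: NULL
```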

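rbind: row binding

Here df has the renamed first column one_t and the renamed first row 9.

> rbind(df, data.frame(one_t=c(5),two=c(12),three=c(9)))
  one_t two three
9     1   4    32
2     2   5    21
3     3   6    34
4     4   0    32
1     5  12     9

The call fails when the column names of the data frames differ or when the number of columns differs:

> rbind(df, data.frame(one=c(5,2,1,2),two=c(12,4,6,8),three=c(9,4,2,1)))
Error in match.names(clabs, names(xi)) : names do not match previous names

> rbind(df, data.frame(one_t=c(5),two=c(12),three=c(9),four=c(4)))
Error in rbind(deparse.level, ...) : numbers of columns of arguments do not match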
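
The column names must match, but their order does not have to: rbind() on data frames matches columns by name. The following is a small sketch on a freshly built df. The name new_row is only illustrative.

```r
df <- data.frame(one = c(1, 2, 3, 4), two = c(4, 5, 6, 0),
                 three = c(32, 21, 34, 32))

# Same column names as df, but listed in a different order
new_row <- data.frame(three = 9, one = 5, two = 12)

# Columns are matched by name, and the result keeps df's column order
combined <- rbind(df, new_row)
```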

Other

# the length of a data frame is its number of columns
> length(df)
[1] 3

# number of columns
> ncol(df)
[1] 3

# number of rows
> nrow(df)
[1] 4
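
Both dimensions can also be read in one call with dim(), which returns the row count followed by the column count. A short sketch on a freshly built df:

```r
df <- data.frame(one = c(1, 2, 3, 4), two = c(4, 5, 6, 0),
                 three = c(32, 21, 34, 32))

dims <- dim(df)   # c(number of rows, number of columns)
```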

3. Basic statistical functions

> sum(c(1,2,3))

[1] 6

> mean(c(1,2,3))

[1] 2

> var(c(1,2,3))

[1] 1

> sort(c(2,1,3))

[1] 1 2 3
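
Note that var() computes the sample variance, dividing by n - 1 rather than n. The sketch below recomputes it by hand to make that explicit, and also shows sd() and a descending sort.

```r
x <- c(1, 2, 3)
n <- length(x)

# Sample variance: squared deviations from the mean, divided by n - 1
manual_var <- sum((x - mean(x))^2) / (n - 1)

std_dev <- sd(x)                                    # square root of var(x)
sorted_desc <- sort(c(2, 1, 3), decreasing = TRUE)  # largest value first
```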

4. Common functions and variables

# inspect data structure and type
> mode(df)
[1] "list"

> class(df)
[1] "data.frame"

> str(df)
'data.frame': 4 obs. of  3 variables:
 $ one_t: num  1 2 3 4
 $ two  : num  4 5 6 0
 $ three: num  32 21 34 32

> typeof(12)

[1] "double"
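
A data frame is a list of columns, which is why mode(df) reports "list". The sketch below checks the class of every column with sapply() and shows how the L suffix changes the type of a numeric literal. The small data frame is invented for this example.

```r
df <- data.frame(one = c(1, 2), label = c("a", "b"),
                 stringsAsFactors = FALSE)

# Class of each column, returned as a named character vector
col_classes <- sapply(df, class)

type_double  <- typeof(12)    # numeric literals are double by default
type_integer <- typeof(12L)   # the L suffix makes an integer
```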

# upper-case and lower-case letters
> LETTERS[1:3]

[1] "A" "B" "C"

> letters[1:3]

[1] "a" "b" "c"

# sampling with replacement (the result is random, so your output will differ)
> sample(c(1,2,3,4), 10, replace =TRUE)

[1] 2 3 1 2 3 3 4 3 3 4
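
Because sample() is random, repeated runs give different results. Fixing the seed with set.seed() makes a draw reproducible, as this sketch shows. The seed value 42 is arbitrary.

```r
set.seed(42)
first <- sample(c(1, 2, 3, 4), 10, replace = TRUE)

# Resetting the same seed reproduces exactly the same draw
set.seed(42)
second <- sample(c(1, 2, 3, 4), 10, replace = TRUE)
```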

# test for missing values; returns a logical vector of the same length as the input (NaN also counts as missing)
> is.na(c(1,2,3,NaN))

[1] FALSE FALSE FALSE TRUE
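
is.na() is TRUE for both NA and NaN, while is.nan() is TRUE only for NaN. The sketch below contrasts the two and shows the na.rm argument that many summary functions accept.

```r
x <- c(1, NA, NaN)

na_flags  <- is.na(x)    # TRUE for NA and for NaN
nan_flags <- is.nan(x)   # TRUE only for NaN

# na.rm = TRUE drops missing values before the mean is computed
clean_mean <- mean(x, na.rm = TRUE)
```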

# generating regular sequences

> 1:5
[1] 1 2 3 4 5

> 2*1:5
[1]  2  4  6  8 10

> seq(1,5)
[1] 1 2 3 4 5

# set the step between elements

> seq(1,5,0.5)
[1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0

# fix the length of the sequence

> seq(1,5,length.out = 4)
[1] 1.000000 2.333333 3.666667 5.000000

# repeat the whole sequence

> rep(c(1,2,3), times=5)
 [1] 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3

# repeat each element of the sequence

> rep(c(1,2,3), each=5)
 [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
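
Two details about the colon operator are worth keeping in mind. It binds more tightly than multiplication, so 2*1:5 means 2*(1:5). It also counts downwards when the end is smaller than the start, which matters in loops over possibly empty ranges. A short sketch:

```r
a <- 2 * 1:5     # the colon is evaluated first: 2 4 6 8 10
b <- (2 * 1):5   # parentheses change the meaning: 2 3 4 5

n <- 0
risky <- 1:n          # counts down: 1 0 (two elements)
safe  <- seq_len(n)   # empty vector, so a loop over it runs zero times
```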
