Day_1 Part_4 Structures of R

1. Vector/Matrix/Array

1.1. What are they

  • Collection of observations
    – Vector – 1 dimensional
    – Matrix – 2 dimensional
    – Array – n dimensional (typically 3 or more)
  • Class in vector/matrix/array
    – Only one class per object
    – Combined – class determined: factor/logical < integer < numeric < character (when a single vector/matrix/array receives data of several classes, the resulting class follows this order; for example, character + numeric ends up as character)
    example:
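A minimal sketch of the coercion order, using c() to mix classes. The variable names are made up for illustration; only logical, integer, numeric and character values are shown.

```r
# Mixing classes in c() coerces every element to the "highest" class present.
v_lgl_int <- c(TRUE, 2L)     # logical + integer   -> integer
v_int_num <- c(2L, 3.5)      # integer + numeric   -> numeric
v_num_chr <- c(3.5, "a")     # numeric + character -> character

class(v_lgl_int)   # "integer"
class(v_int_num)   # "numeric"
class(v_num_chr)   # "character"
v_num_chr          # the number is now the string "3.5"
```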

1.2. Vector

1.2.1 Generate Vector

> vec1 <- c(1, 2, 3)
> vec2 <- 1:11

> vec3 <- rep(x = 4, 7)
> vec3
[1] 4 4 4 4 4 4 4

> vec4 <- seq(from = 1, to = 12, by = 1.333)
> vec4
[1]  1.000  2.333  3.666  4.999  6.332  7.665  8.998 10.331 11.664

> vec5 <- seq(1, 12, length.out = 10) # 10 equally spaced values from 1 to 12
> vec5     # i.e. 1, 1+(12-1)/(10-1), 1+2*[(12-1)/(10-1)], ..., 1+(10-1)*[(12-1)/(10-1)]
[1]  1.000000  2.222222  3.444444  4.666667  5.888889  7.111111  8.333333 
[8]  9.555556 10.777778 12.000000

1.2.2 Index Vector

  • Indexing by []
  • A minus sign (-) in an R index removes the corresponding element(s)
  • Unlike many other languages, R indexes from 1

Example

> vec4
[1]  1.000  2.333  3.666  4.999  6.332  7.665  8.998 10.331 11.664

vec4[3]
## [1] 3.666
vec4[1:3]
## [1] 1.000 2.333 3.666
vec4[23:24]
## [1] NA NA

vec4[c(1, 3, 7)]
## [1] 1.000 3.666 8.998
vec4[c(1, 1, 1, 2)]
## [1] 1.000 1.000 1.000 2.333

vec4[-4]     #access everything but the 4th element
## [1] 1.000 2.333 3.666 6.332 7.665 8.998 10.331 11.664

vec4[-1:3] # this will cause an error
## Error in vec4[-1:3] : only 0's may be mixed with negative subscripts
# The cause is operator precedence: -1:3 is parsed as (-1):3.
-1:3
## [1] -1 0 1 2 3

vec4[-(1:3)]  # parentheses make the minus apply to every element of 1:3
## [1] 4.999 6.332 7.665 8.998 10.331 11.664
vec4[-c(4, 5, 7)]
## [1] 1.000 2.333 3.666 7.665 10.331 11.664

vec4[9:1] #listing the vector in reverse
## [1] 11.664 10.331 8.998 7.665 6.332 4.999 3.666 2.333 1.000

1.3. Matrix

1.3.1. Generate Matrix

matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE, dimnames = NULL)
Parameters:

| Parameter | Description |
|---|---|
| data | an optional data vector (including a list or expression vector). Non-atomic classed R objects are coerced by as.vector and all attributes discarded. |
| nrow | the desired number of rows. |
| ncol | the desired number of columns. |
| byrow | logical. If FALSE (the default) the matrix is filled by columns, otherwise the matrix is filled by rows. In other words, byrow=TRUE fills the data in horizontally and byrow=FALSE fills it in vertically. |
| dimnames | A dimnames attribute for the matrix: NULL or a list of length 2 giving the row and column names respectively. An empty list is treated as NULL, and a list of length one as row names. The list can be named, and the list names will be used as names for the dimensions. |

> MT <- matrix(c(1,2,1,2),nrow=2,ncol=2,byrow=TRUE)
> MT
     [,1] [,2]
[1,]    1    2
[2,]    1    2

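# another example: data shorter than the matrix is recycled, with a warning
> MT2 <- matrix(c(1,2,1,2),nrow=3,ncol=5,byrow=FALSE)
Warning message:
In matrix(c(1, 2, 1, 2), nrow = 3, ncol = 5, byrow = FALSE) :
  data length [4] is not a sub-multiple or multiple of the number of rows [3]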
> MT2
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    1    2    1
[2,]    2    1    2    1    2
[3,]    1    2    1    2    1
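
To make the byrow and dimnames parameters concrete, here is a small sketch that fills the same data both ways and then names the dimensions. The object names (by_col, by_row, named) and the labels are illustrative only.

```r
# Same data, filled by column (the default) and by row.
by_col <- matrix(1:6, nrow = 2, ncol = 3)
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)

by_col
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
by_row
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6

# dimnames is a list: row names first, then column names.
named <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE,
                dimnames = list(c("r1", "r2"), c("a", "b", "c")))
named["r2", "b"]   # 5
```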

1.3.2. Index Matrix

Usage: matrix[row, column]

> MT[1,1] # the element in row 1, column 1
[1] 1
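
Leaving one index empty selects a whole row or column. The sketch below rebuilds MT so it runs on its own; drop = FALSE is a standard argument of [ that keeps the result a matrix.

```r
MT <- matrix(c(1, 2, 1, 2), nrow = 2, ncol = 2, byrow = TRUE)

MT[1, ]                # whole first row, returned as a vector: 1 2
MT[, 2]                # whole second column, returned as a vector: 2 2
MT[, 2, drop = FALSE]  # same column, kept as a 2 x 1 matrix
MT[-1, ]               # everything except the first row: 1 2
```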

1.4. Array

1.4.1. Generate Array

array(data = NA, dim = length(data), dimnames = NULL)

| Parameter | Description |
|---|---|
| data | a vector (including a list or expression vector) giving data to fill the array. Non-atomic classed objects are coerced by as.vector. |
| dim | the dim attribute for the array to be created, that is an integer vector of length one or more giving the maximal indices in each dimension. |
| dimnames | either NULL or the names for the dimensions. This must be a list (or it will be ignored) with one component for each dimension, either NULL or a character vector of the length given by dim for that dimension. The list can be named, and the list names will be used as names for the dimensions. If the list is shorter than the number of dimensions, it is extended by NULLs to the length required. |

> ARR <- array(1:4, c(1,2,2)) # dim = c(1,2,2) means 1 row, 2 columns, 2 layers; the array is filled with 1:4
> ARR
, , 1

     [,1] [,2]
[1,]    1    2

, , 2

     [,1] [,2]
[1,]    3    4

1.4.2. Index Array

Array[row,column,layer]

> ARR[1,2,2]
[1] 4
> ARR[1,1,2]
[1] 3
> ARR[1,1,1]
[1] 1
> ARR[1,2,1]
[1] 2
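
The same indexing extends to larger arrays. A small sketch with a hypothetical 2 x 3 x 4 array A shows that the data fill rows first, then columns, then layers, and that an empty index returns a whole slice.

```r
A <- array(1:24, dim = c(2, 3, 4))   # 2 rows, 3 columns, 4 layers

dim(A)       # 2 3 4
A[1, 2, 3]   # 15: position 1 + (2-1)*2 + (3-1)*6 in the data
layer1 <- A[, , 1]   # the first layer as a 2 x 3 matrix holding 1:6
layer1
```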

1.5. Arithmetic for matrix and arrays

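| Operation | Function |
|---|---|
| Number of rows/columns | nrow(x) / ncol(x) |
| Total number of elements | length(x) |
| Names of rows and columns | rownames(x) / colnames(x), or dimnames(x) |
| Dimensions; for an array this returns (row, col, layer) | dim(x) |
| Transpose | t(x) |
| Matrix multiplication | x %*% y |
| Cross product | crossprod(x, y) |
| Diagonal elements | diag(x) |

NB:
(1) t(x): this article explains it clearly: http://blog.sciencenet.cn/blog-508298-551299.html
(2) x %*% y is matrix multiplication; crossprod(x, y) computes t(x) %*% y (the matrix cross product, not the vector cross product)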
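
A short sketch of these functions on two small matrices. The matrices x and y are made up for illustration.

```r
x <- matrix(1:6, nrow = 2, ncol = 3)   # 2 x 3, filled by column
y <- matrix(1:6, nrow = 3, ncol = 2)   # 3 x 2, filled by column

nrow(x)    # 2
ncol(x)    # 3
length(x)  # 6
dim(x)     # 2 3

t(x)       # transpose, 3 x 2
xy <- x %*% y   # matrix product, 2 x 2
xy
#      [,1] [,2]
# [1,]   22   49
# [2,]   28   64
diag(xy)   # 22 64

# crossprod(x, y) is t(x) %*% y, so y needs the same number of rows as x.
cp <- crossprod(x, x)   # 3 x 3
```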

2. List

2.1. Description:

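  • Collection of variables
    – Different classes/structures are possible (data of different structures can be stored in one list, and lists can even be nested inside a list; for example, one element may be a data frame, another an array, and another a plot or a list)
    – Different dimensions are possible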

2.2. Generate & Index List

Created with list()
Indexed with [[ ]]

# create
> my_character <- c("1", "1", "0", "0")
> my_logical <- TRUE
> my_list <- list(my_character, my_logical)
> my_list
[[1]]
[1] "1" "1" "0" "0"

[[2]]
[1] TRUE

# index
> my_list[[1]]
[1] "1" "1" "0" "0"
> my_list[[1]][3]
[1] "0"
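
A sketch of how [[ ]] differs from [ ], and of indexing into a nested, named list. The element names (chr, lgl, inner, m) are made up for illustration.

```r
my_list <- list(chr   = c("1", "1", "0", "0"),
                lgl   = TRUE,
                inner = list(m = matrix(1:4, nrow = 2)))

class(my_list[1])      # "list": single brackets return a sub-list
class(my_list[[1]])    # "character": double brackets return the element itself

my_list$lgl            # named elements can also be reached with $
my_list$inner$m[2, 1]  # 2: an element of a matrix inside a nested list
my_list[["inner"]][["m"]]   # the same matrix, selected by name with [[ ]]
```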

3. Dataframe

3.1. Description:

  • Variables in data frame can have different classes
  • ncol(x)/nrow(x) returns number of columns/rows
  • Columns and rows are named - colnames/rownames

3.2. Generate Dataframe

Create with:
data.frame(..., row.names = NULL, check.rows = FALSE, check.names = TRUE, fix.empty.names = TRUE, stringsAsFactors = default.stringsAsFactors())
Parameters:

| Parameter | Description |
|---|---|
| ... | these arguments are of either the form value or tag = value. Component names are created based on the tag (if present) or the deparsed argument itself. |
| row.names | NULL or a single integer or character string specifying a column to be used as row names, or a character or integer vector giving the row names for the data frame. |
| check.rows | if TRUE then the rows are checked for consistency of length and names. |
| check.names | logical. If TRUE then the names of the variables in the data frame are checked to ensure that they are syntactically valid variable names and are not duplicated. If necessary they are adjusted (by make.names) so that they are. |
| fix.empty.names | logical indicating if arguments which are "unnamed" (in the sense of not being formally called as someName = arg) get an automatically constructed name or rather name "". Needs to be set to FALSE even when check.names is false if "" names should be kept. |
| stringsAsFactors | logical: should character vectors be converted to factors? The 'factory-fresh' default has been TRUE previously but has been changed to FALSE for R 4.0.0. Only as short time workaround, you can revert by setting options(stringsAsFactors = TRUE) which now warns about its deprecation. |

Example:

> data <- data.frame(ID = rep(1:10, each = 3),
                     TIME = c(0, 6, 12),   # recycled to fill all 30 rows
                     MDV = 0)              # single value recycled to all rows
> str(data)
'data.frame': 30 obs. of 3 variables:
$ ID : int 1 1 1 2 2 2 3 3 3 4 ...
$ TIME: num 0 6 12 0 6 12 0 6 12 0 ...
$ MDV : num 0 0 0 0 0 0 0 0 0 0 ...
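
The stringsAsFactors parameter is easiest to see by setting it explicitly. This sketch uses a hypothetical SEX column that is not part of the example above.

```r
df_chr <- data.frame(ID = 1:3, SEX = c("F", "M", "F"),
                     stringsAsFactors = FALSE)
df_fct <- data.frame(ID = 1:3, SEX = c("F", "M", "F"),
                     stringsAsFactors = TRUE)

class(df_chr$SEX)    # "character"
class(df_fct$SEX)    # "factor"
levels(df_fct$SEX)   # "F" "M"
```

Passing the argument explicitly keeps the result the same across R versions, whatever the default is.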

3.3. Index Dataframe

Usage:
dataframe[row,column]
dataframe$colname[row]

Example:

> data[3,2]
[1] 12

> data$TIME[3]
[1] 12

> data$TIME[data$TIME==12] <- 12.5 # replace every 12 in data$TIME with 12.5
> data$TIME
 [1]  0.0  6.0 12.5  0.0  6.0 12.5  0.0  6.0 12.5  0.0  6.0 12.5  0.0  6.0 12.5  0.0  6.0 12.5  0.0  6.0 12.5
[22]  0.0  6.0 12.5  0.0  6.0 12.5  0.0  6.0 12.5
> data[ ,2]
 [1]  0.0  6.0 12.5  0.0  6.0 12.5  0.0  6.0 12.5  0.0  6.0 12.5  0.0  6.0 12.5  0.0  6.0 12.5  0.0  6.0 12.5
[22]  0.0  6.0 12.5  0.0  6.0 12.5  0.0  6.0 12.5

> data[3,]
  ID TIME MDV
3  1 12.5   0
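
The row position in dataframe[row, column] can also be a logical condition, which gives a simple way to filter rows. The sketch rebuilds the data frame with its original TIME values (12, not 12.5) so it runs on its own; id2 and late are illustrative names.

```r
data <- data.frame(ID = rep(1:10, each = 3), TIME = c(0, 6, 12), MDV = 0)

id2 <- data[data$ID == 2, ]                    # all rows for subject 2
id2$TIME                                       # 0 6 12

late <- data[data$TIME > 6, c("ID", "TIME")]   # rows with TIME > 6, two columns
nrow(late)                                     # 10: one row per subject

data[["TIME"]][3]                              # 12, same as data$TIME[3]
```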

3.4. View data frames

| Function | Description |
|---|---|
| View(data.set) | Open and look at data.set |
| head(data.set) | Look at the first several lines of a data.set |
| tail(data.set) | Look at the last several lines of a data.set |
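
A brief sketch of head() and tail(), using a stand-in data.set built like the earlier example. View() is left out because it opens an interactive viewer.

```r
data.set <- data.frame(ID = rep(1:10, each = 3), TIME = c(0, 6, 12), MDV = 0)

first_rows <- head(data.set)          # first 6 rows by default
last_rows  <- tail(data.set, n = 3)   # n sets how many rows to show

first_rows
last_rows
```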