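An array is a k-dimensional table of data; a matrix is the special case k = 2, and a vector can be viewed as the case k = 1.
All elements of a vector, matrix, or array must be of the same type, unlike a data frame (data.frame) or a list (list).
A vector's attributes are its type and length; a matrix or array has three: type, length, and dimensions (dim).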
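A minimal sketch of the points above, using only base R: a plain vector carries no dim attribute, assigning one turns the same data into a matrix, and mixing types in c() coerces everything to a single type. The variable names are illustrative.

```r
v <- 1:6                      # a plain integer vector
is.null(dim(v))               # TRUE: a vector has no dim attribute
length(v)                     # 6

m <- v
dim(m) <- c(2, 3)             # same data, now with dimensions 2 x 3
is.matrix(m)                  # TRUE
length(m)                     # still 6

mixed <- c(1, "a", TRUE)      # elements must share one type
typeof(mixed)                 # "character": everything was coerced

l <- list(1, "a", TRUE)       # a list keeps each element's own type
sapply(l, typeof)             # "double" "character" "logical"
```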
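Creating an array:
array(data,dim,dimnames);dim(A);colnames(A);rownames(A);dimnames(A);
data is a vector whose elements are used to fill the array;
dim is the dimension vector of the array, a numeric (integer) vector;
dimnames is a list of name vectors, one per dimension; the default is NULL (no names).
***Fill order (the first index varies fastest): 111,211,121,221,131,231,112,212,122,222,132,232
***If data is too short, its elements are recycled.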
Example:
> a <- array(1:12, dim=c(2,3,2), dimnames=list(c('a','b'),c('c','d','e'),c('f','g'))
+ )
> a
, , f
c d e
a 1 3 5
b 2 4 6
, , g
c d e
a 7 9 11
b 8 10 12
> colnames(a)
[1] "c" "d" "e"
> rownames(a)
[1] "a" "b"
> dimnames(a)
[[1]]
[1] "a" "b"
[[2]]
[1] "c" "d" "e"
[[3]]
[1] "f" "g"
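The fill order and the recycling rule described for array() can be checked directly. A small sketch; the object names are illustrative:

```r
a <- array(1:12, dim = c(2, 3, 2))
# The first index varies fastest, then the second, then the third
a[1, 1, 1]   # 1
a[2, 1, 1]   # 2
a[1, 2, 1]   # 3
a[1, 1, 2]   # 7

# Only 4 values for 12 cells: the data are recycled
b <- array(1:4, dim = c(2, 3, 2))
as.vector(b) # 1 2 3 4 1 2 3 4 1 2 3 4
```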
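Creating a matrix:
array(); a matrix is also an array, with rows as the first dimension and columns as the second
matrix(data, nrow, ncol, byrow=FALSE, dimnames); creates a matrix (nr and nc in the examples are partial matches for nrow and ncol)
diag(x, nrow, ncol); creates a diagonal matrix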
Example:
> a <- array(1:6, c(3,2)) #creating it as an array; note the column-by-column order of the numbers
> a
[,1] [,2]
[1,] 1 4
[2,] 2 5
[3,] 3 6
> b <- array(1:4, c(3,2))
> b
[,1] [,2]
[1,] 1 4
[2,] 2 1
[3,] 3 2
> c <- matrix(7,nr=2,nc=3)
> c
[,1] [,2] [,3]
[1,] 7 7 7
[2,] 7 7 7
> c <- matrix(1:6,nr=2,nc=3)
> c
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
> c <- diag(3) #a single number n gives the n x n identity matrix
> c
[,1] [,2] [,3]
[1,] 1 0 0
[2,] 0 1 0
[3,] 0 0 1
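> c <- c("a","b","c","d") #a diagonal matrix can be built from a vector (character result as printed by the R version used here; newer versions may coerce the vector to numeric instead)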
> diag(c)
[,1] [,2] [,3] [,4]
[1,] "a" "0" "0" "0"
[2,] "0" "b" "0" "0"
[3,] "0" "0" "c" "0"
[4,] "0" "0" "0" "d"
> diag(7, nr=3, nc=5) #the two non-square cases
[,1] [,2] [,3] [,4] [,5]
[1,] 7 0 0 0 0
[2,] 0 7 0 0 0
[3,] 0 0 7 0 0
> diag(7, nr=5, nc=3)
[,1] [,2] [,3]
[1,] 7 0 0
[2,] 0 7 0
[3,] 0 0 7
[4,] 0 0 0
[5,] 0 0 0
> X <- matrix(1:6,2)
> X
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
> rownames(X) <- c("a","b") #add row names
> colnames(X) <- c('c','d','e') #add column names
> X
c d e
a 1 3 5
b 2 4 6
> dim(X)
[1] 2 3
> dimnames(X) #extract the dimension names
[[1]]
[1] "a" "b"
[[2]]
[1] "c" "d" "e"
> X <- matrix(1:6,2,byrow=TRUE) #the default fills by column; byrow=TRUE fills by row
> X
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
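> X <- matrix(1:4,2,3,byrow=TRUE) #the data length is not a multiple, so recycling still works but gives a warning
Warning message:
In matrix(1:4, 2, 3, byrow = TRUE) : data length [4] is not a sub-multiple or multiple of the number of columns [3]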
> X
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 1 2
> diag(1:4, nr=5, nc=4)
[,1] [,2] [,3] [,4]
[1,] 1 0 0 0
[2,] 0 2 0 0
[3,] 0 0 3 0
[4,] 0 0 0 4
[5,] 0 0 0 0
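As a sketch, the row and column names can also be supplied when the matrix is created, through the dimnames argument of matrix(), and diag() can be used to read or replace the diagonal of an existing matrix. The object names are illustrative.

```r
X <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE,
            dimnames = list(c("a", "b"), c("c", "d", "e")))
X["b", "d"]             # 5: second row, second column

D <- diag(c(1, 2, 3))   # 3 x 3 diagonal matrix from a numeric vector
diag(D)                 # 1 2 3: extract the diagonal
diag(D) <- 9            # replace the diagonal; the value is recycled
D[2, 2]                 # 9
```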
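Indexing arrays and matrices and extracting subsets (elements)
As with vectors, matrix and array subscripts can be positive integers, negative integers, or logical expressions, which lets you extract or modify subsets.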
Example:
> a <- matrix(1:6,2,3) #build a matrix
> a
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
> a[2,3] #extract one element
[1] 6
> a[3,] #error: there is no third row
Error: subscript out of bounds
> a[2,] #extract one row
[1] 2 4 6
> a[,3] #extract one column
[1] 5 6
> a[,3,drop=FALSE] #drop=FALSE keeps the result as a matrix
[,1]
[1,] 5
[2,] 6
> a[,c(2,3),drop=FALSE]
[,1] [,2]
[1,] 3 5
[2,] 4 6
> a[-1,] #drop the first row
[1] 2 4 6
> a[,-1] #drop the first column
[,1] [,2]
[1,] 3 5
[2,] 4 6
> a[,-2] #drop the second column
[,1] [,2]
[1,] 1 5
[2,] 2 6
> a[,-3] #drop the third column
[,1] [,2]
[1,] 1 3
[2,] 2 4
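> a[3,] <- NA #cannot assign outside the defined dimensions
Error in a[3, ] <- NA : subscript out of bounds
> a[2,] <- "o" #modify existing values; assigning a string coerces the whole matrix to character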
> a
[,1] [,2] [,3]
[1,] "1" "3" "5"
[2,] "o" "o" "o"
> a[2,] <- NA #set the row to missing (NA)
> a
[,1] [,2] [,3]
[1,] "1" "3" "5"
[2,] NA NA NA
> a[is.na(a)] <- 7 #replace the missing values with 7; is.na() tests for NA (the matrix is character, so 7 is stored as "7")
> a
[,1] [,2] [,3]
[1,] "1" "3" "5"
[2,] "7" "7" "7"
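The transcript above indexes with a logical test only through is.na(). A short sketch of logical indexing on a numeric matrix, which also shows that assigning NA (unlike assigning a string) leaves the matrix numeric:

```r
m <- matrix(1:6, 2, 3)
m[m > 3]                     # 4 5 6: elements where the test is TRUE
m[, c(TRUE, FALSE, TRUE)]    # keep columns 1 and 3

m[m %% 2 == 0] <- 0L         # set all even values to 0
m                            # rows: 1 3 5 / 0 0 0

m[2, ] <- NA                 # NA does not change the type
typeof(m)                    # "integer"
m[is.na(m)] <- 7L
m[2, ]                       # 7 7 7
```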
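Matrix operations (functions)
1) Algebraic operations
t() transpose
diag() extract the diagonal elements of a matrix
rbind() combine matrices by rows (stacked vertically)
cbind() combine matrices by columns (side by side)
"*" element-wise product, i.e. corresponding elements are multiplied; the matrices must have the same dimensions
"%*%" algebraic product, i.e. matrix multiplication
det() determinant of a square matrix
crossprod() cross product: crossprod(X, Y) computes t(X) %*% Y
eigen() eigenvalues and eigenvectors
qr() QR decomposition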
Example:
> X <- matrix(1:6,2,byrow=TRUE)
> X
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
> t(X)
[,1] [,2]
[1,] 1 4
[2,] 2 5
[3,] 3 6
> diag(X)
[1] 1 5
> Y <- matrix(7:12,2,byrow=TRUE)
> Y
[,1] [,2] [,3]
[1,] 7 8 9
[2,] 10 11 12
> rbind(X,Y)
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
[3,] 7 8 9
[4,] 10 11 12
> cbind(X,Y)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 2 3 7 8 9
[2,] 4 5 6 10 11 12
> cbind(X<Y)
[,1] [,2] [,3]
[1,] TRUE TRUE TRUE
[2,] TRUE TRUE TRUE
> X
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
> Y
[,1] [,2] [,3]
[1,] 7 8 9
[2,] 10 11 12
> X*Y
[,1] [,2] [,3]
[1,] 7 16 27
[2,] 40 55 72
> t(Y)
[,1] [,2]
[1,] 7 10
[2,] 8 11
[3,] 9 12
> X%*%t(Y)
[,1] [,2]
[1,] 50 68
[2,] 122 167
> det(X%*%t(Y))
[1] 54
> crossprod(X,Y)
[,1] [,2] [,3]
[1,] 47 52 57
[2,] 64 71 78
[3,] 81 90 99
> eigen(crossprod(X,Y))
$values
[1] 2.167509e+02 2.491340e-01 -2.728224e-15
$vectors
[,1] [,2] [,3]
[1,] -0.4129031 -0.75120198 0.4082483
[2,] -0.5640129 -0.04638054 -0.8164966
[3,] -0.7151227 0.65844090 0.4082483
> qr(crossprod(X,Y))
$qr
[,1] [,2] [,3]
[1,] -113.4283915 -125.8767740 -1.383252e+02
[2,] 0.5642326 -0.1943553 -3.887106e-01
[3,] 0.7141069 0.9793598 7.556618e-15
$rank
[1] 2
$qraux
[1] 1.414358e+00 1.202124e+00 7.556618e-15
$pivot
[1] 1 2 3
attr(,"class")
[1] "qr"
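A brief sketch, in base R, checking what some of these functions return: crossprod(X, Y) equals t(X) %*% Y, the factors of a QR decomposition multiply back to the original matrix, and an eigenpair satisfies A v = lambda v. The small symmetric matrix A is made up for illustration.

```r
X <- matrix(1:6, 2, byrow = TRUE)
Y <- matrix(7:12, 2, byrow = TRUE)
same_cross <- isTRUE(all.equal(crossprod(X, Y), t(X) %*% Y))

A <- matrix(c(2, 1, 1, 3), 2)        # illustrative symmetric 2 x 2 matrix
dec <- qr(A)
Q <- qr.Q(dec)                       # orthogonal factor
R <- qr.R(dec)                       # upper-triangular factor
same_qr <- isTRUE(all.equal(Q %*% R, A[, dec$pivot]))

e <- eigen(A)
v1 <- e$vectors[, 1]                 # eigenvector for the largest eigenvalue
same_eig <- isTRUE(all.equal(as.vector(A %*% v1), e$values[1] * v1))

c(same_cross, same_qr, same_eig)     # all three should be TRUE
```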
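2) Statistical operations
max();min();median();var();sd();sum();cumsum();cumprod();cummax();cummin();cov();cor(); the last two are mainly used to compute the covariance matrix and the correlation matrix of a matrix's columns.
apply(X, MARGIN, FUN);
X a matrix
FUN a function name, or an operator such as "+", "-", "*", "/"
MARGIN=1 apply over rows
MARGIN=2 apply over columns
sweep(X,MARGIN,STATS,FUN)
sweeps the summary statistics STATS out of matrix X along MARGIN, by default by subtraction (FUN="-");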
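A quick way to confirm which MARGIN is which is to compare apply() with rowSums() and colSums() on a non-square matrix, where the result lengths differ. The same sketch shows that cov() and cor() treat the columns of a matrix as the variables.

```r
a <- matrix(1:12, 3)                 # 3 rows, 4 columns
r <- apply(a, 1, sum)                # MARGIN = 1: one value per row
k <- apply(a, 2, sum)                # MARGIN = 2: one value per column
length(r)                            # 3
length(k)                            # 4
all(r == rowSums(a))                 # TRUE
all(k == colSums(a))                 # TRUE

dim(cov(a))                          # 4 4: one row/column per matrix column
dim(cor(a))                          # 4 4
```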
Example:
> rnorm(n=12) #12 random numbers from the standard normal distribution
[1] 0.51283136 2.67288307 0.57151842 -0.85470819 -1.36383499 -1.93869581 2.70833337 1.78806889 0.72683697 -0.03599419
[11] 1.35793529 0.28600715
> a <- matrix(rnorm(n=12), nr=3) #matrix of random numbers
> a
[,1] [,2] [,3] [,4]
[1,] -0.60804198 1.608885 -0.6733028 0.4095975
[2,] 0.34911838 0.791119 -0.4506696 -0.9684538
[3,] 0.08997728 -2.008298 -1.1228689 -0.8552991
> apply(a, MARGIN=1, FUN=mean) #row means
[1] 0.18428437 -0.06972151 -0.97412210
> apply(a, MARGIN=2, FUN=mean) #column means
[1] -0.05631544 0.13056867 -0.74894710 -0.47138512
> a <- matrix(1:12, 3 )
> a
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 5 8 11
[3,] 3 6 9 12
> apply(a, MARGIN=1, FUN=mean)
[1] 5.5 6.5 7.5
> apply(a, MARGIN=2, FUN=mean)
[1] 2 5 8 11
> scale(a, center=T, scale=T) #standardize: subtract each column's mean, then divide by that column's standard deviation
[,1] [,2] [,3] [,4]
[1,] -1 -1 -1 -1
[2,] 0 0 0 0
[3,] 1 1 1 1
attr(,"scaled:center")
[1] 2 5 8 11
attr(,"scaled:scale")
[1] 1 1 1 1
> row.med <- apply(a, MARGIN=1, FUN=median) #row medians
> row.med
[1] 5.5 6.5 7.5
> col.med
Error: object 'col.med' not found
> col.med <- apply(a, MARGIN=2, FUN=median) #column medians
> col.med
[1] 2 5 8 11
> sweep(a,MARGIN=1, STATS=row.med, FUN='-') #subtract each row's median from that row
[,1] [,2] [,3] [,4]
[1,] -4.5 -1.5 1.5 4.5
[2,] -4.5 -1.5 1.5 4.5
[3,] -4.5 -1.5 1.5 4.5
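What scale() does with its defaults can be reproduced by hand with apply() and sweep(): subtract each column's mean, then divide by each column's standard deviation. A sketch under that reading of scale(); the variable names are illustrative.

```r
a <- matrix(1:12, 3)
col.mean <- apply(a, 2, mean)                 # 2 5 8 11
col.sd   <- apply(a, 2, sd)                   # 1 1 1 1
centered <- sweep(a, 2, col.mean, FUN = "-")  # subtract column means
manual   <- sweep(centered, 2, col.sd, FUN = "/")
all.equal(manual, scale(a), check.attributes = FALSE)  # TRUE
```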
####
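That's all for now. There is a lot of material here and it is somewhat scattered, so practice it: make up new data and rerun the examples whenever you have time. It costs nothing, after all.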