IV. Statistical Features
1. Mean and median
mean: arithmetic mean
median: median
nanmedian: median, ignoring NaN values
geomean: geometric mean
harmmean: harmonic mean
>> A = magic(5)
A =
17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9
>> M1 = mean(A)
M1 =
13 13 13 13 13
>> M2 = median(A)
M2 =
11 12 13 14 15
>> M3 = nanmedian(A)
M3 =
11 12 13 14 15
>> M4 = geomean(A)
M4 =
11.1462 10.9234 8.4557 9.8787 10.7349
>> M5 = harmmean(A)
M5 =
9.2045 9.1371 3.8098 6.2969 8.0767
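As a cross-check on the last two results, the geometric and harmonic means can be written out from their definitions. This is only a minimal sketch using base MATLAB functions; the variable names n, G and H are illustrative, and geomean and harmmean themselves come from the Statistics Toolbox.

```matlab
A = magic(5);
n = size(A, 1);            % number of observations in each column

% Geometric mean of each column: exp of the mean of the logarithms
G = exp(mean(log(A)));

% Harmonic mean of each column: n divided by the sum of the reciprocals
H = n ./ sum(1 ./ A);

disp(G)
disp(H)
```

Both rows should agree with M4 and M5 above to the displayed precision. Like mean and median, the calculation runs column by column.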
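2. Comparing data
sort: ordinary sort (each column is sorted independently)
sortrows: sorts whole rows (by the first column unless told otherwise)
range: size of the range of values (maximum minus minimum)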
>> A = rand(5)
A =
0.8147 0.0975 0.1576 0.1419 0.6557
0.9058 0.2785 0.9706 0.4218 0.0357
0.1270 0.5469 0.9572 0.9157 0.8491
0.9134 0.9575 0.4854 0.7922 0.9340
0.6324 0.9649 0.8003 0.9595 0.6787
>> S1 = sort(A)
S1 =
0.1270 0.0975 0.1576 0.1419 0.0357
0.6324 0.2785 0.4854 0.4218 0.6557
0.8147 0.5469 0.8003 0.7922 0.6787
0.9058 0.9575 0.9572 0.9157 0.8491
0.9134 0.9649 0.9706 0.9595 0.9340
>> S2 = sortrows(A)
S2 =
0.1270 0.5469 0.9572 0.9157 0.8491
0.6324 0.9649 0.8003 0.9595 0.6787
0.8147 0.0975 0.1576 0.1419 0.6557
0.9058 0.2785 0.9706 0.4218 0.0357
0.9134 0.9575 0.4854 0.7922 0.9340
>> S3 = range(A)
S3 =
0.7864 0.8673 0.8130 0.8176 0.8983
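Because rand gives different numbers on every run, the difference between the three functions is easier to see on a small fixed matrix. The sketch below is an illustration only; the matrix and the names R, S1, S2 and S3 are made up for this example.

```matlab
A = [3 7; 1 9; 2 8];

R  = max(A) - min(A);     % what range(A) computes for each column

S1 = sort(A);             % each column sorted on its own, rows are broken up
S2 = sortrows(A);         % whole rows reordered by the first column
S3 = sortrows(A, 2);      % whole rows reordered by the second column

disp(S1)
disp(S2)
disp(S3)
```

Here S1 pairs 1 with 7, a combination that never appears in A, while S2 and S3 keep every original row intact.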
3. Variance (var) and standard deviation (std)
var: variance
std: standard deviation
skewness: skewness, a third-order statistic
>> x = randn(8,2)
x =
1.0347 -0.8095
0.7269 -2.9443
-0.3034 1.4384
0.2939 0.3252
-0.7873 -0.7549
0.8884 1.3703
-1.1471 -1.7115
-1.0689 -0.1022
>> dx = var(x)
dx =
0.8040 2.2308
>> dx1 = var(x,1)
dx1 =
0.7035 1.9519
>> s = std(x)
s =
0.8967 1.4936
>> s1 = std(x,2)
Error using var (line 177)
W must be a vector of nonnegative weights, or a scalar 0 or 1.
Error in std (line 38)
y = sqrt(var(varargin{:}));
>> s1 = std(x,0)
s1 =
0.8967 1.4936
>> s1 = std(x,1)
s1 =
0.8388 1.3971
>> sk = skewness(x)
sk =
-0.0554 -0.3088
>> sk = skewness(x,1)
sk =
-0.0554 -0.3088
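The second argument of var and std is a normalization flag (0 or 1) or a weight vector, not a dimension, which is why std(x,2) fails above; the dimension goes in the third position, as in std(x,0,2). The sketch below spells out what the two flags compute. The sample and the names d, v0, v1, s0 and s1 are chosen for illustration only.

```matlab
x = [2 4 4 4 5 5 7 9]';      % small fixed sample
n = numel(x);
d = x - mean(x);             % deviations from the sample mean

v0 = sum(d.^2) / (n - 1);    % flag 0 (default): divide by n-1
v1 = sum(d.^2) / n;          % flag 1: divide by n

s0 = sqrt(v0);               % corresponds to std(x) and std(x,0)
s1 = sqrt(v1);               % corresponds to std(x,1)

disp([v0 v1 s0 s1])
```

This is also why dx1 is exactly 7/8 of dx in the transcript above: the sample has 8 rows.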
4. Covariance (cov) and correlation coefficients (corrcoef)
>> x = ones(1,5)
x =
1 1 1 1 1
>> r = rand(5,1)
r =
0.3816
0.7655
0.7952
0.1869
0.4898
>> X = ones(5,5)
X =
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
>> A = magic(5)
A =
17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
11 18 25 2 9
>> C1 = cov(x)
C1 =
0
>> C2 = cov(r)
C2 =
0.0667
>> C3 = cov(x,r)
C3 =
0 0
0 0.0667
>> C4 = cov(r,x)
C4 =
0.0667 0
0 0
>> C5 = cov(X)
C5 =
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
>> C6 = cov(A)
C6 =
52.5000 5.0000 -37.5000 -18.7500 -1.2500
5.0000 65.0000 -7.5000 -43.7500 -18.7500
-37.5000 -7.5000 90.0000 -7.5000 -37.5000
-18.7500 -43.7500 -7.5000 65.0000 5.0000
-1.2500 -18.7500 -37.5000 5.0000 52.5000
>> C7 = corrcoef(x,r)
C7 =
NaN NaN
NaN 1
>> C8 = corrcoef(A,X)
C8 =
1 NaN
NaN NaN
>> C9 = corrcoef(A)
C9 =
1.0000 0.0856 -0.5455 -0.3210 -0.0238
0.0856 1.0000 -0.0981 -0.6731 -0.3210
-0.5455 -0.0981 1.0000 -0.0981 -0.5455
-0.3210 -0.6731 -0.0981 1.0000 0.0856
-0.0238 -0.3210 -0.5455 0.0856 1.0000
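The link between C6 and C9, and the reason for the NaN entries in C7 and C8, can be sketched as follows. This is an illustration under the usual definition of the correlation coefficient (covariance divided by the product of the two standard deviations); the names C, s, R and vX are made up for the example.

```matlab
A = magic(5);
C = cov(A);                % covariance matrix of the columns of A
s = sqrt(diag(C));         % standard deviation of each column
R = C ./ (s * s');         % entry (i,j) divided by s(i)*s(j)

% A constant vector has zero variance, so its correlation is 0/0 = NaN
X  = ones(5, 1);
vX = var(X);

disp(R)
disp(vX)
```

R should match corrcoef(A) up to rounding, and vX is 0, which is what turns the entries involving x and X into NaN.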
V. Statistical Plots
1. Frequency table for positive integers (tabulate)
>> T = ceil(5*rand(1,10))
T =
3 4 4 4 2 4 4 1 1 3
>> table = tabulate(T)
table =
1 2 20
2 1 10
3 2 20
4 5 50
In the result, the first column holds the values, the second the number of times each occurs, and the third the percentage.
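The same three columns can be rebuilt by hand, which makes the meaning of each one explicit. A minimal sketch using accumarray from base MATLAB, with the vector T copied from the run above; the names counts, values, pct and tbl are illustrative.

```matlab
T = [3 4 4 4 2 4 4 1 1 3];

counts = accumarray(T(:), 1);        % counts(k) = how many times k occurs
values = (1:numel(counts))';         % 1 up to the largest value in T
pct    = 100 * counts / numel(T);    % share of each value, in percent

tbl = [values counts pct];
disp(tbl)
```

The value 5 never occurs in this T, so the table stops at 4.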
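2. Cumulative distribution function plot (cdfplot)
h = cdfplot(x)
[h, stats] = cdfplot(x)
h is a handle to the plotted curve and x is a vector; the optional stats output holds some summary statistics of the sample.
>> y = evrnd(0,3,100,1);
>> cdfplot(y)
>> hold on
>> x = -20 : .1 : 10;
>> f = evcdf(x,0,3);
>> plot(x,f,'m')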
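The staircase that cdfplot draws is the empirical distribution function: at each sorted sample value, the fraction of samples less than or equal to it. A rough sketch of that idea, assuming a tiny sample without repeated values; the data and the names ys, n and F are made up for illustration.

```matlab
y  = [2 5 1 4 3];
ys = sort(y);              % sample values in increasing order
n  = numel(ys);
F  = (1:n) / n;            % F(i) = fraction of samples <= ys(i)

stairs(ys, F)              % step plot of the empirical CDF
xlabel('x'), ylabel('F(x)')
```

With 100 samples, as in the example above, the steps are small enough that the staircase follows the theoretical curve from evcdf closely.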
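3. Least-squares fitted line (lsline)
lsline
h = lsline
h is a handle to the fitted line. The command adds a least-squares line through the scattered data in the current axes.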
>> x = 1:10;
>> y1 = x + randn(1,10);
>> scatter(x,y1,25,'b','*')
>> hold on
>> y2 = 2*x + randn(1,10);
>> plot(x,y2,'mo')
>> lsline
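In the example above lsline draws two lines, one for each set of points in the axes. To get at the slope and intercept themselves, a degree-1 polyfit gives a least-squares line as well. The sketch below uses noise-free data so the result is known in advance; the names c and yfit are illustrative.

```matlab
x = 1:10;
y = 2*x + 1;                    % exact line, no noise

c    = polyfit(x, y, 1);        % c(1) = slope, c(2) = intercept
yfit = polyval(c, x);           % fitted values on the same x grid

plot(x, y, 'mo', x, yfit, 'm-')
```

With randn noise added to y, as in the transcript, c(1) and c(2) would land near 2 and 1 rather than exactly on them.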
4. Normal probability plot (normplot)
h = normplot(X)
If X is a vector, normplot draws its normal probability plot; if X is a matrix, it draws one for each column.
>> x = normrnd(10,1,25,1);
>> normplot(x) % plot a vector
>> figure
>> normplot([x,1.5*x]) % plot a matrix
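5. Box plot of sample data (boxplot)
boxplot(X)
boxplot(X, G)
boxplot(axes, X, …)
boxplot(…, 'name', value)
X is the data to plot; G is an additional grouping variable; axes is an axes handle; name and value are the name and value of a property to set.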
>> x = randn(100,25);
>> subplot(311),boxplot(x)
>> subplot(312),boxplot(x,'plotstyle','compact')
>> subplot(313),boxplot(x,'notch','on')
(I could not really make sense of what a box plot shows.)
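For what it is worth, the parts of a box plot can be computed by hand: the line inside the box is the median, the box runs from the lower to the upper quartile, and points far outside the box are marked individually as outliers. The sketch below assumes the common 1.5 × box-height rule for outliers, which is boxplot's default whisker setting as far as I know; the sample and the names med, q, h, isOut and outliers are illustrative.

```matlab
x = [1 2 3 4 5 6 7 8 9 30];       % sample with one extreme value

med = median(x);                   % line inside the box
q   = quantile(x, [0.25 0.75]);    % bottom and top edges of the box
h   = q(2) - q(1);                 % box height (interquartile range)

% Points more than 1.5 box heights beyond the box count as outliers
isOut    = x < q(1) - 1.5*h | x > q(2) + 1.5*h;
outliers = x(isOut);

disp(med)
disp(q)
disp(outliers)
```

In the three subplots above, each of the 25 columns of x gets its own box, so the figure compares 25 samples side by side.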
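6. Drawing reference lines
refline draws a reference line; refcurve draws a reference curve.
refline(m, b)
refline(coeffs)
refline
hline = refline(…)
m is the slope and b the intercept; coeffs is the two-element vector [m b]; hline is a handle to the reference line.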
>> x = 1 : 10;
>> y = x + randn(1,10);
>> scatter(x,y,25,'b','*')
>> lsline
>> mu = mean(y);
>> hline = refline([0 mu]);
>> set(hline, 'Color', 'r')
refcurve
refcurve(p)
hcurve = refcurve(…)
p is the vector of polynomial coefficients, highest power first.
>> p = [1 -2 -1 0];
>> t = 0 : .1 : 3;
>> y = polyval(p,t) + .5*randn(size(t));
>> plot(t,y,'ro')
>> h = refcurve(p);
>> set(h,'Color','r')
>> q = polyfit(t,y,3);
>> refcurve(q)
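The coefficient vector follows the polyval convention, so p = [1 -2 -1 0] stands for t^3 - 2t^2 - t. As a quick sanity check on that ordering, the sketch below fits noise-free samples of the same cubic and compares the fitted coefficients with p; the name err is illustrative.

```matlab
p = [1 -2 -1 0];              % t^3 - 2*t^2 - t
t = 0 : .1 : 3;
y = polyval(p, t);            % exact samples, no noise

q   = polyfit(t, y, 3);       % cubic least-squares fit
err = max(abs(q - p));        % should be at round-off level

disp(q)
disp(err)
```

In the transcript above the data carry noise, so the second curve (from q) only approximates the first (from p).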
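7. Sample probability plot (capaplot)
p = capaplot(data, specs)
[p, h] = capaplot(data, specs)
data is the sample data and specs is a two-element vector giving the range of interest. The returned p is the probability that a random variable drawn from the distribution estimated from data falls within that range.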
>> data = normrnd(3,.005,100,1);
>> p1 = capaplot(data,[2.99 3.01])
p1 =
0.9449
>> grid on; axis tight
>> figure
>> p2 = capaplot(data,[2.995 3.015])
p2 =
0.8037
>> grid on; axis tight
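To see roughly where p1 and p2 come from, the same probabilities can be computed for the true parameters used to generate the data (mean 3, standard deviation 0.005). This is only an approximation of what capaplot does: the function estimates its distribution from the sample, so its values differ somewhat. The names mu, sigma, pA and pB are illustrative.

```matlab
mu    = 3;                   % true mean used in normrnd above
sigma = 0.005;               % true standard deviation used above

% Probability of falling in [2.99 3.01], i.e. within 2 sigma of the mean
pA = normcdf(3.01, mu, sigma) - normcdf(2.99, mu, sigma);

% Probability of falling in [2.995 3.015], i.e. from -1 sigma to +3 sigma
pB = normcdf(3.015, mu, sigma) - normcdf(2.995, mu, sigma);

disp([pA pB])
```

These come out near 0.95 and 0.84, in the same neighborhood as the 0.9449 and 0.8037 reported for the sample.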
8. Histogram with a normal fit (histfit)
histfit(data)
histfit(data, nbins)
histfit(data, nbins, dist)
h = histfit(…)
data is a vector; nbins sets the number of bars; dist is the type of distribution to fit.
>> r = normrnd(10,1,200,1);
>> histfit(r)
>> h = get(gca,'Children');
>> set(h(2),'FaceColor',[.8 .8 1])
>> figure
>> histfit(r,20)
>> h = get(gca,'Children');
>> set(h(2),'FaceColor',[.8 .8 1])
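histfit only draws the fitted curve. If the parameters of the normal fit are needed as numbers, normfit from the same toolbox returns them. A minimal sketch; the fixed seed is there only to make the run repeatable, and the names muHat and sigmaHat are illustrative.

```matlab
rng(0);                             % fixed seed for repeatability
r = normrnd(10, 1, 200, 1);

[muHat, sigmaHat] = normfit(r);     % estimated mean and standard deviation

disp([muHat sigmaHat])
```

The two estimates are the sample mean and the sample standard deviation of r, so they should sit close to the 10 and 1 used to generate the data.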