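pdist() computes the pairwise distances between the rows of an array. It does not return a full matrix: the result is a condensed one-dimensional array that holds only the upper triangle of the symmetric distance matrix (whose diagonal is 0), read row by row starting from the first row. squareform() converts between the two forms: given the condensed array, it builds the full symmetric matrix with zeros on the diagonal, and given the full matrix, it performs the inverse operation and returns the condensed array. The condensed array is a compact storage format, not a sparse matrix. The example below uses the Euclidean distance: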
The rule for computing the distances is shown in the figure:
import numpy as np
from scipy.spatial.distance import pdist, squareform
x=np.random.rand(3)
print(x)
[0.15297464 0.24703217 0.95730516]
y=np.random.rand(3)
print(y)
[0.71366508 0.36111799 0.5305526 ]
x1=np.random.rand(5)
print(x1)
[0.93979333 0.21137943 0.55437114 0.15531124 0.06513814]
y1=np.random.rand(5)
print(y1)
[0.17426077 0.51733583 0.93023109 0.28271652 0.40983225]
A=np.vstack((x,y)).T
print(A)
[[0.15297464 0.71366508]
[0.24703217 0.36111799]
[0.95730516 0.5305526 ]]
B=np.vstack((x1,y1)).T
print(B)
[[0.93979333 0.17426077]
[0.21137943 0.51733583]
[0.55437114 0.93023109]
[0.15531124 0.28271652]
[0.06513814 0.40983225]]
C=np.vstack((A,B))
print(C)
[[0.15297464 0.71366508]
[0.24703217 0.36111799]
[0.95730516 0.5305526 ]
[0.93979333 0.17426077]
[0.21137943 0.51733583]
[0.55437114 0.93023109]
[0.15531124 0.28271652]
[0.06513814 0.40983225]]
distance=pdist(C, 'euclidean')
print(distance)
[0.36487843 0.82491076 0.95396051 0.20483235 0.45609208 0.43095489
0.31627463 0.73020258 0.71751909 0.16023461 0.64679746 0.12066283
0.18830432 0.35672193 0.74604282 0.56753741 0.83941466 0.9002974
0.80516291 0.84855252 0.79194369 0.90582318 0.53677352 0.24122575
0.18150354 0.7606076 0.71425758 0.1558512 ]
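The order of the values in the condensed array can be checked by hand. The sketch below takes the first three points of C from the output above, recomputes one distance manually, and looks it up in the result of pdist(). The helper condensed_index is a hypothetical name introduced for illustration; the index formula it uses is the one given in the SciPy documentation for condensed distance arrays.

```python
import numpy as np
from scipy.spatial.distance import pdist

# First three points of C, copied from the printed output above.
pts = np.array([
    [0.15297464, 0.71366508],
    [0.24703217, 0.36111799],
    [0.95730516, 0.5305526],
])

condensed = pdist(pts, 'euclidean')

# Euclidean distance between point 0 and point 1, computed by hand.
manual_01 = np.sqrt(np.sum((pts[0] - pts[1]) ** 2))

def condensed_index(i, j, m):
    """Hypothetical helper: position of pair (i, j), with i < j,
    in the condensed array for m points."""
    return m * i + j - ((i + 2) * (i + 1)) // 2

m = len(pts)
# m points give m * (m - 1) / 2 pairwise distances.
print(len(condensed) == m * (m - 1) // 2)
# The pair (0, 1) is stored first and matches the manual result.
print(np.isclose(condensed[condensed_index(0, 1, m)], manual_01))
```

With the 8 points of C, the same formula gives 8 * 7 / 2 = 28 values, which matches the length of the array printed above.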
D=squareform(distance)
print(D)
[[0. 0.36487843 0.82491076 0.95396051 0.20483235 0.45609208
0.43095489 0.31627463]
[0.36487843 0. 0.73020258 0.71751909 0.16023461 0.64679746
0.12066283 0.18830432]
[0.82491076 0.73020258 0. 0.35672193 0.74604282 0.56753741
0.83941466 0.9002974 ]
[0.95396051 0.71751909 0.35672193 0. 0.80516291 0.84855252
0.79194369 0.90582318]
[0.20483235 0.16023461 0.74604282 0.80516291 0. 0.53677352
0.24122575 0.18150354]
[0.45609208 0.64679746 0.56753741 0.84855252 0.53677352 0.
0.7606076 0.71425758]
[0.43095489 0.12066283 0.83941466 0.79194369 0.24122575 0.7606076
0. 0.1558512 ]
[0.31627463 0.18830432 0.9002974 0.90582318 0.18150354 0.71425758
0.1558512 0. ]]
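The properties of the square matrix can be verified directly. The following is a minimal sketch on the first three points of C (a smaller input than the 8 points used above, chosen only to keep the example short): the matrix is symmetric, its diagonal is zero, and the entry at row i, column j is the distance between point i and point j.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# First three points of C, copied from the printed output above.
pts = np.array([
    [0.15297464, 0.71366508],
    [0.24703217, 0.36111799],
    [0.95730516, 0.5305526],
])

D_small = squareform(pdist(pts, 'euclidean'))

print(D_small.shape)                        # (3, 3)
print(np.allclose(D_small, D_small.T))      # symmetric
print(np.allclose(np.diag(D_small), 0.0))   # zero diagonal

# D_small[i, j] is the distance between point i and point j.
print(np.isclose(D_small[0, 2], np.linalg.norm(pts[0] - pts[2])))
```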
squareform(D)
array([0.36487843, 0.82491076, 0.95396051, 0.20483235, 0.45609208,
0.43095489, 0.31627463, 0.73020258, 0.71751909, 0.16023461,
0.64679746, 0.12066283, 0.18830432, 0.35672193, 0.74604282,
0.56753741, 0.83941466, 0.9002974 , 0.80516291, 0.84855252,
0.79194369, 0.90582318, 0.53677352, 0.24122575, 0.18150354,
0.7606076 , 0.71425758, 0.1558512 ])
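squareform() chooses the direction of the conversion from the dimensionality of its input: a 1-D array is expanded to a square matrix, and a 2-D matrix is reduced to the condensed array. The sketch below shows the round trip on three distances taken from the output above; the variable names are chosen for illustration only.

```python
import numpy as np
from scipy.spatial.distance import squareform

# Condensed distances for three points: d(0,1), d(0,2), d(1,2).
v = np.array([0.36487843, 0.82491076, 0.73020258])

M = squareform(v)        # 1-D input: build the 3x3 symmetric matrix
v_back = squareform(M)   # 2-D input: return the condensed array again

print(M.shape)                     # (3, 3)
print(np.array_equal(v, v_back))   # True, the round trip is lossless

# The condensed array is the upper triangle above the diagonal, row by row.
upper = M[np.triu_indices(3, k=1)]
print(np.array_equal(v, upper))    # True

# The direction can also be stated explicitly with the force argument.
v_forced = squareform(M, force='tovector')
```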