Common Distance Measures in Machine Learning

If $x_1, x_2 \in \mathbb{R}^n$, then:
Minkowski Distance

$d_{12} = \sqrt[p]{\sum_{k=1}^{n} |x_{1k} - x_{2k}|^{p}}, \quad p > 0$
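As a quick sketch (the helper function and sample vectors below are illustrative, not from the original post), the general form can be computed with NumPy:

```python
import numpy as np

def minkowski(x1, x2, p):
    # p-th root of the sum of |x1k - x2k|**p over all components
    diff = np.abs(np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float))
    return np.sum(diff ** p) ** (1.0 / p)

print(minkowski([1, 2, 3], [4, 5, 7], p=3))
```

With $p = 1$ this reduces to the Manhattan distance and with $p = 2$ to the Euclidean distance.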

Euclidean Distance
L2 norm

$d_{12} = \sqrt{\sum_{k=1}^{n} (x_{1k} - x_{2k})^{2}}$ or $d_{12} = \sqrt{(x_1 - x_2)^{T}(x_1 - x_2)}$
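A minimal sketch showing that the component form and the vector form give the same value (sample vectors are illustrative):

```python
import numpy as np

x1 = np.array([1.0, 2.0, 3.0])
x2 = np.array([4.0, 5.0, 7.0])

d_sum = np.sqrt(np.sum((x1 - x2) ** 2))   # component form
d_vec = np.sqrt((x1 - x2) @ (x1 - x2))    # vector form (x1-x2)^T (x1-x2)
print(d_sum, d_vec)
```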

Standardized / Weighted Euclidean Distance

$d_{12} = \sqrt{\sum_{k=1}^{n} \left(\dfrac{x_{1k} - x_{2k}}{S_k}\right)^{2}}$

where $S_k$ is the standard deviation of the $k$-th component across the samples.
from numpy import *
vectormat = mat([[1, 2, 3], [4, 5, 6]])
v12 = vectormat[0] - vectormat[1]
sk = std(vectormat, axis=0)       # per-feature standard deviation S_k
normv12 = v12 / sk                # scale each component difference by S_k
print(sqrt(normv12 * normv12.T))

Manhattan Distance
L1 norm

$d_{12} = \sum_{k=1}^{n} |x_{1k} - x_{2k}|$
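A one-line NumPy sketch (sample vectors are illustrative):

```python
import numpy as np

x1 = np.array([1, 2, 3])
x2 = np.array([4, 5, 7])
# L1 norm: sum of absolute component differences
d = np.sum(np.abs(x1 - x2))
print(d)
```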

Chebyshev Distance
L∞ norm

$d_{12} = \max_{i} |x_{1i} - x_{2i}|$
from numpy import *
vector1 = mat([1, 2, 3])
vector2 = mat([4, 5, 7])
print(abs(vector1 - vector2).max())   # largest component difference

Cosine Similarity

$\cos\theta = \dfrac{\sum_{k=1}^{n} x_{1k} x_{2k}}{\sqrt{\sum_{k=1}^{n} x_{1k}^{2}} \sqrt{\sum_{k=1}^{n} x_{2k}^{2}}}$
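A minimal NumPy sketch (vectors are illustrative):

```python
import numpy as np

x1 = np.array([1.0, 2.0, 3.0])
x2 = np.array([4.0, 5.0, 7.0])
# dot product over the product of the two L2 norms
cos_theta = np.dot(x1, x2) / (np.linalg.norm(x1) * np.linalg.norm(x2))
print(cos_theta)
```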

Hamming Distance
In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols differ; equivalently, it is the minimum number of substitutions required to change one string into the other. (from Wikipedia)

from numpy import *
matV = mat([[1, 1, 0, 1, 0, 1, 0, 0, 1], [0, 1, 1, 0, 0, 0, 1, 1, 1]])
smstr = nonzero(matV[0] - matV[1])   # indices where the two strings differ
print(shape(smstr[0])[0])            # count of differing positions

Jaccard Similarity Coefficient
Given two sets, A and B, the Jaccard similarity coefficient is defined as

$J(A,B) = \dfrac{|A \cap B|}{|A \cup B|}$

Jaccard Distance

$J_\delta(A,B) = 1 - J(A,B) = \dfrac{|A \cup B| - |A \cap B|}{|A \cup B|}$
from numpy import *
import scipy.spatial.distance as dist
matV = mat([[1, 1, 0, 1, 0, 1, 0, 0, 1], [0, 1, 1, 0, 0, 0, 1, 1, 1]])
print(dist.pdist(matV, 'jaccard'))   # pairwise Jaccard distance between the rows

Mahalanobis Distance
Given m sample vectors $X_1, \ldots, X_m$ with mean $\mu$ and covariance matrix $S$, the Mahalanobis distance between a sample vector $X$ and $\mu$ is defined as

$D(X) = \sqrt{(X - \mu)^{T} S^{-1} (X - \mu)}$

and that between sample vectors $X_i$ and $X_j$ is

$D(X_i, X_j) = \sqrt{(X_i - X_j)^{T} S^{-1} (X_i - X_j)}$
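A minimal NumPy sketch using a hypothetical sample matrix (not from the original post); `np.cov` with `rowvar=False` treats each row as one sample vector:

```python
import numpy as np

# Hypothetical samples: rows are the sample vectors X1..Xm
X = np.array([[3.0, 4.0], [5.0, 6.0], [2.0, 2.0], [8.0, 4.0]])
mu = X.mean(axis=0)
S = np.cov(X, rowvar=False)      # sample covariance matrix
S_inv = np.linalg.inv(S)

def mahalanobis(x, y):
    d = x - y
    return np.sqrt(d @ S_inv @ d)

print(mahalanobis(X[0], mu))     # distance of X1 from the mean
print(mahalanobis(X[0], X[1]))   # distance between X1 and X2
```

When $S$ is the identity matrix, this reduces to the ordinary Euclidean distance.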