Calculation of Vector Similarity

Calculating the similarity of two vectors is a fundamental technique in machine learning and many other fields.

Given two vectors of n dimensions, written as:

        $$ x = (x_1, x_2, \dots, x_n), \qquad y = (y_1, y_2, \dots, y_n) $$

1. Euclidean Distance (abbreviated Ed; the most frequently seen)

        It is simply the straight-line distance between the two sample points (vectors) in a multi-dimensional space.

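        With x_i and y_i denoting the i-th components of the two vectors, we have:

        $$ \mathrm{Ed1} = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} $$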

        The value of Ed1 tells us how similar two vectors are: the smaller it is, the more similar they are. When Ed1 equals 0, the two vectors are identical. There is also an alternative form:

        $$ \mathrm{Ed2} = \frac{1}{1 + \mathrm{Ed1}} $$

        This is more intuitive as a similarity score, since a bigger value of Ed2 means greater similarity. Note that Ed2 lies in the interval (0, 1], and reaches 1 only when the two vectors are identical.
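
        The two Euclidean measures can be sketched in a few lines of Python. This is only a minimal illustration: the function names euclidean_distance and euclidean_similarity are made up for this example, and the vectors are assumed to be plain sequences of numbers of equal length.

```python
import math


def euclidean_distance(x, y):
    """Ed1: straight-line distance between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of dimensions")
    # Sum the squared per-dimension differences, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def euclidean_similarity(x, y):
    """Ed2 = 1 / (1 + Ed1): maps the distance into (0, 1]."""
    return 1.0 / (1.0 + euclidean_distance(x, y))


if __name__ == "__main__":
    p = [0.0, 0.0]
    q = [3.0, 4.0]
    print(euclidean_distance(p, q))    # distance of a 3-4-5 triangle
    print(euclidean_similarity(p, q))  # smaller than 1 because p != q
    print(euclidean_similarity(p, p))  # identical vectors give the maximum
```

        Adding 1 to the denominator avoids a division by zero when the two vectors coincide, which is why identical vectors map to exactly 1.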


2. Pearson Correlation (abbreviated PC)

        PC is slightly more sophisticated than Ed. A Pearson correlation coefficient (r) measures how well two sets of data fit on a straight line.

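        In its standard form, with x_i and y_i the i-th components of the two vectors and \bar{x}, \bar{y} their means, r is calculated as follows:

        $$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} $$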

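        r lies in the interval [-1, 1]; the bigger | r | (the absolute value) is, the more strongly the two vectors are linearly related. A positive r means they are positively correlated, a negative r means they are negatively correlated, and a zero value of r means there is no linear relationship. The Pearson correlation approach works even when the vector dimensions are not well normalized, because r does not change when either vector is shifted by a constant or multiplied by a positive factor.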
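
        A minimal Python sketch of the coefficient, using the standard mean-centered definition. The function name pearson_r is made up for this example, and returning 0.0 when one vector is constant (where r is mathematically undefined) is an assumption of this sketch, not part of the definition.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of products of the deviations from each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator: product of the root sums of squared deviations.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(ss_x * ss_y)
    if denom == 0:
        # Assumption of this sketch: treat a constant vector as uncorrelated.
        return 0.0
    return cov / denom


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0]
    print(pearson_r(a, [2.0, 4.0, 6.0, 8.0]))       # perfectly linear, rising
    print(pearson_r(a, [8.0, 6.0, 4.0, 2.0]))       # perfectly linear, falling
    print(pearson_r(a, [105.0, 110.0, 115.0, 120.0]))  # shifted and rescaled copy
```

        The last call illustrates the normalization point: the second vector is a shifted and rescaled copy of the first, so its Euclidean distance from the first is large while the correlation stays at its maximum.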


        

       




