Efficiently computing pairwise squared Euclidean distances in Matlab

Given two sets of d-dimensional points, how can I most efficiently compute the pairwise squared Euclidean distance matrix in Matlab?

Notation:

Set one is given by a (numA,d)-matrix A and set two by a (numB,d)-matrix B, with one point per row. The resulting distance matrix shall have the format (numA,numB), where entry (i,j) is the squared distance between row i of A and row j of B.

Example points:

d = 4;            % dimension
numA = 100;       % number of set 1 points
numB = 200;       % number of set 2 points
A = rand(numA,d); % set 1 given as matrix A
B = rand(numB,d); % set 2 given as matrix B
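As a baseline for checking any faster method, the definition can be written out directly with two loops. This is only a minimal sketch: the name refDist is introduced here for illustration and is not part of the original question, and the approach is far too slow for large sets.

```matlab
% Naive reference: explicit loops over all pairs (slow, but easy to verify).
d = 4; numA = 100; numB = 200;
A = rand(numA,d);
B = rand(numB,d);

refDist = zeros(numA,numB);                 % hypothetical name for the reference result
for i = 1:numA
    for j = 1:numB
        diffVec = A(i,:) - B(j,:);          % 1-by-d difference vector
        refDist(i,j) = diffVec * diffVec';  % squared Euclidean norm
    end
end
```

Any vectorized variant should agree with refDist up to floating-point round-off.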

Solution

The answer usually given here is based on bsxfun (cf. e.g. [1]). My proposed approach is based on matrix multiplication and, in the test below, turned out to be faster than any comparable algorithm I could find:

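% Each dimension contributes three columns to the helper matrices, chosen so
% that their products sum to a^2 - 2*a*b + b^2 = (a - b)^2 per dimension.
helpA = zeros(numA,3*d);
helpB = zeros(numB,3*d);
for idx = 1:d
    helpA(:,3*idx-2:3*idx) = [ones(numA,1), -2*A(:,idx), A(:,idx).^2 ];
    helpB(:,3*idx-2:3*idx) = [B(:,idx).^2 ,    B(:,idx), ones(numB,1)];
end
distMat = helpA * helpB';   % (numA,numB) matrix of squared distances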
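To see why the product yields squared distances, write a_{ik} and b_{jk} for the entries of A and B (notation introduced here for illustration only). Each dimension k contributes three matching columns, so entry (i,j) of the product is the expanded square summed over all dimensions:

```latex
\begin{aligned}
\left(\mathrm{helpA}\,\mathrm{helpB}^{\top}\right)_{ij}
  &= \sum_{k=1}^{d} \left( 1 \cdot b_{jk}^{2} + (-2\,a_{ik}) \cdot b_{jk} + a_{ik}^{2} \cdot 1 \right) \\
  &= \sum_{k=1}^{d} \left( a_{ik} - b_{jk} \right)^{2}
   = \lVert a_{i} - b_{j} \rVert^{2}.
\end{aligned}
```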

Please note:

For constant d one can replace the for-loop by a hardcoded implementation, e.g. for d == 2:

helpA = [ones(numA,1), -2*A(:,1), A(:,1).^2, ... % dimension 1
         ones(numA,1), -2*A(:,2), A(:,2).^2 ];   % dimension 2, etc.
% helpB is hardcoded analogously
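The column order does not matter as long as helpA and helpB use the same order, so the loop can probably be avoided for general d as well by grouping the columns by role instead of by dimension. The following is a sketch under that assumption, not part of the original answer; the sizes and the spot-check indices are arbitrary.

```matlab
% Loop-free variant: columns grouped as [ones | linear term | squared term].
d = 3; numA = 50; numB = 70;
A = rand(numA,d);
B = rand(numB,d);

helpA = [ones(numA,d), -2*A, A.^2];     % numA-by-3d
helpB = [B.^2, B, ones(numB,d)];        % numB-by-3d, same column grouping
distMat = helpA * helpB';               % numA-by-numB squared distances

% Spot check one entry against the definition.
i = 7; j = 11;
direct = sum((A(i,:) - B(j,:)).^2);
fprintf('abs. deviation at (%d,%d): %g\n', i, j, abs(distMat(i,j) - direct));
```

Whether this is faster than the loop or the hardcoded version would have to be measured for the sizes at hand.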

Evaluation:

%% create some points
d = 2; % dimension
numA = 20000;
numB = 20000;
A = rand(numA,d);
B = rand(numB,d);

%% pairwise distance matrix
% proposed method:
tic;
helpA = zeros(numA,3*d);
helpB = zeros(numB,3*d);
for idx = 1:d
    helpA(:,3*idx-2:3*idx) = [ones(numA,1), -2*A(:,idx), A(:,idx).^2 ];
    helpB(:,3*idx-2:3*idx) = [B(:,idx).^2 ,    B(:,idx), ones(numB,1)];
end
distMat = helpA * helpB';
toc;

% compare to pdist2:
tic;
pdist2(A,B).^2;
toc;

% compare to [1]:
tic;
bsxfun(@plus,dot(A,A,2),dot(B,B,2)')-2*(A*B');
toc;

% Another method: added 07/2014
% compare to ndgrid method (cf. Dan's comment)
tic;
[idxA,idxB] = ndgrid(1:numA,1:numB);
distMat = zeros(numA,numB);
distMat(:) = sum((A(idxA,:) - B(idxB,:)).^2,2);
toc;
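A single tic/toc pair measures one run only, so the numbers can be noisy. As a hedged alternative, MATLAB's timeit calls a function handle several times and returns the median of the measurements. The sketch below uses a much smaller problem than the evaluation above so that it runs quickly; the handles fMatMul and fBsxfun are names made up for this example, and the matrix-multiplication handle uses an equivalent loop-free column grouping because anonymous functions cannot contain a for-loop.

```matlab
% Smaller problem than in the evaluation above so the sketch runs quickly.
d = 2; numA = 2000; numB = 2000;
A = rand(numA,d);
B = rand(numB,d);

% Wrap each candidate in a zero-argument function handle for timeit.
% fMatMul: matrix-multiplication approach with columns grouped by role.
fMatMul = @() [ones(numA,d), -2*A, A.^2] * [B.^2, B, ones(numB,d)]';
% fBsxfun: the bsxfun-based expansion.
fBsxfun = @() bsxfun(@plus, dot(A,A,2), dot(B,B,2)') - 2*(A*B');

tMatMul = timeit(fMatMul);   % typical run time in seconds
tBsxfun = timeit(fBsxfun);

fprintf('matrix multiplication: %.4f s\n', tMatMul);
fprintf('bsxfun expansion:      %.4f s\n', tBsxfun);
```

The ranking may well differ between machines, MATLAB releases, and problem sizes.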

Result (in the order of the methods above):

Elapsed time is 1.796201 seconds.   (proposed method)
Elapsed time is 5.653246 seconds.   (pdist2)
Elapsed time is 3.551636 seconds.   (bsxfun, [1])
Elapsed time is 22.461185 seconds.  (ndgrid)

For a more detailed evaluation w.r.t. dimension and number of data points, follow the discussion in the comments. It turns out that different algorithms should be preferred in different settings. In non-time-critical situations, just use the pdist2 version.
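Two variations on the baselines may be worth trying, although neither was benchmarked in this post. In MATLAB R2016b and later, implicit expansion can replace the bsxfun call, and pdist2 accepts a 'squaredeuclidean' distance option, which avoids squaring the result afterwards. The expanded form can also return tiny negative values through round-off, so the sketch clamps at zero. The variable names distExp and distPd are illustrative only.

```matlab
d = 4; numA = 100; numB = 200;
A = rand(numA,d);
B = rand(numB,d);

% Implicit expansion (R2016b or later) instead of bsxfun.
distExp = sum(A.^2,2) + sum(B.^2,2)' - 2*(A*B');

% Round-off can make entries for (nearly) identical points slightly
% negative, so clamp at zero before e.g. taking square roots.
distExp = max(distExp, 0);

% pdist2 requires the Statistics and Machine Learning Toolbox; skip if absent.
if exist('pdist2','file')
    distPd = pdist2(A, B, 'squaredeuclidean');
    fprintf('max abs deviation vs. pdist2: %g\n', ...
            max(abs(distExp(:) - distPd(:))));
end
```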

Further development:

One can think of replacing the squared Euclidean distance by any other measure that is a sum of per-dimension terms, based on the same principle:

diffs = zeros(numA,numB,d);
for idx = 1:d
    % diffs(i,j,idx) = B(j,idx) - A(i,idx)
    diffs(:,:,idx) = [ones(numA,1), A(:,idx) ] * ...
                     [B(:,idx)' ; -ones(1,numB)];
end
distMat = sum(ANYFUNCTION(diffs),3);   % ANYFUNCTION: placeholder for an element-wise function

Nevertheless, this is quite time-consuming. For smaller d, it could be useful to replace the 3-dimensional matrix diffs by d 2-dimensional matrices. Especially for d = 1, it provides a method to compute the pairwise differences by a simple matrix multiplication:

pairDiffs = [ones(numA,1), A ] * [B'; -ones(1,numB)];   % pairDiffs(i,j) = B(j) - A(i)
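A tiny worked example makes the sign convention of this product visible. The input values are made up for illustration, and absDist is a hypothetical follow-up step showing one possible element-wise function.

```matlab
A = [1; 2; 3];   % numA = 3 points in one dimension
B = [10; 20];    % numB = 2 points in one dimension
numA = numel(A);
numB = numel(B);

pairDiffs = [ones(numA,1), A] * [B'; -ones(1,numB)];
disp(pairDiffs)
% pairDiffs(i,j) equals B(j) - A(i):
%      9    19
%      8    18
%      7    17

absDist = abs(pairDiffs);   % e.g. pairwise absolute distance
```

If A(i) - B(j) is wanted instead, negating the result is enough.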

Do you have any further ideas?
