暴力搜索所有点对的距离,时间复杂度为O(n^2),在数据规模较为庞大的情况下,并不能很好的运行。
以下将利用分治的方法,将时间复杂度降为O(n (lg n)^2 )。
原理
1.将主问题按坐标划分成两个子问题——以直线l:x=k为中心,使得两边的点数相同。
2.当子问题求解完毕后,设ε为已经求解出的最小距离,然后以 l1:x=k-ε 和 l2:x=k+ε 为界,选取该区域的所有点,并按y轴,从小到大排序。
3.按顺序选择一个点,并分别计算改点往后7个点的距离,选择最小的距离返回。
伪代码
MERGE( A, left, right )
if left == right
return ∞
else if left + 1 == right
return GET_DISTANCE( A[left], A[right] )
mid = (left + right)/2
dis1 = MERGE( A, left, mid )
dis2 = MERGE( A, mid+1, right )
dis = MIN( dis1, dis2 )
let TMP is an array
for i=left to right
if | A[i].x-A[mid].x | < dis
TMP.insert(A[i])
for i=0 to TMP.size
for j=1 to 7
if i+j >= TMP.size
break
if dis < GET_DISTANCE( TMP[i], TMP[i+j] )
dis = GET_DISTANCE( TMP[i], TMP[i+j]
return dis
GET_COLSEST_DISTANCE( A )
A.sort() with A[i].x < A[j] and i < j
return MERGE( A, 0, A.size-1 )
代码实现
#include <cmath>
#include <algorithm>
const double DOUBLE_INFINITE = 1e50;
struct Point
{
int x,y;
};
bool cmpX( Point a, Point b )
{
if( a.x == b.x )
return a.y < b.y;
return a.x < b.x;
}
bool cmpY( Point a, Point b )
{
if( a.y == b.y )
return a.x < b.x;
return a.y < b.y;
}
double getDistance( Point a, Point b )
{
return sqrt( pow( a.x-b.x, 2.0 ) + pow( a.y-b.y, 2.0 ) );
}
void getClosestDistance( Point* points, Point* tmp, int left, int right, double &minDistance )
{
if( right-left <=4 ) {
double distance = DOUBLE_INFINITE;
for( int i=left;i<=right;i++ ) {
for( int j=i+1; j<=right; j++ ) {
distance = getDistance( points[i], points[j] );
minDistance = minDistance<distance? minDistance : distance;
}
}
return ;
}
int mid = (left + right)>>1;
getClosestDistance( points, tmp, left, mid, minDistance );
getClosestDistance( points, tmp, mid+1, right, minDistance );
int length = 0;
for(int i=left;i<=right;i++){
if( fabs(points[i].x - points[mid].x ) < minDistance )
tmp[length++] = points[i];
}
std::sort( tmp, tmp+length,cmpY );
double distance = DOUBLE_INFINITE;
for(int i=0;i<length;i++) {
for( int j=1; j<8 && i+j<length; j++) {
distance = getDistance( tmp[i], tmp[i+j] );
minDistance = minDistance<distance? minDistance : distance;
}
}
}
double getClosestDistanceOfPoints( Point* points, int sizeOfPoints, Point* tmp )
{
std::sort( points, points+sizeOfPoints, cmpX );
double minDistance = DOUBLE_INFINITE;
getClosestDistance( points, tmp, 0, sizeOfPoints-1, minDistance );
return minDistance;
}
写在最后
虽然最近点对有时间复杂度为 O(n lg n )的算法,但是实现其算法时,常数过于巨大,使得运行效率比O(n (lg n)^2 )更低。