Algorithmic Toolbox 算法基础学习笔记(Closest Pair of Points using Divide and Conquer alogrithm)

Algorithmic Toolbox 算法基础学习笔记

【第四周习题 节选】Closest Pair of Points using Divide and Conquer alogrithm


1. Problem Statement

1.1. Task

在这个问题中,你的目标是在给定的𝑛个点中找到距离最接近的一对点。
这是计算几何学中的一个基本原理,在图形、计算机视觉、交通控制系统等领域都有应用。例如,图形学、计算机视觉、交通管制系统等。
Given 𝑛 points on a plane, find the smallest distance between a pair of two (different) points. Recall
that the distance between points ( 𝑥 1 , 𝑦 1 ) (𝑥_1, 𝑦_1) (x1,y1) and ( 𝑥 2 , 𝑦 2 ) (𝑥_2, 𝑦_2) (x2,y2) is equal to ( 𝑥 1 − 𝑥 2 ) 2 + ( 𝑦 1 − 𝑦 2 ) 2 . \sqrt{(𝑥_1 − 𝑥_2) ^2 + (𝑦_1 − 𝑦_2)^2} . (x1x2)2+(y1y2)2 .

1.2. Input Format

The first line contains the number 𝑛 of points. Each of the following 𝑛 lines defines a point ( 𝑥 𝑖 , 𝑦 𝑖 ) (𝑥_𝑖 , 𝑦_𝑖) (xi,yi).

1.3. Constraints

2 ≤ 𝑛 ≤ 105 2 ≤ 𝑛 ≤ 105 2n105; − 109 ≤ 𝑥 𝑖 , 𝑦 𝑖 ≤ 109 −109 ≤ 𝑥_𝑖, 𝑦_𝑖 ≤ 109 109xi,yi109 are integers.

1.4. Output Format

Output the minimum distance. The absolute value of the difference between the answer
of your program and the optimal value should be at most 1 0 − 3 10^{−3} 103. To ensure this, output your answer
with at least four digits after the decimal point (otherwise your answer, while being computed correctly,
can turn out to be wrong because of rounding issues).


2. Brute Force Approach

Checking every possible pair of points; takes quadratic time.
距离最近坐标对问题的蛮力解法(即检查每对可能的坐标)时间复杂度为 O ( n 2 ) O(n^2) O(n2)。我们需要通过分治算法来更快地解决该问题。

#include <algorithm>
#include <iostream>
#include <sstream>
#include <iomanip>
#include <vector>
#include <string>
#include <cmath>

using std::vector;
using std::string;
using std::min;
//define a structure 'coord' to store each pair of point
struct coord {
	int x;
	int y;
}coord_unit;

//the brute force approach
double bruteForce(vector<coord> distance20) {
	double min_dis = 10000000000000.0;
	for (int i = 0; i < distance20.size(); i++) {
		for (int j = i+1; j < distance20.size(); j++) {
			if (find_dist(distance20[i], distance20[j]) < min_dis) {
				min_dis = find_dist(distance20[i], distance20[j]);
			}
		}
		return min_dis;
	}
}
//the main function
int main() {
  size_t n;
  std::cin >> n;
  vector<int> x(n);
  vector<int> y(n);
  for (size_t i = 0; i < n; i++) {
    std::cin >> x[i] >> y[i];
  }
//define a vector distance20 to store the coord pairs
  vector<coord> distance20;
  for (int i = 0; i < x.size(); i++) {
	  coord_unit.x = x[i];
	  coord_unit.y = y[i];
	  distance20.push_back(coord_unit);
  }
  std::cout << std::fixed;
  std::cout << std::setprecision(9) << bruteForce(distance20) << "\n";
}

3. Divide and Conquer Approach

3.1. Closest Pair in the Plane

如下图所示,给定一个平面上的点集 S S S,我们用一条垂直线 l l l将其划分为两个子集 S 1 S_1 S1 S 2 S_2 S2,这样 S 1 S_1 S1中的点就在 l l l的左边, S 2 S_2 S2中的点就在 l l l的右边。
现在我们递归地解决这两个集合的问题,获得最小距离d1(对于S1)和d2(对于S2)。我们让d是其中的最小值。

现在,如果整个集合中距离最短的两点分别在两个子集中,那么这两个点必须在 l l l d d d之内。这个区域被表示为 l l l两侧的两个条带 P 1 P_1 P1 P 2 P_2 P2(如图所示)。
在这里插入图片描述图1:二维平面中的分而治之算法示意图

我们希望确定 P 1 P_1 P1中的某个点与 P 2 P_2 P2中的另一个点的距离是否小于 d d d。然而,在平面上,所有的点都可能在条带中!这是个灾难性的现象,因为我们将不得不比较 n 2 n^2 n2对点来合并这个集合,因此我们的分而治之算法在效率不会有任何提升!
值得庆幸的是,我们观察到,对于一个条带中的任何特定点 p p p,只有在另一个条带中满足以下约束的点才需要被检查。

  1. 在另一条带的方向上,距点 p p p d d d范围内的那些点
  2. 在正和负的y轴方向上,距点 p p p d d d范围内的那些点。

这是因为超出这个边界框外的点与点 p p p的距离将超过 d d d个单位(见图3.3)。碰巧的是,因为这个盒子里的每一个点都至少相距 d d d,所以盒子里最多可以有七个点(证明过程见附录)。这简直是个好消息,因为现在我们不需要检查所有 n 2 n^2 n2个点。我们所要做的就是按y坐标对条带中的点进行排序,并按顺序扫描这些点,每个点最多与7个相邻的点进行检查。这意味着最多需要7*n次比较来检查所有候选对。
然而,由于我们将条带中的点按其y坐标排序,因此合并两个子集的过程不是线性的,而是实际上需要 O ( n l o g n ) O(nlogn) O(nlogn)时间。

在这里插入图片描述图2:找条带中的某一个点与其他点的最小距离的时间是常数

3.2. 算法与时间复杂度分析

  1. 通过直线 l l l将该集合分成两个大小相等的部分,并递归地计算每个部分的最小距离。
  2. d d d为两个最小距离中的最小值。
  3. 排除那些与 l l l相距超过 d d d的点。
  4. 根据Y坐标对剩余的点进行排序
  5. 按照y的顺序扫描剩余的点,计算每个点与它的七个邻居的距离。
  6. 如果这些距离中的任何一个小于d,则更新d。

步骤2-6定义了合并过程,由于这是一个分而治之的算法,所以必须重复数次。

  • 第2步需要O(1)时间
  • 步骤3需要O(n)时间
  • 第4步是一个排序,需要O(nlogn)时间
  • 第5步需要O(n)时间(正如我们在上一节中看到的)。
  • 第6步需要O(1)时间

因此,子解决方案的合并被步骤4的排序所支配,因此需要O(nlogn)时间。
在分而治之的算法中,每一级递归都必须重复一次。

因此整个ClosestPair算法需要 O ( l o g n ∗ n l o g n ) = O ( n l o g 2 n ) O(logn*nlogn)=O(nlog2n) O(lognnlogn)=O(nlog2n)的时间。
代码如下(示例):

#include <algorithm>
#include <iostream>
#include <sstream>
#include <iomanip>
#include <vector>
#include <string>
#include <cmath>

using std::vector;
using std::string;
using std::pair;
using std::min;

struct coord {
	int x;
	int y;
}coord_unit;

//coord coord_unit;

bool compare_distance(coord coord1, coord coord2) {
	double d1 =std::sqrt( std::pow(coord1.x, 2) + pow(coord1.y, 2));
	double d2 = std::sqrt(std::pow(coord2.x, 2) + pow(coord2.y, 2));
	return (d1 < d2);
}

bool compare_X(coord coord1, coord coord2) {
	return coord1.x < coord2.x;
}


bool compare_Y(coord coord1, coord coord2) {
	return coord1.y < coord2.y;
}


double find_dist(coord coord1, coord coord2) {
	return (std::sqrt(std::pow((coord1.x - coord2.x), 2) + std::pow((coord1.y - coord2.y), 2)));
}


double bruteForce(vector<coord> distance20) {
	double min_dis = 10000000000000.0;
	for (int i = 0; i < distance20.size(); i++) {
		for (int j = i+1; j < distance20.size(); j++) {
			if (find_dist(distance20[i], distance20[j]) < min_dis) {
				min_dis = find_dist(distance20[i], distance20[j]);
			}
		}
		return min_dis;
	}
}

// A utility function to find the
// distance between the closest points of
// strip of given size. All points in
// strip[] are sorted according to
// y coordinate. They all have an upper
// bound on minimum distance as d.
// Note that this method seems to be
// a O(n^2) method, but it's a O(n)
// method as the inner loop runs at most 6 times
double min_strip(vector<coord> strip, double d) {
	double min = d; // Initialize the minimum distance as d

	std::sort(strip.begin(), strip.end(), compare_Y);
	// Pick all points one by one and try the next points till the difference
	// between y coordinates is smaller than d.
	// This is a proven fact that this loop runs at most 6 times
	for (int i = 0; i < strip.size(); ++i)
		for (int j = i + 1; j < strip.size() && (strip[j].y - strip[i].y) < min; ++j)
			if (find_dist(strip[i], strip[j]) < min)
				min = find_dist(strip[i], strip[j]);

	return min;
}


double minimal_distance(vector<coord> distance20) {

	//base case
	if (distance20.size() < 3) {
		return  bruteForce(distance20);
	}
	//divide
	int mid = distance20.size() / 2; // 7/2 = 3
	vector<coord>::const_iterator first = distance20.begin();
	vector<coord>::const_iterator middle = distance20.begin() + mid;
	vector<coord>::const_iterator last = distance20.end();
	vector<coord> distance_l(first, middle);
	vector<coord> distance_r(middle, last);
	double d1 = minimal_distance(distance_l); // 1, 2, 3
	double dr = minimal_distance(distance_r); // 4, 5, 6, 7

	//conquer
	double d = min(d1, dr);
	vector<coord> mid_strip;
	for (int i = 0; i < distance20.size(); i++) {
		if (abs(distance20[i].x - distance20[mid].x) < d)
			mid_strip.push_back(distance20[i]);
	}
	return min(d, min_strip(mid_strip, d));
}

int main() {
  size_t n;
  std::cin >> n;
  vector<int> x(n);
  vector<int> y(n);
  for (size_t i = 0; i < n; i++) {
    std::cin >> x[i] >> y[i];
  }

  vector<coord> distance20;
  for (int i = 0; i < x.size(); i++) {
	  coord_unit.x = x[i];
	  coord_unit.y = y[i];
	  distance20.push_back(coord_unit);
  }
  std::sort(distance20.begin(), distance20.end(), compare_X);
  std::cout << std::fixed;
  std::cout << std::setprecision(9) << minimal_distance(distance20) << "\n";
}


4. 算法改进

我们可以通过减少在步骤 4 中实现y坐标排序所需的时间来稍微改进该算法。这是通过要求根据 y 坐标对所有点进行预排序,令排序后的数组为 P y [ ] Py[] Py[],当我们进行递归调用时,我们还需要根据垂直线划分 P y [ ] Py[] Py[] 的点。
在步骤 1 中计算的递归解决方案返回按y坐标排序顺序的点来完成的。这将产生两个排序的点列表,只需在步骤 4 中合并(线性时间操作)即可生成完整的排序列表。因此,修订后的算法涉及进行以下更改:

  • 步骤0:首先对集合按y坐标预排序,使之成为一个单独的数组 P y [ ] Py[] Py[]
  • 步骤1:通过直线l将该集合分成两个大小相等的部分,并递归计算每个部分的距离,按y坐标预排序返回每个集合中的点。
  • 步骤 4:在 O ( n ) O( n ) O(n)时间内将两个排序列表合并为一个排序列表。

因此,合并过程现在由线性时间步长主导,从而产生 O ( n l o g n ) O( n log n ) O(nlogn) 算法,用于找到平面中一组点的最小距离点对。

4.2. 代码(from geeksforgeek)

// A divide and conquer program in C++ to find the smallest distance from a
// given set of points.
  
#include <iostream>
#include <float.h>
#include <stdlib.h>
#include <math.h>
using namespace std;
  
// A structure to represent a Point in 2D plane
struct Point
{
    int x, y;
};
  
  
/* Following two functions are needed for library function qsort().
   Refer: http://www.cplusplus.com/reference/clibrary/cstdlib/qsort/ */
  
// Needed to sort array of points according to X coordinate
int compareX(const void* a, const void* b)
{
    Point *p1 = (Point *)a,  *p2 = (Point *)b;
    return (p1->x - p2->x);
}
// Needed to sort array of points according to Y coordinate
int compareY(const void* a, const void* b)
{
    Point *p1 = (Point *)a,   *p2 = (Point *)b;
    return (p1->y - p2->y);
}
  
// A utility function to find the distance between two points
float dist(Point p1, Point p2)
{
    return sqrt( (p1.x - p2.x)*(p1.x - p2.x) +
                 (p1.y - p2.y)*(p1.y - p2.y)
               );
}
  
// A Brute Force method to return the smallest distance between two points
// in P[] of size n
float bruteForce(Point P[], int n)
{
    float min = FLT_MAX;
    for (int i = 0; i < n; ++i)
        for (int j = i+1; j < n; ++j)
            if (dist(P[i], P[j]) < min)
                min = dist(P[i], P[j]);
    return min;
}
  
// A utility function to find a minimum of two float values
float min(float x, float y)
{
    return (x < y)? x : y;
}
  
  
// A utility function to find the distance between the closest points of
// strip of a given size. All points in strip[] are sorted according to
// y coordinate. They all have an upper bound on minimum distance as d.
// Note that this method seems to be a O(n^2) method, but it's a O(n)
// method as the inner loop runs at most 6 times
float stripClosest(Point strip[], int size, float d)
{
    float min = d;  // Initialize the minimum distance as d
  
    // Pick all points one by one and try the next points till the difference
    // between y coordinates is smaller than d.
    // This is a proven fact that this loop runs at most 6 times
    for (int i = 0; i < size; ++i)
        for (int j = i+1; j < size && (strip[j].y - strip[i].y) < min; ++j)
            if (dist(strip[i],strip[j]) < min)
                min = dist(strip[i], strip[j]);
  
    return min;
}
  
// A recursive function to find the smallest distance. The array Px contains
// all points sorted according to x coordinates and Py contains all points
// sorted according to y coordinates
float closestUtil(Point Px[], Point Py[], int n)
{
    // If there are 2 or 3 points, then use brute force
    if (n <= 3)
        return bruteForce(Px, n);
  
    // Find the middle point
    int mid = n/2;
    Point midPoint = Px[mid];
  
  
    // Divide points in y sorted array around the vertical line.
    // Assumption: All x coordinates are distinct.
    Point Pyl[mid];   // y sorted points on left of vertical line
    Point Pyr[n-mid];  // y sorted points on right of vertical line
    int li = 0, ri = 0;  // indexes of left and right subarrays
    for (int i = 0; i < n; i++)
    {
      if (Py[i].x <= midPoint.x && li<mid)
         Pyl[li++] = Py[i];
      else
         Pyr[ri++] = Py[i];
    }
  
    // Consider the vertical line passing through the middle point
    // calculate the smallest distance dl on left of middle point and
    // dr on right side
    float dl = closestUtil(Px, Pyl, mid);
    float dr = closestUtil(Px + mid, Pyr, n-mid);
  
    // Find the smaller of two distances
    float d = min(dl, dr);
  
    // Build an array strip[] that contains points close (closer than d)
    // to the line passing through the middle point
    Point strip[n];
    int j = 0;
    for (int i = 0; i < n; i++)
        if (abs(Py[i].x - midPoint.x) < d)
            strip[j] = Py[i], j++;
  
    // Find the closest points in strip.  Return the minimum of d and closest
    // distance is strip[]
    return stripClosest(strip, j, d);
}
  
// The main function that finds the smallest distance
// This method mainly uses closestUtil()
float closest(Point P[], int n)
{
    Point Px[n];
    Point Py[n];
    for (int i = 0; i < n; i++)
    {
        Px[i] = P[i];
        Py[i] = P[i];
    }
  
    qsort(Px, n, sizeof(Point), compareX);
    qsort(Py, n, sizeof(Point), compareY);
  
    // Use recursive function closestUtil() to find the smallest distance
    return closestUtil(Px, Py, n);
}
  
// Driver program to test above functions
int main()
{
    Point P[] = {{2, 3}, {12, 30}, {40, 50}, {5, 1}, {12, 10}, {3, 4}};
    int n = sizeof(P) / sizeof(P[0]);
    cout << "The smallest distance is " << closest(P, n);
    return 0;
}

5. 附录

证明过程:
https://www.cs.mcgill.ca/~cs251/ClosestPair/proofbox.html

参考链接:
3 Closest Pair: A Divide-and-Conquer Approach
geekforgeeks

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值