Solving common binary search problems with std::lower_bound and std::upper_bound

anakin7

Article link: https://blog.csdn.net/anakin7/article/details/71055747

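Binary search problems usually come in one of the following forms:

1. Find a specific value in a sorted array.

2. Find the largest element that is less than a given value in a sorted array.

3. Find the largest element that is less than or equal to a given value in a sorted array.

4. Find the smallest element that is greater than a given value in a sorted array.

5. Find the smallest element that is greater than or equal to a given value in a sorted array.

"Sorted" here means ascending order.

Case 1 is the simplest and is not discussed here.

Cases 2 and 3 can be grouped together as finding a lower bound, i.e. the closest element from below.

Cases 4 and 5 can be grouped together as finding an upper bound, i.e. the closest element from above. Note that these terms are used informally and do not line up with the STL names: std::lower_bound directly solves case 5, not case 2.

The first reference at the end of this article gives hand-written implementations for these cases.

Besides writing your own, the STL already provides good implementations: std::lower_bound and std::upper_bound.

std::lower_bound has two declarations: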

template< class ForwardIt, class T >
ForwardIt lower_bound( ForwardIt first, ForwardIt last, const T& value );
template< class ForwardIt, class T, class Compare >
ForwardIt lower_bound( ForwardIt first, ForwardIt last, const T& value, Compare comp );

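cppreference describes it as returning an iterator to the first element in the range [first, last) that is not less than (i.e. greater than or equal to) the given value, or last if no such element exists.

A possible implementation of the first overload is: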

template<class ForwardIt, class T>
ForwardIt lower_bound(ForwardIt first, ForwardIt last, const T& value)
{
    ForwardIt it;
    typename std::iterator_traits<ForwardIt>::difference_type count, step;
    count = std::distance(first, last);
 
    while (count > 0) {
        it = first; 
        step = count / 2; 
        std::advance(it, step);
        if (*it < value) {
            first = ++it; 
            count -= step + 1; 
        }
        else
            count = step;
    }
    return first;
}

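Reading the source, first keeps moving right as long as the condition (*it < value) holds, and what is finally returned is the first element for which the condition does not hold. This solves case 5 directly. Case 2 is closely related, except that it needs the last element for which the condition holds. To solve case 2 with std::lower_bound, take the element immediately to the left of the returned iterator. If the returned iterator is already first (the leftmost position), no element is less than the value and the answer does not exist. If the returned iterator is last, every element is less than the value and the answer is the last element of the range.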
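
The two uses of std::lower_bound described above can be sketched as follows. This is a minimal sketch, not a definitive implementation: the helper names smallest_greater_equal and largest_less are made up for illustration, and returning end() for "not found" is a convention chosen here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using Iter = std::vector<int>::const_iterator;

// Case 5: smallest element >= target, or v.end() if there is none.
// (Hypothetical helper name; assumes v is sorted in ascending order.)
Iter smallest_greater_equal(const std::vector<int>& v, int target) {
    assert(std::is_sorted(v.begin(), v.end()));  // binary search precondition
    return std::lower_bound(v.begin(), v.end(), target);
}

// Case 2: largest element < target, or v.end() if there is none.
// (Hypothetical helper name; assumes v is sorted in ascending order.)
Iter largest_less(const std::vector<int>& v, int target) {
    assert(std::is_sorted(v.begin(), v.end()));
    Iter it = std::lower_bound(v.begin(), v.end(), target);
    if (it == v.begin()) {
        return v.end();  // nothing to the left: every element is >= target
    }
    return it - 1;       // also correct when it == v.end()
}
```

Stepping left needs a bidirectional (here random-access) iterator, which std::vector provides.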

The STL also provides a second overload that lets the caller supply the comparison function. A possible implementation:

template<class ForwardIt, class T, class Compare>
ForwardIt lower_bound(ForwardIt first, ForwardIt last, const T& value, Compare comp)
{
    ForwardIt it;
    typename std::iterator_traits<ForwardIt>::difference_type count, step;
    count = std::distance(first,last);
 
    while (count > 0) {
        it = first;
        step = count / 2;
        std::advance(it, step);
        if (comp(*it, value)) {
            first = ++it;
            count -= step + 1;
        }
        else
            count = step;
    }
    return first;
}

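This overload can be described as returning an iterator to the first element in [first, last) for which comp(element, value) is false. The range must be partitioned with respect to comp: every element in the left part makes comp true, and every element in the right part makes it false. With a custom comp and the step-left trick described above, we can solve case 3.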
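
As an illustration of the custom-comp approach to case 3, here is a minimal sketch. The helper name largest_less_equal is hypothetical, and end() is used as the "not found" result by assumption. The comparison uses <=, which is not a strict weak ordering; it only has to partition the range for std::lower_bound to work, and the more conventional route to the same result is a plain std::upper_bound followed by the same step left.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using Iter = std::vector<int>::const_iterator;

// Case 3: largest element <= target, or v.end() if there is none.
// (Hypothetical helper name; assumes v is sorted in ascending order.)
Iter largest_less_equal(const std::vector<int>& v, int target) {
    assert(std::is_sorted(v.begin(), v.end()));  // binary search precondition
    // comp(element, value) is true on the left part (elements <= target)
    // and false on the right part (elements > target).
    Iter it = std::lower_bound(
        v.begin(), v.end(), target,
        [](int element, int value) { return element <= value; });
    // `it` is the first element > target; the answer is its left neighbour.
    return it == v.begin() ? v.end() : it - 1;
}
```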

Likewise, std::upper_bound has two declarations:

template< class ForwardIt, class T >
ForwardIt upper_bound( ForwardIt first, ForwardIt last, const T& value );
template< class ForwardIt, class T, class Compare >
ForwardIt upper_bound( ForwardIt first, ForwardIt last, const T& value, Compare comp );

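cppreference describes it as returning an iterator to the first element in the range [first, last) that is greater than the given value, or last if no such element exists.
A possible implementation of the first overload is: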

template<class ForwardIt, class T>
ForwardIt upper_bound(ForwardIt first, ForwardIt last, const T& value)
{
    ForwardIt it;
    typename std::iterator_traits<ForwardIt>::difference_type count, step;
    count = std::distance(first,last);
 
    while (count > 0) {
        it = first; 
        step = count / 2; 
        std::advance(it, step);
        if (!(value < *it)) {
            first = ++it;
            count -= step + 1;
        } else count = step;
    }
    return first;
}

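The implementations of std::upper_bound and std::lower_bound differ in only one place, the comparison (!(value < *it)), which is equivalent to (value >= *it). In std::lower_bound terms, this code returns the first element for which (value >= *it) does not hold, i.e. the first element for which (*it > value) holds, which is exactly the semantics of std::upper_bound. It solves case 4 directly.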
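
To make the difference between the two functions concrete, here is a minimal sketch with two hypothetical helpers: smallest_greater answers case 4, and count_equal uses the gap between the two bounds to count the elements equal to the target. The names are assumptions for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Iter = std::vector<int>::const_iterator;

// Case 4: smallest element > target, or v.end() if there is none.
// (Hypothetical helper name; assumes v is sorted in ascending order.)
Iter smallest_greater(const std::vector<int>& v, int target) {
    assert(std::is_sorted(v.begin(), v.end()));  // binary search precondition
    return std::upper_bound(v.begin(), v.end(), target);
}

// Number of elements equal to target: everything between the two bounds.
std::size_t count_equal(const std::vector<int>& v, int target) {
    assert(std::is_sorted(v.begin(), v.end()));
    Iter lo = std::lower_bound(v.begin(), v.end(), target);  // first element >= target
    Iter hi = std::upper_bound(v.begin(), v.end(), target);  // first element >  target
    return static_cast<std::size_t>(hi - lo);
}
```

When the target is absent, both bounds land on the same position and the count is zero.
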
std::upper_bound also lets the caller supply the comparison function. A possible implementation:

template<class ForwardIt, class T, class Compare>
ForwardIt upper_bound(ForwardIt first, ForwardIt last, const T& value, Compare comp)
{
    ForwardIt it;
    typename std::iterator_traits<ForwardIt>::difference_type count, step;
    count = std::distance(first,last);
 
    while (count > 0) {
        it = first; 
        step = count / 2;
        std::advance(it, step);
        if (!comp(value, *it)) {
            first = ++it;
            count -= step + 1;
        } else count = step;
    }
    return first;
}

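This overload can be described as returning an iterator to the first element in [first, last) for which comp(value, element) is true. The range must be partitioned with respect to comp: every element in the left part makes comp false, and every element in the right part makes it true. With a custom comp, we can also use it to solve case 5.
Note that the comp passed to std::lower_bound and the comp passed to std::upper_bound are not interchangeable, because the argument order differs: std::lower_bound calls comp(element, value), while std::upper_bound calls comp(value, element).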
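
The argument order matters most when the element type and the searched value have different types. The sketch below assumes a hypothetical Record struct sorted by id and searches it with a plain int key; the type and the function names are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record type for illustration; records are kept sorted by id.
struct Record {
    int id;
    std::string name;
};

using RecIter = std::vector<Record>::const_iterator;

// Precondition check shared by both helpers below.
bool sorted_by_id(const std::vector<Record>& recs) {
    return std::is_sorted(
        recs.begin(), recs.end(),
        [](const Record& a, const Record& b) { return a.id < b.id; });
}

// First record with id >= key. lower_bound calls comp(element, value).
RecIter first_id_not_less(const std::vector<Record>& recs, int key) {
    assert(sorted_by_id(recs));
    return std::lower_bound(recs.begin(), recs.end(), key,
                            [](const Record& r, int k) { return r.id < k; });
}

// First record with id > key. upper_bound calls comp(value, element).
RecIter first_id_greater(const std::vector<Record>& recs, int key) {
    assert(sorted_by_id(recs));
    return std::upper_bound(recs.begin(), recs.end(), key,
                            [](int k, const Record& r) { return k < r.id; });
}
```

Swapping the two lambdas would not compile here, which is a convenient way to catch a mixed-up argument order.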

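Looking at the code above, std::lower_bound and std::upper_bound only ever reassign first and never touch last, which looks quite different from the binary search we usually write by hand (see the implementation in the first reference at the end of this article). The right boundary still moves, though: it is tracked implicitly as first + count, and setting count = step shrinks the range from the right. The functions are written this way because they only require forward iterators, which can be advanced but not decremented, so the implementation works with a length instead of a moving last. The reason cases 2 and 3 cannot be answered directly is the return-value convention: both functions return the first element for which the condition stops holding, so the last element for which it holds is the one immediately to its left, and the caller has to step left.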
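
std::lower_bound only requires forward iterators, which is one reason its implementation keeps a count instead of moving last. A minimal sketch of this, assuming a sorted std::forward_list (whose iterators support ++ but not --); the helper name list_lower_bound is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <forward_list>

using ListIter = std::forward_list<int>::const_iterator;

// Case 5 on a singly linked list: smallest element >= target, or end().
// (Hypothetical helper name; assumes the list is sorted in ascending order.)
ListIter list_lower_bound(const std::forward_list<int>& sorted, int target) {
    assert(std::is_sorted(sorted.begin(), sorted.end()));  // precondition
    // Works even though forward_list iterators cannot be decremented:
    // the algorithm only advances iterators and tracks a remaining count.
    return std::lower_bound(sorted.begin(), sorted.end(), target);
}
```

On such a container the number of comparisons is still logarithmic, but advancing the iterators takes linear time, and the step-left trick for cases 2 and 3 is not available.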

I wrote a small test program that uses std::lower_bound and std::upper_bound to solve the cases above and checks each result against a linear scan.

It uses gtest and C++11.

#include <gtest/gtest.h>
#include <algorithm>
#include <cstdlib>   // rand
#include <iostream>  // std::cout
#include <vector>

typedef std::vector<int>::const_iterator iter;
const int NUMBER_SIZE = 100000;
const int TEST_COUNT = 100;

int GenARandomNumber() {
  return ::rand() % 1000;
}

std::vector<int> GenerateRandomSortedVector(int n) {
  std::vector<int> numbers(n);
  for (int i = 0; i < n; ++i) {
    numbers[i] = GenARandomNumber();
  }
  std::sort(numbers.begin(), numbers.end());
  return numbers;
}

void ShowNumbers(const std::vector<int> & numbers) {
  for (auto num: numbers) {
    std::cout << num << " ";
  }
  std::cout << std::endl;
}

template<class Compare>
iter FindFirstTrue(const std::vector<int> & numbers, int  target, Compare comp) {
  auto it = numbers.begin();
  for (; it != numbers.end(); ++it) {
    if (comp(*it, target)) {
      break;
    }
  }
  return it;
}

template<class Compare>
iter FindFirstFalse(const std::vector<int> & numbers, int target, Compare comp) {
  auto it = numbers.begin();
  for (; it != numbers.end(); ++it) {
    if (!comp(*it, target)) {
      break;
    }
  }
  return it;
}

// Case 2 reference (linear scan): last element < target, or end() if none.
iter FindLessMax(const std::vector<int> & numbers, int target) {
  auto result = numbers.end();
  for (auto it = numbers.begin(); it != numbers.end() && *it < target; ++it) {
    result = it;
  }
  return result;
}

// Case 3 reference (linear scan): last element <= target, or end() if none.
iter FindLessEqualMax(const std::vector<int> & numbers, int target) {
  auto result = numbers.end();
  for (auto it = numbers.begin(); it != numbers.end() && *it <= target; ++it) {
    result = it;
  }
  return result;
}

// Case 4 reference (linear scan): first element > target, or end() if none.
iter FindGreaterMin(const std::vector<int> & numbers, int target) {
  auto it = numbers.begin();
  for (; it != numbers.end(); ++it) {
    if (*it > target) {
      break;
    }
  }
  return it;
}

// Case 5 reference (linear scan): first element >= target, or end() if none.
iter FindGreaterEqualMin(const std::vector<int> & numbers, int target) {
  auto it = numbers.begin();
  for (; it != numbers.end(); ++it) {
    if (*it >= target) {
      break;
    }
  }
  return it;
}

TEST(BsearchTest, findLessMax) {
  std::vector<int> numbers = GenerateRandomSortedVector(NUMBER_SIZE);
  for (size_t i = 0; i < TEST_COUNT; ++i) {
    int target = GenARandomNumber();
    auto it_line = FindLessMax(numbers, target);
    auto it_bin  = std::lower_bound(numbers.begin(), numbers.end(), target);
    // Step left to the last element < target; begin() means there is none.
    it_bin = (it_bin == numbers.begin()) ? numbers.end() : it_bin - 1;
    ASSERT_TRUE(it_line == it_bin);
  }
}

TEST(BsearchTest, findLessEqualMax) {
  std::vector<int> numbers = GenerateRandomSortedVector(NUMBER_SIZE);
  for (size_t i = 0; i < TEST_COUNT; ++i) {
    int target = GenARandomNumber();
    auto it_line = FindLessEqualMax(numbers, target);
    // With <= as comp, lower_bound returns the first element > target.
    auto it_bin  = std::lower_bound(numbers.begin(), numbers.end(), target, [](int val, int tar) {
      return val <= tar;
    });
    // Step left to the last element <= target; begin() means there is none.
    it_bin = (it_bin == numbers.begin()) ? numbers.end() : it_bin - 1;
    ASSERT_TRUE(it_line == it_bin);
  }
}

TEST(BsearchTest, findGreaterMin) {
  std::vector<int> numbers = GenerateRandomSortedVector(NUMBER_SIZE);
  for (size_t i = 0; i < TEST_COUNT; ++i) {
    int target = GenARandomNumber();
    auto it_line = FindGreaterMin(numbers, target);
    auto it_bin  = std::upper_bound(numbers.begin(), numbers.end(), target);
    ASSERT_TRUE(it_line == it_bin); 
  }
}

TEST(BsearchTest, findGreaterEqualMin) {
  std::vector<int> numbers = GenerateRandomSortedVector(NUMBER_SIZE);
  for (size_t i = 0; i < TEST_COUNT; ++i) {
    int target = GenARandomNumber();
    auto it_line = FindGreaterEqualMin(numbers, target);
    auto it_bin  = std::upper_bound(numbers.begin(), numbers.end(), target, [](int tar, int val) {
      return tar <= val;
    });
    ASSERT_TRUE(it_line == it_bin); 
  }
}

int main(int argc, char** argv) {
  testing::InitGoogleTest(&argc, argv);  
  return RUN_ALL_TESTS();
}

If you have questions about this article, or think something in it is incorrect, please leave a comment.

References:

https://www.cnblogs.com/ider/archive/2012/04/01/binary_search.html

http://en.cppreference.com/w/cpp/algorithm/lower_bound

http://en.cppreference.com/w/cpp/algorithm/upper_bound