This article is very well written, so I am reposting it here.
Original link: https://blog.csdn.net/sinat_30973431/article/details/103476360
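I. Overview
1. Definition
Binary search is a widely used search algorithm for finding an element in a sorted array. The array is usually in ascending order, and the rest of this article only covers the ascending case.
2. Main idea
Binary search exploits two properties: the elements are ordered, and an array supports random access. Each step compares target with the middle element of the search range and, depending on the result, cuts the range in half (a divide-and-conquer strategy, arguably a form of pruning). This repeats until the element is found or the search range shrinks to empty.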
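3. Workflow
Compare the middle element with target.
If they match, return true.
If they do not match:
    if the middle element is greater than target, continue searching the left half;
    if the middle element is less than target, continue searching the right half.
Keep searching until target is found or the search range becomes empty.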
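The steps above can be sketched in Java as follows. This is a minimal illustration that assumes an ascending int array; the class name WorkflowSketch and the method name contains are hypothetical and do not come from the original article.

```java
final class WorkflowSketch {
    // Returns true if target occurs in the ascending array nums.
    static boolean contains(int[] nums, int target) {
        int left = 0;
        int right = nums.length - 1;          // closed search range [left, right]
        while (left <= right) {               // the range is still non-empty
            int mid = left + (right - left) / 2;
            if (nums[mid] == target) {
                return true;                  // match
            } else if (nums[mid] > target) {
                right = mid - 1;              // continue in the left half
            } else {
                left = mid + 1;               // continue in the right half
            }
        }
        return false;                         // the range shrank to empty
    }

    public static void main(String[] args) {
        int[] nums = {1, 3, 5, 7, 9, 11};
        System.out.println(contains(nums, 7)); // true
        System.out.println(contains(nums, 4)); // false
    }
}
```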
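4. Pseudocode
The pseudocode of binary search behaves as follows. When the array has no duplicate elements, it returns the index of target or a flag indicating that the search failed. When the array has duplicates, it returns the index of one of the matching elements; which one depends on how the data is distributed in the array.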
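A minimal Java sketch of that behavior is shown here, assuming an ascending int array and -1 as the failure flag. The class name PlainSearchSketch and the method name search are hypothetical. The two calls with duplicates illustrate that the returned index depends on where the copies sit in the array.

```java
final class PlainSearchSketch {
    // Returns the index of some element equal to target, or -1 if none exists.
    static int search(int[] nums, int target) {
        int low = 0, high = nums.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            if (nums[mid] == target) {
                return mid;                   // any matching index is accepted
            }
            if (nums[mid] < target) {
                low = mid + 1;
            } else {
                high = mid - 1;
            }
        }
        return -1;                            // failure flag
    }

    public static void main(String[] args) {
        System.out.println(search(new int[]{1, 2, 2, 2, 3}, 2)); // 2 (the middle copy)
        System.out.println(search(new int[]{2, 2, 3, 4, 5}, 2)); // 0 (the first copy)
        System.out.println(search(new int[]{1, 2, 3}, 4));       // -1
    }
}
```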
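5. Performance analysis
Binary search is an efficient search algorithm. For input size n, each iteration cuts the data to half of its previous size, so the size of the search range goes n, n/2, n/4, n/8, ..., n/2^k, ..., shrinking as a geometric sequence. Iteration stops when the size reaches 1, so setting n/2^k = 1 gives the total number of iterations k = log2(n). The running time is proportional to the number of loop iterations, so the time complexity is O(log n).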
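Space complexity: O(1) for an iterative implementation; O(log n) for a recursive implementation, because of the call stack.
Number of comparisons: in the best case the first middle element is target, so one iteration suffices; in the worst case about log2(n) + 1 iterations are needed.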
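The iteration count can be checked with a small experiment. The sketch below is only an illustration: it assumes an implicit sorted array whose value at index i is i, and the class name IterationCountSketch and method name iterations are hypothetical.

```java
final class IterationCountSketch {
    // Counts the loop iterations of a closed-interval binary search over the
    // implicit sorted array 0, 1, ..., n-1 (the value at index i is i itself).
    static int iterations(int n, int target) {
        int left = 0, right = n - 1, count = 0;
        while (left <= right) {
            count++;
            int mid = left + (right - left) / 2;
            if (mid == target) {
                break;                        // found
            }
            if (mid < target) {
                left = mid + 1;
            } else {
                right = mid - 1;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        // Searching for a value larger than every element is a worst case.
        System.out.println(iterations(n, n));                      // 20
        // floor(log2(n)) + 1 equals the bit length of n.
        System.out.println(32 - Integer.numberOfLeadingZeros(n));  // 20
        // Best case: the first middle element is the target.
        System.out.println(iterations(n, (n - 1) / 2));            // 1
    }
}
```

For one million elements the worst case is 20 iterations, against up to one million for a sequential scan.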
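II. Use cases and key points
1. When can binary search be used?
It is generally used to look up elements in an array, and the array must already be sorted (usually in ascending order) before the search.
Two keywords matter here. One is "array": an array is a sequential storage structure with random access. The other is "sorted": in an unsorted array, the only way to find an element is a sequential scan.
In other words, the use cases of binary search are limited.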
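2. When can binary search not be used?
(1) Binary search cannot be used when the data is not stored in a sequential-list structure
Binary search relies on a sequential-list structure (an array). A linked list is not suitable, because binary search needs random access by index, and accessing an element by index in a linked list takes O(n) time. On a linked list, the time complexity of binary search therefore rises above O(log n), to O(n) or worse.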
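To make the cost concrete, the sketch below runs the same closed-interval search over a hand-written singly linked list and counts node hops. It is an illustration under stated assumptions, not a recommended technique: the class LinkedListCostSketch, its Node type, and the steps counter are all hypothetical, and get walks from the head on every access.

```java
final class LinkedListCostSketch {
    static final class Node {
        final int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    static long steps = 0;                    // node hops performed by get()

    // Index access on a linked list: walks from the head, O(index) hops.
    static int get(Node head, int index) {
        Node cur = head;
        for (int i = 0; i < index; i++) {
            cur = cur.next;
            steps++;
        }
        return cur.value;
    }

    static int search(Node head, int size, int target) {
        int left = 0, right = size - 1;
        while (left <= right) {
            int mid = left + (right - left) / 2;
            int v = get(head, mid);           // the expensive part on a linked list
            if (v == target) return mid;
            if (v < target) left = mid + 1; else right = mid - 1;
        }
        return -1;
    }

    // Builds the list 0 -> 1 -> ... -> size-1.
    static Node build(int size) {
        Node head = new Node(0), tail = head;
        for (int i = 1; i < size; i++) {
            tail.next = new Node(i);
            tail = tail.next;
        }
        return head;
    }

    public static void main(String[] args) {
        int size = 1024;
        Node head = build(size);
        int index = search(head, size, size - 1);
        System.out.println("index=" + index + ", node hops=" + steps);
    }
}
```

Only about ten elements are compared, but reaching them costs more node hops than a single pass over the list would.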
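(2) Binary search cannot be used when the data is not static (frequent deletions and insertions)
The array must be sorted before binary search. If binary search is applied to an array that is frequently inserted into and deleted from, then either every insertion and deletion has to keep the data sorted, or the array has to be sorted again before every search. Either way, the cost of maintaining sorted order is very high.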
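The following sketch shows where the cost comes from when order is maintained on every insertion. It uses the JDK's java.util.Arrays.binarySearch and System.arraycopy, which are not part of the original article; the class name SortedInsertSketch and the method name insertSorted are hypothetical.

```java
import java.util.Arrays;

final class SortedInsertSketch {
    // Returns a new sorted array containing every element of sorted plus value.
    static int[] insertSorted(int[] sorted, int value) {
        int pos = Arrays.binarySearch(sorted, value);
        if (pos < 0) {
            // When the key is absent, binarySearch returns -(insertion point) - 1.
            pos = -pos - 1;
        }
        int[] result = new int[sorted.length + 1];
        System.arraycopy(sorted, 0, result, 0, pos);
        result[pos] = value;
        // Every element from pos onward moves one slot: O(n) copying per insertion,
        // even though finding pos took only O(log n).
        System.arraycopy(sorted, pos, result, pos + 1, sorted.length - pos);
        return result;
    }

    public static void main(String[] args) {
        int[] data = {1, 4, 9};
        data = insertSorted(data, 5);
        System.out.println(Arrays.toString(data)); // [1, 4, 5, 9]
    }
}
```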
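(3) Binary search is a poor fit when the data set is too small
Binary search only has an advantage when the amount of data is fairly large. If the data set is very small, a sequential scan is enough.
There is one exception: if comparing two elements is very expensive, binary search is the better choice regardless of data size. The running time of binary search depends not only on the order of growth log n (the number of loop iterations) but also on the time each comparison takes. Every iteration performs one or two comparisons. Complexity analysis usually ignores such constant factors and states the complexity of binary search as O(log n), but they cannot be ignored when analyzing performance in a real scenario.
For example, if the array stores strings longer than 300 characters, comparing two strings is fairly expensive. Reducing the number of comparisons then improves performance considerably, so we want as few comparisons as possible, and binary search has the advantage over a sequential scan.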
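The difference in comparison counts can be observed with a counting comparator. The sketch below is an illustration only: the class ComparisonCountSketch, the COUNTING comparator, and the generated test data are hypothetical, and the binary search is the JDK's Arrays.binarySearch overload that takes a Comparator.

```java
import java.util.Arrays;
import java.util.Comparator;

final class ComparisonCountSketch {
    static long comparisons = 0;

    // Comparator that counts how often it is called.
    static final Comparator<String> COUNTING = (a, b) -> {
        comparisons++;
        return a.compareTo(b);
    };

    static int linearSearch(String[] sorted, String key) {
        for (int i = 0; i < sorted.length; i++) {
            if (COUNTING.compare(sorted[i], key) == 0) return i;
        }
        return -1;
    }

    // Builds n sorted strings sharing a 300-character prefix, so every
    // comparison has to scan past the prefix before it can decide.
    static String[] build(int n) {
        char[] chars = new char[300];
        Arrays.fill(chars, 'x');
        String prefix = new String(chars);
        String[] out = new String[n];
        for (int i = 0; i < n; i++) {
            out[i] = prefix + String.format("%06d", i);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] data = build(1000);
        String key = data[999];

        comparisons = 0;
        linearSearch(data, key);
        long linear = comparisons;

        comparisons = 0;
        Arrays.binarySearch(data, key, COUNTING);
        long binary = comparisons;

        System.out.println("linear=" + linear + ", binary=" + binary);
    }
}
```

Looking up the last of 1000 strings takes 1000 comparisons with the scan and at most 10 with binary search.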
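(4) Binary search is a poor fit when the data set is too large
Binary search depends on the array data structure, and to support random access an array needs contiguous memory, which is a demanding requirement. Storing 1 GB of data in an array requires 1 GB of contiguous memory. Even if the free memory is far larger than 1 GB in total, it cannot be used for the array when it is fragmented and no contiguous 1 GB block exists.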
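3. Key points when writing binary search
Overall, writing a binary search requires attention to the following.
(1) Search range: the search range can be a closed interval or a half-open interval, but a closed interval is preferable (even though the binary search in the STL uses a left-closed, right-open interval). A closed interval is usually initialized with two "cursors", low and high, that point at the first and last elements of the range. Note that different requirements may call for different search ranges.
(2) Loop condition: the loop condition can be left < right or left <= right, but the two handle boundaries differently.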
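For comparison with the closed interval used in the rest of this article, here is a minimal sketch of the left-closed, right-open form. The class name HalfOpenSketch and the method name search are hypothetical. Note how the choice of interval changes the initial value of right, the loop condition, and the update of right together.

```java
final class HalfOpenSketch {
    // Searches the half-open range [left, right): right is one past the last candidate.
    static int search(int[] nums, int target) {
        int left = 0, right = nums.length;
        while (left < right) {                // the range is empty when left == right
            int mid = left + (right - left) / 2;
            if (nums[mid] == target) {
                return mid;
            }
            if (nums[mid] < target) {
                left = mid + 1;               // mid is excluded, left end is closed
            } else {
                right = mid;                  // mid is excluded, right end is open
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] nums = {2, 4, 6, 8};
        System.out.println(search(nums, 8));  // 3
        System.out.println(search(nums, 5));  // -1
    }
}
```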
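(3) Index of the middle element: inside the loop we need to compute the index of the middle element. There are two ways: mid := floor((low + high) / 2), which rounds down, and mid := ceil((low + high) / 2), which rounds up. When the range has an odd number of elements the two agree. When it has an even number of elements, the former picks the first of the two middle elements and the latter picks the second. In binary search over arrays with duplicates, choosing the wrong rounding can easily cause an infinite loop.
int mid = left + (right - left) / 2 or int mid = left + ((right - left) >> 1) rounds down;
int mid = right - (right - left) / 2 or int mid = right - ((right - left) >> 1) rounds up.
Do not compute mid as int mid = (left + right) / 2, because left + right may overflow. The parentheses around the shift are required: >> has lower precedence than + and -, so left + (right - left) >> 1 is parsed as (left + right - left) >> 1. A shift can replace the division by 2, although compilers and JITs commonly optimize division by a constant power of two already, so the practical gain is usually small.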
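The sketch below compares these midpoint formulas on large indices. The class name MidpointSketch and its method names are hypothetical; the values in the comments follow from Java's 32-bit int arithmetic.

```java
final class MidpointSketch {
    // Naive form: left + right can exceed Integer.MAX_VALUE and wrap around.
    static int naiveMid(int left, int right) {
        return (left + right) / 2;
    }

    // Rounds down; the difference right - left cannot overflow for valid indices.
    static int floorMid(int left, int right) {
        return left + ((right - left) >> 1);
    }

    // Rounds up.
    static int ceilMid(int left, int right) {
        return right - ((right - left) >> 1);
    }

    // Missing parentheses: parsed as (left + (right - left)) >> 1, i.e. right >> 1.
    static int missingParens(int left, int right) {
        return left + (right - left) >> 1;
    }

    public static void main(String[] args) {
        int left = 1_500_000_000, right = 2_000_000_000;
        System.out.println(naiveMid(left, right));      // -397483648 (overflowed)
        System.out.println(floorMid(left, right));      // 1750000000
        System.out.println(missingParens(left, right)); // 1000000000 (outside the range)
        System.out.println(floorMid(0, 1));             // 0
        System.out.println(ceilMid(0, 1));              // 1
    }
}
```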
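(4) Updating left and right
The usual updates are left = mid + 1 and right = mid - 1. The forms left = mid and right = mid are also used, but with these two you must watch the boundary conditions (mainly the case where only two elements remain); otherwise an infinite loop is very likely.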
III. Implementations of binary search
1. Array without duplicates: find target
/**
 * Searches the whole array for target.
 *
 * @param nums the sorted array with no duplicates to search
 * @param target the number to search for
 * @return the index of target in nums; -1 if target does not exist
 */
public static int binarySearchNoDuplicates(int[] nums, int target) {
    return binarySearchNoDuplicates(nums, 0, nums.length - 1, target);
}
/**
 * Iterative version whose loop condition is left < right.
 *
 * @param nums the sorted array with no duplicates to search
 * @param left the start index of the search range, inclusive
 * @param right the end index of the search range, inclusive
 * @param target the number to search for
 * @return the index of target in nums; -1 if target does not exist
 */
public static int binarySearchNoDuplicates(int[] nums, int left, int right, int target) {
    if (left > right) {
        return -1;
    }
    // left <= right
    while (left < right) {
        // mid := ceil((left + right) / 2)
        // int mid = right - (right - left) / 2;
        //
        // mid := floor((left + right) / 2)
        int mid = (right - left) / 2 + left;
        if (nums[mid] < target) {
            left = mid + 1;
        } else if (nums[mid] > target) {
            right = mid - 1;
        } else {
            return mid;
        }
    }
    // left >= right here (right may have moved to left - 1)
    // if we use ceil((left + right) / 2) to calculate mid, left can move past
    // the end of the array, so an out-of-bounds check is needed:
    /*
    if (left < nums.length && nums[left] == target) {
        return left;
    }
    */
    // with floor, left never moves past the original right, so nums[left] is safe
    if (nums[left] == target) {
        return left;
    }
    return -1;
}
The loop condition above is left < right. left <= right works as well, as long as the boundaries are handled accordingly:
/**
 * Iterative version whose loop condition is left <= right.
 *
 * @param nums the sorted array with no duplicates to search
 * @param left the start index of the search range, inclusive
 * @param right the end index of the search range, inclusive
 * @param target the number to search for
 * @return the index of target in nums; -1 if target does not exist
 */
public static int binarySearchNoDuplicates2(int[] nums, int left, int right, int target) {
    if (left > right) {
        return -1;
    }
    // left <= right
    while (left <= right) {
        // mid := floor((left + right) / 2)
        int mid = (right - left) / 2 + left;
        if (nums[mid] < target) {
            left = mid + 1;
        } else if (nums[mid] > target) {
            right = mid - 1;
        } else {
            return mid;
        }
    }
    // left > right: the range is empty, so target is not present
    return -1;
}
Recursive implementation:
/**
 * Recursive version over the closed range [left, right].
 *
 * @param nums the sorted array with no duplicates to search
 * @param left the start index of the search range, inclusive
 * @param right the end index of the search range, inclusive
 * @param target the number to search for
 * @return the index of target in nums; -1 if target does not exist
 */
public static int binarySearchNoDuplicates3(int[] nums, int left, int right, int target) {
    if (left > right) {
        // empty range: target is not present
        return -1;
    }
    int mid = left + (right - left) / 2;
    if (nums[mid] == target) {
        return mid;
    } else if (nums[mid] < target) {
        return binarySearchNoDuplicates3(nums, mid + 1, right, target);
    } else {
        return binarySearchNoDuplicates3(nums, left, mid - 1, target);
    }
}
2. Array with duplicates: find the first target
/**
 * Searches the whole array for the first occurrence of target.
 *
 * @param nums the sorted array, possibly with duplicates, to search
 * @param target the number to search for
 * @return the index of target in nums; -1 if target does not exist
 */
public static int binarySearch(int[] nums, int target) {
    return lowBound(nums, target);
}
/**
 * Finds the first occurrence of target with a left < right loop.
 *
 * @param nums the sorted array, possibly with duplicates, to search
 * @param target the number to search for
 * @return the index of the first occurrence of target; -1 if there is no target in nums
 */
public static int lowBound(int[] nums, int target) {
    if (nums == null || nums.length == 0) {
        return -1;
    }
    int left = 0, right = nums.length - 1;
    while (left < right) {
        // mid := floor((left + right) / 2)
        // ceil((left + right) / 2) cannot be used here: when two elements remain
        // and nums[mid] >= target, mid always equals right and the loop never ends
        int mid = (right - left) / 2 + left;
        if (nums[mid] >= target) {
            // nums[mid] > target, or nums[mid] == target and mid lies between
            // the first target and the last target
            right = mid; // do not move beyond mid
        } else {
            // nums[mid] < target
            left = mid + 1; // move beyond mid
        }
    }
    // left == right
    if (nums[left] == target) {
        // there is at least one target
        return left;
    }
    // there is no target
    return -1;
}
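In the approach above, right is updated with right = mid, which does not move past mid, so mid must be computed by rounding down, floor((left + right)/2). Using ceil((left + right)/2) can cause an infinite loop. For example, suppose only two elements 3, 3 remain, left points at the first, right points at the second, and target is 3. Then mid = ceil((left + right)/2) = right, and since nums[mid] = 3 >= target, right stays where it is and the loop continues. Every later iteration computes the same mid, so right never moves and the loop never ends.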
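The 3, 3 example can be reproduced safely with a step limit. The sketch below is an illustration: the class CeilLoopSketch, the useCeil switch, and the maxSteps guard are hypothetical additions that exist only so the non-terminating case can be observed without hanging.

```java
final class CeilLoopSketch {
    // Runs the lowBound-style loop (right = mid, left = mid + 1) with either rounding.
    // Returns the number of iterations used, or -1 if maxSteps iterations passed
    // without the range collapsing, which means the real loop would never end.
    static int iterations(int[] nums, int target, boolean useCeil, int maxSteps) {
        int left = 0, right = nums.length - 1, steps = 0;
        while (left < right) {
            if (steps == maxSteps) {
                return -1;
            }
            steps++;
            int mid = useCeil
                    ? right - (right - left) / 2   // ceil((left + right) / 2)
                    : left + (right - left) / 2;   // floor((left + right) / 2)
            if (nums[mid] >= target) {
                right = mid;
            } else {
                left = mid + 1;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        int[] nums = {3, 3};
        System.out.println(iterations(nums, 3, false, 100)); // 1
        System.out.println(iterations(nums, 3, true, 100));  // -1 (stuck at mid == right)
    }
}
```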
A better implementation:
/**
 * Finds the first occurrence of target with a left <= right style loop.
 *
 * @param nums the sorted array, possibly with duplicates, to search
 * @param target the number to search for
 * @return the index of the first occurrence of target; -1 if there is no target in nums
 */
public static int lowBound2(int[] nums, int target) {
    if (nums == null || nums.length == 0) {
        return -1;
    }
    int low = 0, high = nums.length - 1;
    while (low <= high) {
        int mid = low + ((high - low) >> 1);
        if (nums[mid] > target) {
            high = mid - 1;
        } else if (nums[mid] < target) {
            low = mid + 1;
        } else {
            if (mid == 0 || nums[mid - 1] != target) {
                // either mid is the first index, or the previous element differs
                return mid;
            } else {
                // although its value is target, it is not the first one we want
                high = mid - 1;
            }
        }
    }
    return -1;
}
3. Array with duplicates: find the last target
/**
 * Finds the last occurrence of target with a left < right loop.
 *
 * @param nums the sorted array, possibly with duplicates, to search
 * @param target the number to search for
 * @return the index of the last occurrence of target; -1 if target does not exist in nums
 */
public static int highBound(int[] nums, int target) {
    if (nums == null || nums.length == 0) {
        return -1;
    }
    int left = 0, right = nums.length - 1;
    while (left < right) {
        // mid := ceil((left + right) / 2)
        // floor((left + right) / 2) cannot be used here: when two elements remain
        // and nums[mid] <= target, mid always equals left and the loop never ends
        int mid = right - (right - left) / 2;
        if (nums[mid] <= target) {
            // nums[mid] < target, or nums[mid] == target and mid lies between
            // the first target and the last target
            left = mid; // do not move beyond mid
        } else {
            // nums[mid] > target
            right = mid - 1; // move beyond mid
        }
    }
    // left == right
    if (nums[left] == target) {
        // there is at least one target
        return left;
    }
    // there is no target
    return -1;
}
In contrast to the previous case, mid should be computed here as ceil((left + right)/2), for the same reason as above.
A better implementation:
/**
 * Finds the last occurrence of target with a left <= right style loop.
 *
 * @param nums the sorted array, possibly with duplicates, to search
 * @param target the number to search for
 * @return the index of the last occurrence of target; -1 if target does not exist in nums
 */
public static int highBound2(int[] nums, int target) {
    if (nums == null || nums.length == 0) {
        return -1;
    }
    int low = 0, high = nums.length - 1;
    while (low <= high) {
        int mid = low + ((high - low) >> 1);
        if (nums[mid] > target) {
            high = mid - 1;
        } else if (nums[mid] < target) {
            low = mid + 1;
        } else {
            if (mid == nums.length - 1 || nums[mid + 1] != target) {
                // either mid is the last index, or the next element differs
                return mid;
            }
            // although its value is target, it is not the last one
            low = mid + 1;
        }
    }
    return -1;
}
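One use of the first and last positions is counting how often target occurs. The sketch below is a variation, not the implementation above: instead of inspecting the neighbor of mid it remembers the latest match in a variable and keeps narrowing. The class OccurrenceCountSketch and the methods bound and count are hypothetical names.

```java
final class OccurrenceCountSketch {
    // findFirst == true: index of the first target; otherwise index of the last
    // target. Returns -1 if target does not occur in the ascending array nums.
    static int bound(int[] nums, int target, boolean findFirst) {
        int low = 0, high = nums.length - 1, answer = -1;
        while (low <= high) {
            int mid = low + ((high - low) >> 1);
            if (nums[mid] > target) {
                high = mid - 1;
            } else if (nums[mid] < target) {
                low = mid + 1;
            } else {
                answer = mid;                 // remember this match
                if (findFirst) {
                    high = mid - 1;           // keep looking to the left
                } else {
                    low = mid + 1;            // keep looking to the right
                }
            }
        }
        return answer;
    }

    // Number of occurrences of target, using two O(log n) searches.
    static int count(int[] nums, int target) {
        int first = bound(nums, target, true);
        if (first == -1) {
            return 0;
        }
        return bound(nums, target, false) - first + 1;
    }

    public static void main(String[] args) {
        int[] nums = {1, 2, 2, 2, 3};
        System.out.println(bound(nums, 2, true));  // 1
        System.out.println(bound(nums, 2, false)); // 3
        System.out.println(count(nums, 2));        // 3
        System.out.println(count(nums, 4));        // 0
    }
}
```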
4. Array with duplicates: find the last element less than or equal to target
/**
 * Finds the last element <= target with a left < right loop.
 *
 * @param nums the sorted array in which to find the last element <= target
 * @param target the number used to search
 * @return the index of the last element <= target; -1 if there is no element <= target
 */
public static int lastSmallerOrEquals1(int[] nums, int target) {
    if (nums == null || nums.length == 0) {
        return -1;
    }
    int left = 0, right = nums.length - 1;
    while (left < right) {
        // mid := floor((left + right) / 2)
        // int mid = left + (right - left) / 2;
        // using floor() here leads to an endless loop, e.g. nums = [1, 10, 23], target = 13
        //
        // mid := ceil((left + right) / 2)
        int mid = right - (right - left) / 2;
        if (nums[mid] <= target) {
            left = mid;
        } else {
            right = mid - 1;
        }
    }
    // left == right
    if (nums[left] <= target) {
        // there is at least one element <= target
        return left;
    }
    // there is no element <= target
    return -1;
}
Note that mid must be computed with ceil here; otherwise an infinite loop may occur.
A better approach:
/**
 * Finds the last element <= target with a left <= right loop.
 *
 * @param nums the sorted array in which to find the last element <= target
 * @param target the number used to search
 * @return the index of the last element <= target; -1 if there is no element <= target
 */
public static int lastSmallerOrEquals(int[] nums, int target) {
    if (nums == null || nums.length == 0) {
        return -1;
    }
    int left = 0, right = nums.length - 1;
    while (left <= right) {
        // mid := floor((left + right) / 2)
        int mid = left + ((right - left) >> 1);
        if (nums[mid] <= target) {
            if (mid == nums.length - 1 || nums[mid + 1] > target) {
                return mid;
            }
            // although nums[mid] <= target, it is not the last one
            left = mid + 1;
        } else {
            right = mid - 1;
        }
    }
    return -1;
}
5. Array with duplicates: find the first element greater than or equal to target
/**
 * Finds the first element >= target with a low <= high loop.
 *
 * @param nums the sorted array in which to find the first element >= target
 * @param target the number used to search
 * @return the index of the first element >= target; -1 if there is no element >= target
 */
public static int firstLargerOrEquals(int[] nums, int target) {
    if (nums == null || nums.length == 0) {
        return -1;
    }
    int low = 0, high = nums.length - 1;
    while (low <= high) {
        // mid := floor((high + low) / 2)
        int mid = low + ((high - low) >> 1);
        if (nums[mid] >= target) {
            if (mid == 0 || nums[mid - 1] < target) {
                return mid;
            }
            // although nums[mid] >= target, it is not the first one which >= target
            high = mid - 1;
        } else {
            low = mid + 1;
        }
    }
    return -1;
}
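As a usage example, the two searches from sections 4 and 5 can be combined to find the element closest to target. The sketch below is an assumption-laden illustration: the class NearestSketch and the method nearest are hypothetical, and both searches are restated in a compact variant that records the latest candidate in a variable rather than checking the neighbor of mid.

```java
final class NearestSketch {
    // Index of the last element <= target, or -1 if there is none.
    static int lastSmallerOrEquals(int[] nums, int target) {
        int left = 0, right = nums.length - 1, answer = -1;
        while (left <= right) {
            int mid = left + ((right - left) >> 1);
            if (nums[mid] <= target) {
                answer = mid;                 // candidate, keep looking to the right
                left = mid + 1;
            } else {
                right = mid - 1;
            }
        }
        return answer;
    }

    // Index of the first element >= target, or -1 if there is none.
    static int firstLargerOrEquals(int[] nums, int target) {
        int left = 0, right = nums.length - 1, answer = -1;
        while (left <= right) {
            int mid = left + ((right - left) >> 1);
            if (nums[mid] >= target) {
                answer = mid;                 // candidate, keep looking to the left
                right = mid - 1;
            } else {
                left = mid + 1;
            }
        }
        return answer;
    }

    // Index of an element closest to target; -1 for an empty array.
    // On a tie the smaller element wins.
    static int nearest(int[] nums, int target) {
        int lo = lastSmallerOrEquals(nums, target);
        int hi = firstLargerOrEquals(nums, target);
        if (lo == -1) return hi;
        if (hi == -1) return lo;
        // long arithmetic keeps the differences from overflowing
        long distLo = (long) target - nums[lo];
        long distHi = (long) nums[hi] - target;
        return distLo <= distHi ? lo : hi;
    }

    public static void main(String[] args) {
        int[] nums = {1, 10, 23};
        System.out.println(nearest(nums, 13));  // 1 (10 is closer than 23)
        System.out.println(nearest(nums, 17));  // 2 (23 is closer than 10)
        System.out.println(nearest(nums, 0));   // 0
        System.out.println(nearest(nums, 100)); // 2
    }
}
```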