Binary Search: A Summary

诗酒趁年华sustech

Original link: https://blog.csdn.net/sinat_30973431/article/details/103476360

This article is so well written that I am reposting it here.

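I. Overview

1. Definition

Binary search is a widely used search algorithm for finding an element in a sorted array (usually in ascending order; the rest of this article assumes ascending order).

2. Main idea

Binary search exploits both the ordering of the elements and the random-access property of arrays. Each step compares the target with the middle element of the current search range and, depending on the result, discards half of the range (a divide-and-conquer strategy that prunes half of the candidates each time). This continues until the element is found or the search range shrinks to size zero.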

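3. Workflow

Compare the middle element with target.

If they match, return true.

If they do not match:

If the middle element is greater than target, continue searching the left half;

if the middle element is less than target, continue searching the right half.

Keep searching until target is found or the search range becomes empty.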

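4. Pseudocode

The basic binary search behaves as follows: when the array has no duplicate elements, it returns the index of target or a marker for a failed search; when the array contains duplicates, it returns the index of one of the matching elements, and which one depends on how the data is distributed in the array.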
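As a minimal Java sketch of the basic procedure described in the workflow above (the names BasicBinarySearch and search are illustrative and do not come from the original article; the array is assumed to be sorted in ascending order):

```java
class BasicBinarySearch {
    // Returns an index of target in the ascending array nums, or -1 if it is absent.
    static int search(int[] nums, int target) {
        int low = 0, high = nums.length - 1;        // closed range [low, high]
        while (low <= high) {                        // the range is still non-empty
            int mid = low + (high - low) / 2;        // index of the middle element
            if (nums[mid] == target) {
                return mid;                          // match
            } else if (nums[mid] > target) {
                high = mid - 1;                      // continue in the left half
            } else {
                low = mid + 1;                       // continue in the right half
            }
        }
        return -1;                                   // the range became empty
    }

    public static void main(String[] args) {
        int[] nums = {1, 3, 5, 7, 9};
        System.out.println(search(nums, 7));                 // 3
        System.out.println(search(nums, 4));                 // -1
        System.out.println(search(new int[] {2, 2, 2}, 2));  // 1
    }
}
```

With duplicates, as in the last call, the index returned is simply the first match the loop happens to probe, not necessarily the first or last occurrence.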

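5. Performance analysis

Binary search is an efficient search algorithm. For input of size n, each iteration halves the search range, so its size goes n, n/2, n/4, n/8, ..., n/2^k, ..., shrinking as a geometric sequence. The iteration stops when the size reaches 1, so setting n/2^k = 1 gives the total number of iterations k = log2(n). Since the running time is proportional to the number of loop iterations, the time complexity is O(log n).

Space complexity: O(1) for the iterative version; O(log n) for the recursive version, because of the call stack.

Number of loop iterations: 1 in the best case (the first middle element matches); floor(log2(n)) + 1 in the worst case.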
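To make the logarithmic growth concrete, here is a small sketch that counts the loop iterations of a closed-interval binary search when the target is larger than every element, so the search always moves right. The class and method names are illustrative. The count is compared with floor(log2(n)) + 1, which for n >= 1 equals the bit length of n.

```java
class IterationCount {
    // Counts loop iterations over indices 0..n-1 when the search always goes right.
    static int worstCaseIterations(int n) {
        int left = 0, right = n - 1, iterations = 0;
        while (left <= right) {
            iterations++;
            int mid = left + (right - left) / 2;
            left = mid + 1; // target is larger than nums[mid]: discard the left half
        }
        return iterations;
    }

    // floor(log2(n)) + 1, computed as the bit length of n (valid for n >= 1).
    static int bound(int n) {
        return 32 - Integer.numberOfLeadingZeros(n);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 7, 8, 1000, 1_000_000}) {
            System.out.println(n + " elements: " + worstCaseIterations(n)
                    + " iterations, bound " + bound(n));
        }
    }
}
```

Even for a million elements the loop runs only 20 times.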

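II. Use cases and key points

1. When can binary search be used?

It is generally used to look up elements in an array, and the array must already be sorted (usually in ascending order) before the search.

There are two key words here. One is "array": an array is a sequential storage structure with random access. The other is "sorted": in an unsorted array, the only way to find an element is a sequential scan.

In other words, the situations in which binary search applies are limited.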

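2. When can binary search not be used?

(1) The data is not stored in a sequential (array-like) structure

Binary search depends on a sequential structure (an array). A linked list is unsuitable: binary search needs random access by index, and accessing a linked-list element by index takes O(n) time, so a binary search built on a linked list no longer runs in O(log n).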

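(2) The data is not static (elements are inserted and deleted frequently)

The array must be sorted before a binary search. For an array with frequent insertions and deletions, you therefore either keep the data sorted on every insertion and deletion or sort it again before every search. Either way, the cost of maintaining order is high.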
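The cost of keeping data sorted on every insertion can be seen in a short sketch. It uses the standard Collections.binarySearch, which returns -(insertion point) - 1 when the key is absent; the class SortedInsert and its method are illustrative names. Finding the position takes O(log n), but the insertion itself still shifts elements and takes O(n).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SortedInsert {
    // Inserts value into an already sorted list so that the list stays sorted.
    static void insertSorted(List<Integer> sorted, int value) {
        int pos = Collections.binarySearch(sorted, value); // O(log n) on an ArrayList
        if (pos < 0) {
            pos = -(pos + 1); // key absent: decode the insertion point
        }
        sorted.add(pos, value); // O(n): every element after pos is shifted right
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 4, 9));
        insertSorted(list, 5);
        insertSorted(list, 0);
        insertSorted(list, 9);
        System.out.println(list); // [0, 1, 4, 5, 9, 9]
    }
}
```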

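(3) The data set is too small

Binary search only pays off when the data set is fairly large. If it is small, a sequential scan is enough.

There is one exception: if comparing two elements is expensive, binary search is preferable regardless of the data size. The running time of a search depends not only on the order of growth, log n (the number of loop iterations), but also on the time each comparison takes. Every iteration performs one or two comparisons; complexity analysis drops this constant factor and calls binary search O(log n), but it cannot be ignored when analyzing real-world performance.

For example, if the array stores strings longer than 300 characters, comparing two strings is fairly expensive. Reducing the number of comparisons then improves performance considerably, so binary search has the advantage over a sequential scan.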
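The difference in comparison counts can be illustrated with a counting comparator. This is only a sketch under assumed data: 1000 keys sharing a 300-character prefix, with ComparisonCount and its members as hypothetical names. Arrays.binarySearch with a Comparator is the standard library method.

```java
import java.util.Arrays;
import java.util.Comparator;

class ComparisonCount {
    static int comparisons = 0;

    // Comparator that counts every string comparison it performs.
    static final Comparator<String> COUNTING = (a, b) -> {
        comparisons++;
        return a.compareTo(b);
    };

    // Builds n sorted keys sharing a 300-character prefix, so each comparison
    // has to scan past the prefix before it finds a difference.
    static String[] buildKeys(int n) {
        StringBuilder prefix = new StringBuilder();
        for (int i = 0; i < 300; i++) prefix.append('a');
        String[] keys = new String[n];
        for (int i = 0; i < n; i++) keys[i] = prefix + String.format("%04d", i);
        return keys;
    }

    static int linearSearch(String[] keys, String key) {
        for (int i = 0; i < keys.length; i++) {
            if (COUNTING.compare(keys[i], key) == 0) return i;
        }
        return -1;
    }

    static int binarySearch(String[] keys, String key) {
        return Arrays.binarySearch(keys, key, COUNTING);
    }

    public static void main(String[] args) {
        String[] keys = buildKeys(1000);
        String key = keys[999];

        comparisons = 0;
        linearSearch(keys, key);
        System.out.println("linear scan:   " + comparisons + " comparisons");

        comparisons = 0;
        binarySearch(keys, key);
        System.out.println("binary search: " + comparisons + " comparisons");
    }
}
```

For the last key the linear scan performs 1000 comparisons, while the binary search needs at most 10.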

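(4) The data set is too large

Binary search relies on arrays, and to support random access an array needs contiguous memory, which is a demanding requirement. Storing 1 GB of data in an array requires 1 GB of contiguous memory; even if far more than 1 GB is free in total, the array cannot be allocated if that free memory is fragmented.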

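3. Key points when writing a binary search

In general, writing a binary search involves the following considerations.

(1) Search range: the range can be a closed interval or a half-open interval. This article prefers the closed interval (even though the binary searches in the STL use a left-closed, right-open interval). A closed interval is usually initialized with two "cursors", low and high, pointing at the first and last elements of the range. Note that the search range may differ depending on the requirement.

(2) Loop condition: the loop condition can be left < right or left <= right, but the two handle the boundaries differently.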

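(3) Index of the middle element: inside the loop we need to compute the index of the middle element. There are two ways: mid := floor((left + right) / 2), which rounds down, and mid := ceil((left + right) / 2), which rounds up. When the range has an odd number of elements the two agree; when it has an even number, the former picks the first of the two middle elements and the latter picks the second. In searches over arrays with duplicates, choosing the wrong rounding can easily cause an infinite loop.

int mid = left + (right - left) / 2 or int mid = left + ((right - left) >> 1) rounds down;

int mid = right - (right - left) / 2 or int mid = right - ((right - left) >> 1) rounds up.

Do not compute mid as int mid = (left + right) / 2, because left + right may overflow. A shift can replace the division by 2, but >> has lower precedence than + and -, so the inner parentheses around (right - left) >> 1 are required. Optimizing compilers and JITs generally compile division by a constant 2 into shift instructions anyway, so the speed gain from writing the shift by hand is usually small.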

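(4) Updating left and right

The usual updates are left = mid + 1 and right = mid - 1. The alternatives left = mid and right = mid also work, but they need care at the boundaries (mainly when only two elements remain); otherwise an infinite loop is very likely.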
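The midpoint formulas from point (3) can be checked with a small sketch. The class MidpointDemo and the helpers floorMid and ceilMid are illustrative names; the index values are chosen only to force an int overflow.

```java
class MidpointDemo {
    // Rounds down: for an even-sized range, picks the first of the two middle elements.
    static int floorMid(int left, int right) {
        return left + (right - left) / 2;
    }

    // Rounds up: for an even-sized range, picks the second of the two middle elements.
    static int ceilMid(int left, int right) {
        return right - (right - left) / 2;
    }

    public static void main(String[] args) {
        int left = 1_500_000_000, right = 2_000_000_000;

        System.out.println((left + right) / 2);           // -397483648: left + right overflowed
        System.out.println(floorMid(left, right));        // 1750000000

        // Precedence pitfall: this parses as (left + (right - left)) >> 1, i.e. right >> 1.
        System.out.println(left + (right - left) >> 1);   // 1000000000
        System.out.println(left + ((right - left) >> 1)); // 1750000000

        System.out.println(floorMid(4, 7) + " " + ceilMid(4, 7)); // 5 6 (even-sized range)
        System.out.println(floorMid(4, 8) + " " + ceilMid(4, 8)); // 6 6 (odd-sized range)
    }
}
```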

III. Implementing binary search

1. Array without duplicates: find target

    /**
     * Searches the whole array for target.
     *
     * @param nums      the sorted array, without duplicates, to search
     * @param target    the value to search for
     * @return          the index of target in nums; -1 if target is not present
     */
    public static int binarySearchNoDuplicates(int[] nums, int target) {
        return binarySearchNoDuplicates(nums, 0, nums.length - 1, target);
    }

    /**
     * Searches the closed range [left, right] of nums for target.
     *
     * @param nums      the sorted array, without duplicates, to search
     * @param left      the start index of the search range, inclusive
     * @param right     the end index of the search range, inclusive
     * @param target    the value to search for
     * @return          the index of target in nums; -1 if target is not present
     */
    public static int binarySearchNoDuplicates(int[] nums, int left, int right, int target) {
        if (left > right) {
            return -1;
        }
 
        // left <= right
        while (left < right) {
            //mid := ceil((left + right) / 2)
            //int mid = right - (right - left) / 2;
            //
            //mid := floor((left + right) / 2)
            int mid = (right - left) / 2 + left;
 
            if (nums[mid] < target) {
                left = mid + 1;
            } else if (nums[mid] > target) {
                right = mid - 1;
            } else {
                return mid;
            }
        }
 
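        // Here left >= right (right = mid - 1 may have stepped below left),
        // but left is still a valid index inside the original range.
        // If mid is computed with ceil((left + right) / 2) instead, left = mid + 1
        // can move past the last index, so a bounds check is needed first:
        /*
        if (left < nums.length && nums[left] == target) {
            return left;
        }
         */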
        if (nums[left] == target) {
            return left;
        }
 
        return -1;
    }

The loop condition above is left < right; left <= right works as well, as long as the boundaries are handled accordingly.

    /**
     * Same search, using left <= right as the loop condition.
     *
     * @param nums      the sorted array, without duplicates, to search
     * @param left      the start index of the search range, inclusive
     * @param right     the end index of the search range, inclusive
     * @param target    the value to search for
     * @return          the index of target in nums; -1 if target is not present
     */
    public static int binarySearchNoDuplicates2(int[] nums, int left, int right, int target) {
        if (left > right) {
            return -1;
        }
 
        // left <= right
        while (left <= right) {
            //mid := floor((left + right) / 2)
            int mid = (right - left) / 2 + left;
 
            if (nums[mid] < target) {
                left = mid + 1;
            } else if (nums[mid] > target) {
                right = mid - 1;
            } else {
                return mid;
            }
        }
 
        //left > right
        return -1;
    }

Recursive implementation:

    /**
     * Recursive version of the search over the closed range [left, right].
     *
     * @param nums      the sorted array, without duplicates, to search
     * @param left      the start index of the search range, inclusive
     * @param right     the end index of the search range, inclusive
     * @param target    the value to search for
     * @return          the index of target in nums; -1 if target is not present
     */
    public static int binarySearchNoDuplicates3(int[] nums, int left, int right, int target) {
        if (left > right) {
            return -1;
        }
 
        int mid = left + (right - left) / 2;
 
        if (nums[mid] == target) {
            return mid;
        } else if (nums[mid] < target) {
            return binarySearchNoDuplicates3(nums, mid + 1, right, target);
        } else {
            return binarySearchNoDuplicates3(nums, left, mid - 1, target);
        }
    }

2. Array with duplicates: find the first target

    /**
     * @param nums      the sorted array, possibly with duplicates, to search
     * @param target    the value to search for
     * @return          the index of the first occurrence of target; -1 if target is not present
     */
    public static int binarySearch(int[] nums, int target) {
        return lowBound(nums, target);
    }
 
    /**
     * @param nums      the sorted array, possibly with duplicates, to search
     * @param target    the value to search for
     * @return          the index of the first occurrence of target; -1 if target is not present
     */
    public static int lowBound(int[] nums, int target) {
        if (nums == null || nums.length == 0) {
            return -1;
        }
 
        int left = 0, right = nums.length - 1;
        while (left < right) {
            //mid := floor((left + right) / 2)
            //here we cannot use ceil((left + right) / 2), because when the remain size is two && nums[mid] >= target, then mid always == right, it will be an endless loop
            int mid = (right - left) / 2 + left;
 
            if (nums[mid] >= target) {
                //nums[mid] > target;  nums[mid] == target && mid is between the first target and last target
                right = mid;        //No move beyond mid
            } else {
                //nums[mid] < target
                left = mid + 1;     //Move beyond mid
            }
        }
 
        //left == right
        if (nums[left] == target) {
            //there is one or more than one target
            return left;
        }
        //there is no target
        return -1;
    }

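In the version above, right is updated with right = mid, which does not move past mid, so mid must be computed by rounding down, floor((left + right) / 2). With ceil((left + right) / 2) an infinite loop is possible. For example, suppose only two elements 3, 3 remain, left points at the first, right points at the second, and target is 3. Then mid = ceil((left + right) / 2) = right, and since nums[mid] = 3 >= target, right stays where it is and the loop continues. Every later iteration computes the same mid, so right never moves and the loop never ends.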
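The two-element case just described can be reproduced with a bounded loop. In this sketch the class RoundingDemo, the roundUp flag, and the maxIterations cap are illustrative additions so that the non-terminating variant can be observed safely; they are not part of the article's code.

```java
class RoundingDemo {
    // Runs the "right = mid" loop with either rounding mode.
    // Returns the number of iterations, or -1 once the loop exceeds maxIterations
    // (taken here as a sign that it would never terminate).
    static int iterations(int[] nums, int target, boolean roundUp, int maxIterations) {
        int left = 0, right = nums.length - 1, count = 0;
        while (left < right) {
            if (++count > maxIterations) {
                return -1;
            }
            int half = (right - left) / 2;
            int mid = roundUp ? right - half : left + half;
            if (nums[mid] >= target) {
                right = mid;      // does not move past mid
            } else {
                left = mid + 1;   // moves past mid
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[] nums = {3, 3};
        System.out.println(iterations(nums, 3, false, 1000)); // 1: floor terminates
        System.out.println(iterations(nums, 3, true, 1000));  // -1: ceil keeps mid == right
    }
}
```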

A better implementation:

    /**
     * @param nums      the sorted array, possibly with duplicates, to search
     * @param target    the value to search for
     * @return          the index of the first occurrence of target; -1 if target is not present
     */
    public static int lowBound2(int[] nums, int target) {
        if (nums == null || nums.length == 0) {
            return -1;
        }
 
        int low = 0, high = nums.length - 1;
 
        while (low <= high) {
            int mid = low + ((high - low) >> 1);
 
            if (nums[mid] > target) {
                high = mid - 1;
            } else if (nums[mid] < target) {
                low = mid + 1;
            } else {
                if (mid == 0 || nums[mid - 1] != target) {
                    //mid == 0;   mid != 0 && nums[mid - 1] != target
                    return mid;
                } else {
                    //although its value is target, it is not the first one we want.
                    high = mid - 1;
                }
            }
        }
 
        return -1;
    }

3. Array with duplicates: find the last target

    /**
     * @param nums      the sorted array, possibly with duplicates, to search
     * @param target    the value to search for
     * @return          the index of the last occurrence of target; -1 if target is not present
     */
    public static int highBound(int[] nums, int target) {
        if (nums == null || nums.length == 0) {
            return -1;
        }
 
        int left = 0, right = nums.length - 1;
        while (left < right) {
            //mid := ceil((left + right) / 2)
            //here we cannot use floor((left + right) / 2), because when the remain size is two && nums[mid] <= target, then mid always == left, it will be an endless loop
            int mid = right - (right - left) / 2;
 
            if (nums[mid] <= target) {
                //nums[mid] < target;  nums[mid] == target && mid is between the first target and last target
                left = mid;        //Not move beyond mid
            } else {
                //nums[mid] > target
                right = mid - 1;     //Move beyond mid
            }
        }
 
        //left == right
        if (nums[left] == target) {
            //there is one or more than one target
            return left;
        }
        //there is no target
        return -1;
    }

Conversely, mid here must be computed as ceil((left + right) / 2), for the same reason: left = mid does not move past mid.

A better implementation:

    /**
     * @param nums      the sorted array, possibly with duplicates, to search
     * @param target    the value to search for
     * @return          the index of the last occurrence of target; -1 if target is not present
     */
    public static int highBound2(int[] nums, int target) {
        if (nums == null || nums.length == 0) {
            return -1;
        }
 
        int low = 0, high = nums.length - 1;
 
        while (low <= high) {
            int mid = low + ((high - low) >> 1);
 
            if (nums[mid] > target) {
                high = mid - 1;
            } else if (nums[mid] < target) {
                low = mid + 1;
            } else {
                if (mid == nums.length - 1 || nums[mid + 1] != target) {
                    //mid == nums.length - 1; mid != nums.length - 1 && nums[mid + 1] != target
                    return mid;
                }
 
                //although its value is target, it is not the last one
                low = mid + 1;
            }
        }
 
        return -1;
    }
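
One possible use of the first-occurrence and last-occurrence searches together is counting how often a value appears in a sorted array. The sketch below restates both searches in the closed-interval style of lowBound2 and highBound2 so that it is self-contained; the class OccurrenceCount and the method names first, last, and count are illustrative.

```java
class OccurrenceCount {
    // Index of the first occurrence of target, or -1 (same approach as lowBound2).
    static int first(int[] nums, int target) {
        int low = 0, high = nums.length - 1;
        while (low <= high) {
            int mid = low + ((high - low) >> 1);
            if (nums[mid] > target) high = mid - 1;
            else if (nums[mid] < target) low = mid + 1;
            else if (mid == 0 || nums[mid - 1] != target) return mid;
            else high = mid - 1; // a match, but an earlier one exists
        }
        return -1;
    }

    // Index of the last occurrence of target, or -1 (same approach as highBound2).
    static int last(int[] nums, int target) {
        int low = 0, high = nums.length - 1;
        while (low <= high) {
            int mid = low + ((high - low) >> 1);
            if (nums[mid] > target) high = mid - 1;
            else if (nums[mid] < target) low = mid + 1;
            else if (mid == nums.length - 1 || nums[mid + 1] != target) return mid;
            else low = mid + 1; // a match, but a later one exists
        }
        return -1;
    }

    // Number of occurrences of target, using two O(log n) searches.
    static int count(int[] nums, int target) {
        int f = first(nums, target);
        return f == -1 ? 0 : last(nums, target) - f + 1;
    }

    public static void main(String[] args) {
        int[] nums = {1, 2, 2, 2, 3, 5, 5};
        System.out.println(count(nums, 2)); // 3
        System.out.println(count(nums, 5)); // 2
        System.out.println(count(nums, 4)); // 0
    }
}
```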

4. Array with duplicates: find the last element less than or equal to target

    /**
     * @param nums      the sorted array, possibly with duplicates, to search
     * @param target    the value to compare against
     * @return          the index of the last element <= target; -1 if no element is <= target
     */
    public static int lastSmallerOrEquals1(int[] nums, int target) {
        if (nums == null || nums.length == 0) {
            return -1;
        }
 
        int left = 0, right = nums.length - 1;
        while (left < right) {
            //mid := floor((left + right)/2)
            //int mid = left + (right - left) / 2;
            //here use floor() will lead to endless loop!    eg, nums=[1, 10, 23]  target=13
            //
            //mid := ceil((left + right)/2)
            int mid = right - (right - left) / 2;
 
            if (nums[mid] <= target) {
                left = mid;
            } else {
                right = mid - 1;
            }
        }
 
        //left == right
        if (nums[left] <= target) {
            //there is one or more which <= target
            return left;
        }
        //there is no elements which <= target
        return -1;
    }

Note that mid must be computed with ceil here; otherwise an infinite loop can occur.

A better approach:

    /**
     * @param nums      the sorted array, possibly with duplicates, to search
     * @param target    the value to compare against
     * @return          the index of the last element <= target; -1 if no element is <= target
     */
    public static int lastSmallerOrEquals(int[] nums, int target) {
        if (nums == null || nums.length == 0) {
            return -1;
        }
 
        int left = 0, right = nums.length - 1;
 
        while (left <= right) {
            //mid := floor((high + low)/2)
            int mid = left + ((right - left) >> 1);
 
            if (nums[mid] <= target) {
                if (mid == nums.length - 1 || nums[mid + 1] > target) {
                    return mid;
                }
                //although nums[mid] <= target, it is not the last one
                left = mid + 1;
            } else {
                right = mid - 1;
            }
        }
 
        return -1;
    }

5. Array with duplicates: find the first element greater than or equal to target

    /**
     * @param nums      the sorted array, possibly with duplicates, to search
     * @param target    the value to compare against
     * @return          the index of the first element >= target; -1 if no element is >= target
     */
    public static int firstLargerOrEquals(int[] nums, int target) {
        if (nums == null || nums.length == 0) {
            return -1;
        }
 
        int low = 0, high = nums.length - 1;
 
        while (low <= high) {
            //mid := floor((high + low)/2)
            int mid = low + ((high - low) >> 1);
 
            if (nums[mid] >= target) {
                if (mid == 0 || nums[mid - 1] < target) {
                    return mid;
                }
                //although nums[mid] >= target, it is not the first one which >= target
                high = mid - 1;
            } else {
                low = mid + 1;
            }
        }
 
        return -1;
    }
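
The "first element greater than or equal to target" query can also be written over a half-open range [left, right), the convention that section II noted for the STL. The following is a sketch of that variant, not the article's own code; the names HalfOpenLowerBound and lowerBound are illustrative. Like C++'s std::lower_bound, it returns nums.length instead of -1 when every element is smaller than target, which is also the position where target could be inserted.

```java
class HalfOpenLowerBound {
    // Returns the first index i with nums[i] >= target,
    // or nums.length if every element is smaller than target.
    static int lowerBound(int[] nums, int target) {
        int left = 0, right = nums.length;       // half-open range [left, right)
        while (left < right) {                   // the range is non-empty
            int mid = left + (right - left) / 2; // rounds down, so mid < right
            if (nums[mid] < target) {
                left = mid + 1;                  // the answer lies to the right of mid
            } else {
                right = mid;                     // mid itself may be the answer
            }
        }
        return left;                             // left == right
    }

    public static void main(String[] args) {
        int[] nums = {1, 3, 3, 7};
        System.out.println(lowerBound(nums, 3)); // 1
        System.out.println(lowerBound(nums, 4)); // 3
        System.out.println(lowerBound(nums, 8)); // 4
        System.out.println(lowerBound(nums, 0)); // 0
    }
}
```

Because right starts one past the last index and mid always rounds down, nums[mid] is never read out of bounds, and an empty array needs no special case.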
