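1. How binary search works
Binary search finds the index of a given value x in a sorted sequence.
Compare x with the middle element: if x is larger, continue in the right half; if it is smaller, continue in the left half.
Repeat until x is found (success) or the search range is empty (failure).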
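The steps above can be sketched as a loop before looking at the recursive versions. This is a minimal illustration only: the name `binary_search` and the `-1` "not found" value are assumptions made for this sketch, not part of the implementations that follow.

```python
def binary_search(seq, target):
    """Return the index of target in the sorted sequence seq, or -1 if absent."""
    left, right = 0, len(seq) - 1
    while left <= right:              # the search range [left, right] is not empty
        mid = (left + right) // 2     # index of the middle element
        if seq[mid] == target:
            return mid
        if seq[mid] < target:
            left = mid + 1            # target can only be in the right half
        else:
            right = mid - 1           # target can only be in the left half
    return -1                         # range became empty: target is not present


print(binary_search([1, 3, 5, 7, 9, 11], 7))   # 3
print(binary_search([1, 3, 5, 7, 9, 11], 4))   # -1
```

Each pass halves the search range, so a list of n elements needs roughly log2(n) comparisons.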
2. Python implementation
Recursive implementation
Method 1: the example from RUNOOB
def bs(l, item, left, right):  # l: sorted list; item: target; left/right: inclusive index bounds
    if right >= left:
        mid = left + (right - left) // 2  # integer division avoids the float round trip
        if l[mid] == item:
            return mid
        elif l[mid] > item:
            return bs(l, item, left, mid - 1)   # search the left half
        else:
            return bs(l, item, mid + 1, right)  # search the right half
    else:
        return 'item is not in this list'
Method 2: my own version
def bs(l, item, left):  # l: sorted list (a shorter slice on every call); item: target; left: offset of l[0] in the original list
    if len(l) > 1:
        mid = len(l) // 2
        if l[mid] == item:
            return left + mid
        elif l[mid] < item:
            return bs(l[mid+1:], item, left + mid + 1)  # slicing copies the right half
        else:
            return bs(l[:mid], item, left)              # slicing copies the left half
    elif len(l) == 1:
        if l[0] == item:
            return left
        else:
            return 'item is not in this list'
    else:
        return 'item is not in this list'
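Method 3: just use index
def bs(l, item):
    return l.index(item)  # linear scan from the front, not a binary search; raises ValueError if item is absent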
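Since `list.index` scans linearly, a closer standard-library counterpart to the recursive versions is the `bisect` module, which performs the binary search itself. The sketch below is one possible wrapper; the name `bs_bisect` and the choice to return `None` for a missing item are assumptions made here, not part of the original three methods.

```python
from bisect import bisect_left


def bs_bisect(l, item):
    """Return the index of item in the sorted list l, or None if it is absent."""
    i = bisect_left(l, item)           # leftmost position where item could be inserted
    if i < len(l) and l[i] == item:    # item is present only if that slot already holds it
        return i
    return None


print(bs_bisect(list(range(10)), 5))   # 5
print(bs_bisect([1, 3, 5], 4))         # None
```

Unlike the recursive versions, this returns the leftmost match when the list contains duplicates.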
3. Performance analysis
from timeit import Timer

# Setup code: runs once before timing and defines the data and the three functions.
pre_statement = '''
a = list(range(10))
b = 5

def bs_1(l, item, left, right):  # method 1: l is the sorted list, left/right are inclusive index bounds
    if right >= left:
        mid = left + (right - left) // 2
        if l[mid] == item:
            return mid
        elif l[mid] > item:
            return bs_1(l, item, left, mid - 1)
        else:
            return bs_1(l, item, mid + 1, right)
    else:
        return 'item is not in this list'

def bs(l, item, left):  # method 2: l is a shorter slice on every call, left is the offset of l[0]
    if len(l) > 1:
        mid = len(l) // 2
        if l[mid] == item:
            return left + mid
        elif l[mid] < item:
            return bs(l[mid+1:], item, left + mid + 1)
        else:
            return bs(l[:mid], item, left)
    elif len(l) == 1:
        if l[0] == item:
            return left
        else:
            return 'item is not in this list'
    else:
        return 'item is not in this list'

def bs_2(l, item):  # method 3: linear scan with list.index
    return l.index(item)
'''

iter_statement1 = "bs_1(a,b,0,len(a)-1)"
iter_statement2 = "bs(a,b,0)"
iter_statement3 = "bs_2(a,b)"

# timeit() runs each statement 1,000,000 times by default and returns the total time in seconds.
print(Timer(iter_statement1, pre_statement).timeit())
print(Timer(iter_statement2, pre_statement).timeit())
print(Timer(iter_statement3, pre_statement).timeit())
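The setup-string approach above requires pasting every function into a string. As an alternative sketch, `timeit.timeit` accepts a `globals` argument (Python 3.5+), so functions defined normally can be timed directly. The name `bs_index`, the explicit namespace dict, and the reduced `number=10_000` are choices made for this illustration only.

```python
from timeit import timeit


def bs_index(l, item):
    """Method 3 from above: linear scan with list.index."""
    return l.index(item)


a = list(range(1000))
b = 450

# Pass the names the statement needs as an explicit namespace instead of a setup string.
namespace = {"bs_index": bs_index, "a": a, "b": b}
elapsed = timeit("bs_index(a, b)", globals=namespace, number=10_000)
print(f"{elapsed:.4f} s for 10,000 calls")
```

The other two functions can be timed the same way by adding them to the namespace. Note that totals measured with a different `number` are not directly comparable with the 1,000,000-run figures.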
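The results show that relative speed depends heavily on the list and on the target.
When the target is near the front of the list, method 3 is clearly much faster (index scans from the start, so a target near the end is its worst case). When the target is in the middle of a large list (around 1000 elements or more), method 1 is the fastest.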
Partial results (total seconds for 1,000,000 calls, printed in the order method 1, method 2, method 3):
a = list(range(10))
b = 5
1.0508554549887776
0.35678495501633734
0.18040671001654118 # fastest
a = list(range(10))
b = 1
0.6674665239988826
1.3116655169869773
0.14078226895071566 # fastest
a = list(range(1000))
b = 450
4.44339226197917 # fastest
6.785631975973956
5.672643766039982
a = list(range(1000))
b = 100
2.5528286850312725
6.406716412981041
1.5316121689975262 # fastest
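One way to read these timings is to count how many elements each strategy examines. The sketch below is an illustration under assumptions: `linear_probes` and `binary_probes` are hypothetical helpers written for this note, and a probe count is not a timing. In CPython, `list.index` runs in C while the recursive functions run as Python code, which is a plausible reason why index still wins at b = 100 despite examining far more elements.

```python
def linear_probes(l, item):
    """Count elements examined by a front-to-back scan, as list.index does."""
    for count, value in enumerate(l, start=1):
        if value == item:
            return count
    return len(l)                      # item absent: every element was examined


def binary_probes(l, item):
    """Count middle elements examined by a binary search on the sorted list l."""
    left, right, count = 0, len(l) - 1, 0
    while left <= right:
        mid = (left + right) // 2
        count += 1
        if l[mid] == item:
            return count
        if l[mid] < item:
            left = mid + 1
        else:
            right = mid - 1
    return count                       # item absent


a = list(range(1000))
for b in (1, 100, 450, 999):
    print(b, linear_probes(a, b), binary_probes(a, b))
```

The linear count grows with the position of the target, while the binary count never exceeds 10 for a list of 1000 elements.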
Conclusion
In most cases index is good enough. If the list is very large and index turns out to be slow, method 1 may give a noticeable speedup.