Reference: http://stackabuse.com/bubble-sort-in-python/
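I. How it works
Task: given an array, sort its elements in ascending order.
Method: compare each pair of adjacent elements and swap them if the first is larger than the second; otherwise leave them in place. Repeat the passes until no pair is out of order.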
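Example: take list = [11, 9, 5, 8, 1, 20].
First compare 11 and 9: 11 > 9, so swap them, giving [9, 11, 5, 8, 1, 20].
Next compare 11 and 5: 11 > 5, so swap them, giving [9, 5, 11, 8, 1, 20].
Continuing in this way, the first pass ends with [9, 5, 8, 1, 11, 20]: 11 has moved to the second-to-last position (it is not swapped with 20, because 11 < 20).
Likewise, the second pass ends with [5, 8, 1, 9, 11, 20], moving 9 to the third-to-last position.
After a finite number of passes (at most n - 1 for n elements), the final result is [1, 5, 8, 9, 11, 20].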
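The passes in the walkthrough above can be traced with a short sketch. The helper name bubble_passes is made up for this illustration; it sorts a copy and records the list after every full pass, assuming the plain version in which every pass scans the whole list.

```python
def bubble_passes(values):
    """Return the list state after each full pass of plain bubble sort."""
    items = list(values)  # work on a copy so the input is left untouched
    snapshots = []
    # n - 1 passes are always enough to sort n elements
    for _ in range(len(items) - 1):
        for j in range(len(items) - 1):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
        snapshots.append(list(items))
    return snapshots


passes = bubble_passes([11, 9, 5, 8, 1, 20])
for number, state in enumerate(passes, start=1):
    print(number, state)
# 1 [9, 5, 8, 1, 11, 20]
# 2 [5, 8, 1, 9, 11, 20]
# 3 [5, 1, 8, 9, 11, 20]
# 4 [1, 5, 8, 9, 11, 20]
# 5 [1, 5, 8, 9, 11, 20]
```

The fifth pass changes nothing: the list was already sorted after the fourth.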
II. Python implementation
1. Unoptimized version
Function definition:
# bubble sort function
# define the function
# note: the parameter name `list` shadows Python's built-in list type inside the function
def bubble_sort(list):
    # make one full pass over the list per outer iteration
    for i in range(len(list)):
        # the last pair compared is list[n-2] and list[n-1]
        for j in range(len(list)-1):
            if list[j] > list[j+1]:
                # swap them
                list[j], list[j+1] = list[j+1], list[j]
Calling the function:
list = [11,9,5,8,1,20]
bubble_sort(list)
print(list)
Output:
[1, 5, 8, 9, 11, 20]
To make it easier to compare the implementations, add a timer around the call:
import time
list = [11,9,5,8,1,20]
time_start = time.time()
bubble_sort(list)
time_end = time.time()
print("t", time_end - time_start, "s")
print(list)
Output:
t 0.00020813941955566406 s
[1, 5, 8, 9, 11, 20]
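2. First optimization
A pass that makes no swaps means the list is already sorted, so add a flag that stops the loop as soon as that happens, instead of always running all n passes.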
# bubble sort function, 1st optimization
# define the function
def bubble_sort_opt1(list):
    # avoid always running the full number of passes:
    # stop after the first pass that swaps nothing
    # add a flag
    has_swapped = True
    while has_swapped:
        # reset the flag at the start of each pass
        has_swapped = False
        for i in range(len(list)-1):
            if list[i] > list[i+1]:
                # swap them
                list[i], list[i+1] = list[i+1], list[i]
                has_swapped = True
Output (including the timing):
t 0.00021767616271972656 s
[1, 5, 8, 9, 11, 20]
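To see what the flag buys, the sketch below counts how many passes a flag-based bubble sort makes. The name count_passes is hypothetical and used only here; the function works on a copy of its input.

```python
def count_passes(values):
    """Bubble sort with an early-exit flag; returns (sorted_copy, passes)."""
    items = list(values)  # sort a copy, leave the input alone
    passes = 0
    has_swapped = True
    while has_swapped:
        has_swapped = False
        passes += 1
        for i in range(len(items) - 1):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                has_swapped = True
    return items, passes


print(count_passes([1, 5, 8, 9, 11, 20]))  # already sorted: 1 pass
print(count_passes([11, 9, 5, 8, 1, 20]))  # example list: 5 passes
```

On an already-sorted list the loop stops after a single pass, while the unoptimized version would still run 6 passes. On the example list the saving is only one pass (5 instead of 6), which fits the nearly identical timings.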
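3. Second optimization
Building on the first optimization, add a counter of completed passes and use it to shorten each pass: after k passes the last k elements are already in their final positions, so they need not be compared again.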
# bubble sort function, 2nd optimization
# define the function
def bubble_sort_opt2(list):
    # avoid going through the whole list on every pass
    # stop after a pass that swaps nothing
    # add a flag
    has_swapped = True
    # record the number of completed passes
    num_of_iterations = 0
    while has_swapped:
        has_swapped = False
        # the last num_of_iterations elements are already in place
        for i in range(len(list)-num_of_iterations-1):
            if list[i] > list[i+1]:
                list[i], list[i+1] = list[i+1], list[i]
                has_swapped = True
        num_of_iterations += 1
Output (including the timing):
t 0.0002117156982421875 s
[1, 5, 8, 9, 11, 20]
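The effect of shrinking the range can be made visible by counting the comparisons in each pass. This is a minimal sketch: comparisons_per_pass and its shrink parameter are illustrative names, not part of the code above.

```python
def comparisons_per_pass(values, shrink):
    """Return the number of comparisons made in each pass.

    shrink=False: every pass scans the full list (flag only).
    shrink=True: each pass skips the elements already in place.
    """
    items = list(values)  # sort a copy
    counts = []
    has_swapped = True
    while has_swapped:
        has_swapped = False
        skip = len(counts) if shrink else 0  # passes completed so far
        upper = max(len(items) - skip - 1, 0)
        for i in range(upper):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                has_swapped = True
        counts.append(upper)
    return counts


example = [11, 9, 5, 8, 1, 20]
print(comparisons_per_pass(example, shrink=False))  # [5, 5, 5, 5, 5]
print(comparisons_per_pass(example, shrink=True))   # [5, 4, 3, 2, 1]
```

For the example list that is 25 comparisons with the flag alone against 15 with the shrinking range.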
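4. Third optimization
Building on the first optimization, record the index of the last swap in each pass and use it as the boundary of the unsorted part: everything after the last swap is already in order.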
# bubble sort function, 3rd optimization
# define the function
def bubble_sort_opt3(list):
    # acquire the length of the list
    n = len(list)
    # record the position of the last swap
    last_Exchange_Index = 0
    # boundary of the unsorted part of the list
    sort_Border = n-1
    for i in range(n):
        # flag stays True as long as this pass has made no swap
        flag = True
        for j in range(0, sort_Border):
            if list[j] > list[j+1]:
                # swap
                list[j], list[j+1] = list[j+1], list[j]
                # change the flag
                flag = False
                last_Exchange_Index = j
        # elements after the last swap are already sorted
        sort_Border = last_Exchange_Index
        if flag:
            break
    return list
Output (including the timing):
t 0.00020813941955566406 s
[1, 5, 8, 9, 11, 20]
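The last-swap boundary is easiest to follow by printing it for each pass. The sketch below is a slightly simplified variant of the idea, written with a while loop rather than the flag used above; last_swap_borders is a hypothetical helper name.

```python
def last_swap_borders(values):
    """Sort a copy and return (sorted_copy, border used in each pass)."""
    items = list(values)
    border = len(items) - 1  # pairs (j, j+1) are checked for j < border
    borders = []
    while border > 0:
        borders.append(border)
        last_swap = 0  # stays 0 if this pass swaps nothing
        for j in range(border):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                last_swap = j
        # everything after the last swap is already in order
        border = last_swap
    return items, borders


print(last_swap_borders([11, 9, 5, 8, 1, 20]))
# ([1, 5, 8, 9, 11, 20], [5, 3, 2, 1])
print(last_swap_borders([2, 1, 3, 4, 5, 6]))
# ([1, 2, 3, 4, 5, 6], [5])
```

On the example list the boundary drops from 5 to 3 after the first pass, because 20 was never swapped. On the nearly sorted second list a single pass is enough.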
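III. Problems and reflections
Problem 1: when the four implementations are each run and timed separately, their running times differ very little.
Solution: time all four in one script, giving each its own copy of the list.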
import time
# give each version its own unsorted copy, so the optimized versions do not receive an already-sorted list
list = [11,9,5,8,1,20]
list1 = [11,9,5,8,1,20]
list2 = [11,9,5,8,1,20]
list3 = [11,9,5,8,1,20]
time_start = time.time()
bubble_sort(list)
time_end = time.time()
print("t", time_end - time_start, "s")
time_start_1 = time.time()
bubble_sort_opt1(list1)
time_end_1 = time.time()
print("t1", time_end_1 - time_start_1, "s")
time_start_2 = time.time()
bubble_sort_opt2(list2)
time_end_2 = time.time()
print("t2", time_end_2 - time_start_2, "s")
time_start_3 = time.time()
bubble_sort_opt3(list3)
time_end_3 = time.time()
print("t3", time_end_3 - time_start_3, "s")
print(list)
Results:
t 5.4836273193359375e-05 s
t1 4.982948303222656e-05 s
t2 4.220008850097656e-05 s
t3 8.20159912109375e-05 s
[1, 5, 8, 9, 11, 20]
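Analysis:
For this array, the first and second optimizations come out ahead in this run. With only 6 elements, however, every timing is a few tens of microseconds, so the differences are small, and a single time.time() measurement at this scale is likely dominated by measurement noise.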
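One way to get steadier numbers for such a tiny input is to repeat the call many times with the standard timeit module and keep the best result. This is only a sketch: best_time is a made-up helper, its repeats and number defaults are arbitrary, and only the plain bubble sort is redefined here to keep the block self-contained.

```python
import timeit


def bubble_sort(items):
    """Plain bubble sort, in place (same logic as the unoptimized version)."""
    for _ in range(len(items)):
        for j in range(len(items) - 1):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]


def best_time(sort_fn, data, repeats=5, number=1000):
    """Best average time per call in seconds, over several repeats.

    Each call sorts a fresh copy, so the cost of list(data) is included.
    """
    timings = timeit.repeat(lambda: sort_fn(list(data)),
                            repeat=repeats, number=number)
    return min(timings) / number


sample = [11, 9, 5, 8, 1, 20]
print("bubble_sort:", best_time(bubble_sort, sample), "s per call")
```

Passing the other three functions to best_time in the same way would put all four on the same footing.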
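Problem 2: the array is too small for the performance differences to show.
Solution: increase the data size to 10,000 random integers.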
import time
import numpy as np
# give each version its own unsorted input
# expand the volume of data
# note: each call below draws a different random array, and the inputs are NumPy arrays rather than Python lists
list = np.random.randint(0, 10000, size = 10000)
list1 = np.random.randint(0, 10000, size = 10000)
list2 = np.random.randint(0, 10000, size = 10000)
list3 = np.random.randint(0, 10000, size = 10000)
time_start = time.time()
bubble_sort(list)
time_end = time.time()
print("t", time_end - time_start, "s")
time_start_1 = time.time()
bubble_sort_opt1(list1)
time_end_1 = time.time()
print("t1", time_end_1 - time_start_1, "s")
time_start_2 = time.time()
bubble_sort_opt2(list2)
time_end_2 = time.time()
print("t2", time_end_2 - time_start_2, "s")
time_start_3 = time.time()
bubble_sort_opt3(list3)
time_end_3 = time.time()
print("t3", time_end_3 - time_start_3, "s")
Results:
t 28.813856840133667 s
t1 28.481967210769653 s
t2 18.62924838066101 s
t3 18.53864026069641 s
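Analysis:
On the 10,000-element sort, the last two optimizations are clearly faster than the first two implementations, taking roughly 35% less time (about 18.6 s versus 28.5 to 28.8 s). The early-exit flag alone (first optimization) barely helps, which is consistent with random data rarely producing a swap-free pass until almost the very end. Each version sorted a different random array, so the comparison is approximate. All four variants remain O(n^2), so these numbers suggest, rather than prove, that the last two implementations will also be the better choice at larger sizes.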
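One plausible reason the saving is about a third rather than a half is that shrinking the range only cuts comparisons, while the number of swaps stays the same. The sketch below counts both on identical copies of one random list. The name count_work, the seed, and the size of 500 are arbitrary choices for illustration.

```python
import random


def count_work(values, shrink):
    """Sort a copy of values; return (comparisons, swaps).

    shrink=False: n full passes, as in the unoptimized version.
    shrink=True: early exit plus a range that shrinks by one per pass.
    """
    items = list(values)
    n = len(items)
    comparisons = swaps = 0
    for done in range(n):
        swapped = False
        upper = n - 1 - done if shrink else n - 1
        for j in range(upper):
            comparisons += 1
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swaps += 1
                swapped = True
        if shrink and not swapped:
            break
    return comparisons, swaps


rng = random.Random(0)  # fixed seed so both variants see the same data
data = [rng.randrange(10000) for _ in range(500)]
plain = count_work(data, shrink=False)
shrunk = count_work(data, shrink=True)
print("plain  (comparisons, swaps):", plain)
print("shrunk (comparisons, swaps):", shrunk)
```

The plain version always makes n(n-1) comparisons, the shrinking version at most n(n-1)/2. Both make exactly one swap per inversion in the input, so that part of the cost cannot be optimized away by either variant.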