Reference: http://cnblogs.com/onepixel/articles7674659.html
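I. Principle
Task: given an array, arrange its elements in ascending order.
Method: repeatedly select the smallest element of the unsorted part and append it to the end of the sorted part.
Example: take list = [11, 9, 5, 8, 1, 20].
Starting from the first element, scan the unsorted part for the smallest element, which is 1.
Append 1 to the sorted part: the sorted part becomes [1] and the unsorted part becomes [11, 9, 5, 8, 20].
Scan the unsorted part again from its first element; the smallest element is now 5.
Append 5 to the sorted part: the sorted part becomes [1, 5] and the unsorted part becomes [11, 9, 8, 20].
Repeating this process eventually yields the sorted array [1, 5, 8, 9, 11, 20].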
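The walkthrough above can be reproduced with a small sketch that keeps the sorted and unsorted parts as two separate lists. The name selection_steps is a hypothetical helper introduced only for illustration; the in-place versions later in this post avoid the extra list.

```python
def selection_steps(values):
    """Yield (sorted_part, unsorted_part) after each selection step."""
    unsorted_part = list(values)  # work on a copy, leave the input untouched
    sorted_part = []
    while unsorted_part:
        smallest = min(unsorted_part)    # scan the unsorted part for its minimum
        unsorted_part.remove(smallest)   # remove the first occurrence of that value
        sorted_part.append(smallest)     # append it to the end of the sorted part
        yield list(sorted_part), list(unsorted_part)


for sorted_part, unsorted_part in selection_steps([11, 9, 5, 8, 1, 20]):
    print(sorted_part, unsorted_part)
```

The first two printed lines are [1] [11, 9, 5, 8, 20] and [1, 5] [11, 9, 8, 20], matching the steps described above.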
II. Implementation in Python
1 Initial code:
Function definition:
# Selection sort, initial version
def select_sort(arr):
    n = len(arr)
    # Walk through every position except the last
    for i in range(n - 1):
        # Treat arr[i] as the current minimum and compare it with each later element
        for j in range(i + 1, n):
            # If arr[j] is smaller than arr[i], swap them immediately
            if arr[j] < arr[i]:
                arr[i], arr[j] = arr[j], arr[i]
    return arr
Calling the function:
import time
data = [11, 9, 5, 8, 1, 20]
t1 = time.time()
select_sort(data)
t2 = time.time()
print(data)
print("t", t2 - t1, "s")
Result:
[1, 5, 8, 9, 11, 20]
t 0.00020575523376464844 s
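The initial version swaps every time it meets a smaller element, so one pass can swap several times. The sketch below is an instrumented copy of that logic that counts the swaps; count_swaps_initial is a hypothetical name used only for this illustration.

```python
def count_swaps_initial(values):
    """Sort a copy with the swap-on-every-smaller-element logic and count swaps."""
    arr = list(values)
    swaps = 0
    n = len(arr)
    for i in range(n - 1):
        for j in range(i + 1, n):
            if arr[j] < arr[i]:
                arr[i], arr[j] = arr[j], arr[i]
                swaps += 1
    return arr, swaps


result, swaps = count_swaps_initial([11, 9, 5, 8, 1, 20])
print(result, swaps)  # [1, 5, 8, 9, 11, 20] 9
```

For this six-element input the loop performs 9 swaps, whereas placing one element per pass needs at most 5.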
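2 First optimization
While searching for the minimum, the initial algorithm performs many unnecessary swaps; each outer iteration needs at most one swap.
Solution: add a variable that records the index of the smallest element found so far.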
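# Optimized version: track the index of the minimum and swap once per pass
def select_sort_opt1(arr):
    n = len(arr)
    # Count the number of swaps
    count = 0
    # Walk through every position except the last
    for i in range(n - 1):
        # Assume arr[i] is the minimum and record its index
        min_index = i
        # Compare the current minimum with each later element
        for j in range(i + 1, n):
            # Compare against arr[min_index], not arr[i]; otherwise min_index ends up
            # at the last element smaller than arr[i] rather than at the smallest one
            if arr[j] < arr[min_index]:
                min_index = j
        # Swap at most once per pass
        if min_index != i:
            arr[i], arr[min_index] = arr[min_index], arr[i]
            count += 1
    print("swapped", count, "times")
    return arr
Result of the call:
swapped 3 times
[1, 5, 8, 9, 11, 20]
(The run originally recorded here compared list[j] with list[i] and printed "swapped 2 times [1, 8, 5, 9, 11, 20] t 0.0012645721435546875 s", which is not fully sorted.)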
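An output that is only partially ordered is easy to miss on a six-element example. A minimal check, assuming Python's built-in sorted() as the reference, is to compare the two on many random inputs. The function name selection_sort_min_index and the input sizes are illustrative choices, not part of the original code.

```python
import random


def selection_sort_min_index(values):
    """Return (sorted copy, number of swaps) using a running minimum index."""
    arr = list(values)
    swaps = 0
    for i in range(len(arr) - 1):
        min_index = i
        for j in range(i + 1, len(arr)):
            # Always compare with the smallest element seen so far
            if arr[j] < arr[min_index]:
                min_index = j
        if min_index != i:
            arr[i], arr[min_index] = arr[min_index], arr[i]
            swaps += 1
    return arr, swaps


random.seed(0)  # fixed seed so the check is repeatable
for _ in range(200):
    sample = [random.randint(0, 50) for _ in range(random.randint(0, 30))]
    result, swaps = selection_sort_min_index(sample)
    assert result == sorted(sample)          # same order as the reference
    assert swaps <= max(len(sample) - 1, 0)  # at most one swap per pass
print("all random checks passed")
```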
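3 Second optimization
The previous step tracks the index of the minimum. We can also track the index of the maximum, which halves the number of outer-loop iterations.
That is, place the minimum and the maximum at the two ends and work inward from both sides.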
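# Second optimized version: place the minimum and the maximum in each outer iteration
def select_sort_opt2(arr):
    n = len(arr)
    # Each outer iteration fixes one element at each end
    for i in range(n // 2):
        # Assume arr[i] is the minimum and arr[n-i-1] is the maximum
        min_index = i
        max_index = n - i - 1
        # Compare the current minimum with each later element
        for j in range(i + 1, n):
            # Compare against arr[min_index], not arr[i]
            if arr[j] < arr[min_index]:
                min_index = j
        if min_index != i:
            arr[i], arr[min_index] = arr[min_index], arr[i]
        # Compare the current maximum with the elements between positions i and n-i-1
        for j in range(n - i - 2, i, -1):
            # Compare against arr[max_index], not arr[n-i-1]
            if arr[j] > arr[max_index]:
                max_index = j
        if max_index != n - i - 1:
            arr[n - i - 1], arr[max_index] = arr[max_index], arr[n - i - 1]
    return arr
Result of the call:
[1, 5, 8, 9, 11, 20]
(The run originally recorded here compared against list[i] and list[n-i-1] rather than the running minimum and maximum, and printed "[1, 8, 5, 9, 11, 20] t 0.0002200603485107422 s", which is not fully sorted.)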
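The version above scans twice per outer iteration. A possible variant, sketched here under the assumption that a single scan over the unsorted middle is acceptable, finds the minimum and the maximum in one loop. The name double_selection_sort is hypothetical, and this sketch makes no claim about being faster in practice.

```python
def double_selection_sort(values):
    """Return a sorted copy, placing one minimum and one maximum per pass."""
    arr = list(values)
    left, right = 0, len(arr) - 1
    while left < right:
        min_index = max_index = left
        # One scan over the unsorted range finds both extremes
        for j in range(left + 1, right + 1):
            if arr[j] < arr[min_index]:
                min_index = j
            elif arr[j] > arr[max_index]:
                max_index = j
        arr[left], arr[min_index] = arr[min_index], arr[left]
        # If the maximum was sitting at `left`, the swap just moved it to min_index
        if max_index == left:
            max_index = min_index
        arr[right], arr[max_index] = arr[max_index], arr[right]
        left += 1
        right -= 1
    return arr


print(double_selection_sort([11, 9, 5, 8, 1, 20]))  # [1, 5, 8, 9, 11, 20]
```

The index adjustment after the first swap matters: without it, an input such as [5, 2, 1] would have its maximum moved away before the second swap reads it.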
4 Comparing the optimizations
Run the three functions together and compare their running times.
import time
data = [11, 9, 5, 8, 1, 20]
data1 = [11, 9, 5, 8, 1, 20]
data2 = [11, 9, 5, 8, 1, 20]
t1 = time.time()
select_sort(data)
t2 = time.time()
print("t", t2 - t1, "s")
t1 = time.time()
select_sort_opt1(data1)
t2 = time.time()
print("t", t2 - t1, "s")
t1 = time.time()
select_sort_opt2(data2)
t2 = time.time()
print("t", t2 - t1, "s")
print(data)
Result:
t 0.00020170211791992188 s
t 0.00025010108947753906 s
t 0.0002582073211669922 s
[1, 5, 8, 9, 11, 20]
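A single time.time() measurement of a call that lasts a fraction of a millisecond is dominated by noise. One common alternative is the standard-library timeit module, which repeats the call many times. The sketch below assumes a copy-returning selection_sort (a hypothetical stand-in for the functions above) so that every repetition sorts the same unsorted input.

```python
import timeit


def selection_sort(values):
    """Return a sorted copy using min-index selection sort."""
    arr = list(values)
    for i in range(len(arr) - 1):
        min_index = i
        for j in range(i + 1, len(arr)):
            if arr[j] < arr[min_index]:
                min_index = j
        arr[i], arr[min_index] = arr[min_index], arr[i]
    return arr


data = [11, 9, 5, 8, 1, 20]
calls = 10_000
# Five independent runs of 10,000 calls each; the minimum is the least disturbed run
totals = timeit.repeat(lambda: selection_sort(data), number=calls, repeat=5)
best_per_call = min(totals) / calls
print(f"best of 5 runs: {best_per_call:.2e} s per call")
```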
With so little data the difference in performance does not show, so we reuse yesterday's approach of generating random arrays.
import time
import numpy as np
# Use a separate array for each function so the optimized versions do not sort already-sorted data
# Increase the amount of data
data = np.random.randint(0, 10000, size=10000)
data1 = np.random.randint(0, 10000, size=10000)
data2 = np.random.randint(0, 10000, size=10000)
time_start = time.time()
select_sort(data)
time_end = time.time()
print("t", time_end - time_start, "s")
time_start_1 = time.time()
select_sort_opt1(data1)
time_end_1 = time.time()
print("t1", time_end_1 - time_start_1, "s")
time_start_2 = time.time()
select_sort_opt2(data2)
time_end_2 = time.time()
print("t2", time_end_2 - time_start_2, "s")
Elapsed time:
t 16.561472415924072 s
t1 11.342761754989624 s
t2 14.73498249053955 s
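The three arrays above are drawn independently, so each function sorts different data. A fairer comparison, sketched here with only the standard library, gives every function its own copy of the same input. The helper time_on_copies, the two sort function names, and the input size of 1,000 are assumptions made for this illustration; no timings are claimed.

```python
import random
import time


def sort_with_immediate_swaps(arr):
    """In-place sort that swaps whenever a smaller element is met."""
    n = len(arr)
    for i in range(n - 1):
        for j in range(i + 1, n):
            if arr[j] < arr[i]:
                arr[i], arr[j] = arr[j], arr[i]
    return arr


def sort_with_min_index(arr):
    """In-place sort that swaps once per pass using a running minimum index."""
    n = len(arr)
    for i in range(n - 1):
        min_index = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_index]:
                min_index = j
        if min_index != i:
            arr[i], arr[min_index] = arr[min_index], arr[i]
    return arr


def time_on_copies(sort_functions, data):
    """Time each function on its own copy of `data`; return {name: seconds}."""
    timings = {}
    for name, func in sort_functions.items():
        working = list(data)  # fresh copy, so no function sees sorted input
        start = time.perf_counter()
        func(working)
        timings[name] = time.perf_counter() - start
        assert working == sorted(data)  # refuse to report a time for a wrong result
    return timings


random.seed(42)
data = [random.randint(0, 10_000) for _ in range(1_000)]
timings = time_on_copies(
    {"immediate swaps": sort_with_immediate_swaps, "min index": sort_with_min_index},
    data,
)
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```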
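III. Summary and reflections
Judging from the results above, for sorting data at this scale, selection sort performs better than bubble sort and also consumes fewer resources.
In addition, the second optimization takes no less time than the first. The likely reason is that each outer iteration now performs two scans and up to two swaps: the outer loop runs half as many times, but as written the two scans together make roughly 5n^2/8 comparisons versus about n^2/2 for the first optimization, and the code is more complex.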
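The comparison counts behind this observation can be checked directly. The sketch below mirrors the loop ranges of select_sort_opt1 and select_sort_opt2 and counts inner-loop comparisons instead of measuring time. The two function names are hypothetical, and the counts depend only on the input length, not on its contents.

```python
def count_comparisons_min_only(values):
    """Sort a copy with one min scan per pass; return (result, comparisons)."""
    arr = list(values)
    n = len(arr)
    comparisons = 0
    for i in range(n - 1):
        min_index = i
        for j in range(i + 1, n):
            comparisons += 1
            if arr[j] < arr[min_index]:
                min_index = j
        arr[i], arr[min_index] = arr[min_index], arr[i]
    return arr, comparisons


def count_comparisons_min_and_max(values):
    """Sort a copy with a min scan and a max scan per pass; return (result, comparisons)."""
    arr = list(values)
    n = len(arr)
    comparisons = 0
    for i in range(n // 2):
        min_index = i
        max_index = n - i - 1
        for j in range(i + 1, n):          # min scan still covers the sorted tail
            comparisons += 1
            if arr[j] < arr[min_index]:
                min_index = j
        arr[i], arr[min_index] = arr[min_index], arr[i]
        for j in range(n - i - 2, i, -1):  # max scan covers only the middle
            comparisons += 1
            if arr[j] > arr[max_index]:
                max_index = j
        arr[n - i - 1], arr[max_index] = arr[max_index], arr[n - i - 1]
    return arr, comparisons


data = list(range(1000, 0, -1))  # 1,000 elements in descending order
print(count_comparisons_min_only(data)[1])     # 499500
print(count_comparisons_min_and_max(data)[1])  # 624250
```

For 1,000 elements the two-scan version makes about 25% more comparisons, even though its outer loop runs half as many times.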