Problem:
Given an array of 10,000 random integers, select the biggest 100 numbers.
1) The order of the result numbers does not matter;
2) Pay attention to algorithm performance and big-O complexity.
My solution:
#coding=utf-8
## generate random numbers
from random import randint
# lower and upper bounds of the random numbers
low = -10000000
high = 10000000
# how many random numbers to generate (N)
total_number = 10000
# how many of the biggest numbers we need (K)
max_number = 100
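# Would a generator expression (...) be more efficient than a list [...]?
# Not here: both sorts below index into the data, so a real list is needed.
numbers = [randint(low, high) for _ in range(total_number)]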
#print numbers
"""
When the dataset is not large, "sort, then select" is still worth considering.
Take quick sort as an example: its average complexity is O(N*logN),
and taking the K biggest numbers afterwards costs O(K), so the total
time complexity is O(N*logN) + O(K) = O(N*logN).
If we only find the biggest K numbers and leave the other N-K numbers unsorted,
a partial selection sort does the job in O(N*K).
Comparing O(N*logN) with O(N*K) (ignoring constant factors), quick sort is the
more efficient choice when K > logN, and the partial sort wins when K < logN.
Here N = 10000 and K = 100, and log2(N) is about 13.3, so quick sort is chosen.
"""
from math import log
def quick_sort(numbers):
    # Returns a new list in ascending order; the first element is the pivot.
    if not numbers:
        return []
    pivot, rest = numbers[0], numbers[1:]
    return (quick_sort([y for y in rest if y < pivot]) + [pivot] +
            quick_sort([y for y in rest if y >= pivot]))
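def selection_sort_part(numbers):
    # Partial selection sort, in place: moves the max_number biggest values
    # to the front of the list, in descending order.
    size = len(numbers)
    # i only covers [0, max_number - 1] rather than [0, len(numbers) - 1],
    # so the selection step runs only K times.
    for i in range(max_number):
        k = i
        for j in range(i + 1, size):
            # remember the index of the biggest value seen in numbers[i:]
            if numbers[j] > numbers[k]:
                k = j
        # compare indices by value; `is not` tests object identity
        if k != i:
            numbers[i], numbers[k] = numbers[k], numbers[i]
    return numbers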
def main():
    # K >= log2(N): sorting everything is cheaper; otherwise select only K items.
    if max_number >= log(total_number, 2):
        print('result from quick_sort algorithm: %s' % quick_sort(numbers)[-max_number:])
    else:
        print('result from selection_sort_part algorithm: %s' % selection_sort_part(numbers)[:max_number])
if __name__ == '__main__':
    main()
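
A possible third option, not part of my solution above: keep a min-heap of at most K elements while scanning the array once. Each number is compared with the smallest of the current top K and replaces it only if it is bigger, which should give O(N*logK) time and O(K) extra space. The sketch below is only an illustration under those assumptions; the name top_k_heap is hypothetical, and heapq is Python's standard-library heap module.

```python
import heapq
from random import randint


def top_k_heap(values, k):
    """Return the k biggest items of values, in no particular order."""
    if k <= 0:
        return []
    heap = []  # min-heap holding the k biggest values seen so far
    for value in values:
        if len(heap) < k:
            heapq.heappush(heap, value)
        elif value > heap[0]:
            # value beats the smallest of the current top k: swap it in
            heapq.heapreplace(heap, value)
    return heap


if __name__ == '__main__':
    sample = [randint(-10000000, 10000000) for _ in range(10000)]
    biggest = top_k_heap(sample, 100)
    # cross-check against the "sort, then select" answer
    assert sorted(biggest) == sorted(sample)[-100:]
```

The standard library's heapq.nlargest(k, values) solves the same problem in one call and returns the result in descending order; the hand-written loop is shown only to make the O(N*logK) reasoning visible.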