This article is translated from: Write a program to find 100 largest numbers out of an array of 1 billion numbers
I recently attended an interview where I was asked to "write a program to find the 100 largest numbers out of an array of 1 billion numbers."
I was only able to give a brute-force solution, which was to sort the array in O(n log n) time and take the last 100 numbers.
Arrays.sort(array);
int[] largest = Arrays.copyOfRange(array, array.length - 100, array.length);
The interviewer was looking for a better time complexity. I tried a couple of other solutions but could not give him an answer. Is there a solution with better time complexity?
#1
Reference: https://stackoom.com/question/1Ig0A/编写一个程序-从-亿个数字的数组中找出-个最大的数字
#2
You can iterate over the numbers, which takes O(n).
Whenever you find a value greater than the current minimum, add the new value to a circular queue of size 100.
The minimum of that circular queue is your new comparison value. Keep adding to the queue. If it is full, extract the minimum from the queue.
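One possible reading of this answer, sketched in Java: keep a fixed-size buffer of k candidates and track where its minimum sits. The class name `TopKBuffer` and the helpers `topK` and `indexOfMin` are hypothetical names chosen for illustration, and a plain array stands in for the "circular queue" the answer mentions. Note that each replacement triggers an O(k) rescan for the new minimum, so the worst case is O(n·k) rather than strictly O(n).

```java
import java.util.Arrays;

class TopKBuffer {
    // Hypothetical helper: returns the k largest values of data, in no particular order.
    static int[] topK(int[] data, int k) {
        if (k <= 0 || data.length == 0) {
            return new int[0];
        }
        int size = Math.min(k, data.length);
        // Seed the buffer with the first `size` values.
        int[] buf = Arrays.copyOf(data, size);
        int minIdx = indexOfMin(buf);
        for (int i = size; i < data.length; i++) {
            // Only values larger than the current minimum can enter the buffer.
            if (data[i] > buf[minIdx]) {
                buf[minIdx] = data[i];     // overwrite (evict) the current minimum
                minIdx = indexOfMin(buf);  // O(k) rescan for the new minimum
            }
        }
        return buf;
    }

    // Linear scan for the position of the smallest element.
    static int indexOfMin(int[] a) {
        int idx = 0;
        for (int i = 1; i < a.length; i++) {
            if (a[i] < a[idx]) {
                idx = i;
            }
        }
        return idx;
    }
}
```

For the interview question, this would be called as `TopKBuffer.topK(array, 100)`.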
#3
You can keep a priority queue of the 100 largest numbers and iterate through the billion numbers. Whenever you encounter a number greater than the smallest number in the queue (the head of the queue), remove the head of the queue and add the new number to the queue.
EDIT: as Dev noted, with a priority queue implemented as a heap, the complexity of insertion into the queue is O(log N), where N is the number of elements in the queue.
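A minimal sketch of the priority-queue approach in Java, assuming `java.util.PriorityQueue` (a binary min-heap, so its head is the smallest element). The class name `TopKHeap` and the method `topK` are hypothetical names used only for this example. With a queue capped at k elements, each insertion costs O(log k), giving roughly O(n log k) overall.

```java
import java.util.PriorityQueue;

class TopKHeap {
    // Hypothetical helper: returns the k largest values of data, largest first.
    static int[] topK(int[] data, int k) {
        if (k <= 0) {
            return new int[0];
        }
        // Min-heap: peek() and poll() return the smallest element held.
        PriorityQueue<Integer> heap = new PriorityQueue<>(k);
        for (int x : data) {
            if (heap.size() < k) {
                heap.add(x);               // still filling the first k slots
            } else if (x > heap.peek()) {
                heap.poll();               // drop the smallest of the current top k
                heap.add(x);               // O(log k) insertion
            }
        }
        // poll() yields ascending order, so fill the result from the back.
        int[] out = new int[heap.size()];
        for (int i = out.length - 1; i >= 0; i--) {
            out[i] = heap.poll();
        }
        return out;
    }
}
```

For the interview question, this would be called as `TopKHeap.topK(array, 100)`; the heap never holds more than 100 elements, so memory use stays constant regardless of the input size.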