215. Kth Largest Element in an Array
Problem Description:
Given an integer array nums and an integer k, return the kth largest element in the array.
Note that it is the kth largest element in the sorted order, not the kth distinct element.
Thoughts
A straightforward approach: sort the array in ascending order and take nums[-k], or sort it in descending order and take nums[k-1]. Sorting costs O(n log n), for example with Python's nums.sort().
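The sorting baseline can be sketched as follows. The function name `kth_largest_by_sorting` is only illustrative, and it uses `sorted` (which copies) rather than `nums.sort()` so the caller's list is left untouched.

```python
def kth_largest_by_sorting(nums, k):
    """Baseline: sort in descending order and take the k-th item (1-based)."""
    # sorted() returns a new list, so the input list is not modified.
    return sorted(nums, reverse=True)[k - 1]


print(kth_largest_by_sorting([3, 2, 1, 5, 6, 4], 2))           # 5
print(kth_largest_by_sorting([3, 2, 3, 1, 2, 4, 5, 5, 6], 4))  # 4
```

Duplicates count as separate positions here, which matches the "sorted order, not distinct" wording of the problem.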
Think deeper:
We only want the kth largest element, so we don't need to order the whole list; sorting everything does more work than necessary. Quickselect finds the kth largest or kth smallest element in O(n) time on average (the worst case is O(n^2) when the pivots are consistently bad).
Any exceptions or extreme cases?
nums contains duplicates - this doesn't affect the quickselect algorithm, because each partition step splits nums into 2 parts: [all the elements that are larger than or equal to the pivot] + [all the elements that are smaller than or equal to the pivot]. Elements strictly larger than the pivot always end up in the left part. In the extreme case where all the numbers are the same (all 5s, for example), the 2 parts are still balanced: neither pointer skips a 5, so every 5 on the left is swapped with a 5 on the right, both pointers advance one step per swap, and they cross in the middle.
length of nums is less than k - there is no valid result
there is only 1 number and we need to find the 1st largest - just return the number
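The cases above can be checked with a small sketch. `partition_desc` and `require_valid_k` are illustrative names, not part of the solution. The partition assumes the middle element as the pivot and moves larger values to the left, and raising `ValueError` for an out-of-range k is only one assumed way to handle that case.

```python
def partition_desc(nums, start, end):
    """Two-pointer partition around the middle element, larger values first.

    Returns (left, right) after the pointers cross: every value in
    nums[start..right] is >= pivot and every value in nums[left..end] is <= pivot.
    """
    pivot = nums[(start + end) // 2]
    left, right = start, end
    while left <= right:
        while left <= right and nums[left] > pivot:
            left += 1
        while left <= right and nums[right] < pivot:
            right -= 1
        if left <= right:
            nums[left], nums[right] = nums[right], nums[left]
            left += 1
            right -= 1
    return left, right


def require_valid_k(nums, k):
    """Raise ValueError when there is no k-th largest element to return."""
    if not 1 <= k <= len(nums):
        raise ValueError(f"k={k} is out of range for {len(nums)} element(s)")


# All duplicates: the pointers cross in the middle, so the parts stay balanced.
fives = [5] * 6
left, right = partition_desc(fives, 0, len(fives) - 1)
print(right + 1, len(fives) - left)  # 3 3

# Single element, k = 1: valid, and the answer is that element.
require_valid_k([7], 1)

# k larger than the array: rejected.
try:
    require_valid_k([1, 2], 3)
except ValueError as exc:
    print(exc)  # k=3 is out of range for 2 element(s)
```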
Python Code
def findKthLargest(nums, k):
    return quickselect(nums, 0, len(nums) - 1, k)


def quickselect(nums, start, end, k):
    # Only 1 element remains in the range: return it. This is the last value
    # we reach after all the partitions.
    if start == end:
        return nums[start]

    pivot = nums[(start + end) // 2]
    left, right = start, end

    # Use <= here. With left < right the two pointers can stop on the same
    # position, so both partitions include that position, which wastes time.
    # It is also incorrect: a 2-element range can produce a partition that
    # still has length 2 ([2, 1] => [2], [2, 1]), so the recursion never
    # ends and the stack overflows.
    while left <= right:
        # Use > pivot and < pivot here. If the pivot happens to be the
        # smallest number in the range and we use nums[left] >= pivot, the
        # left pointer moves all the way to end + 1 while right stays at end,
        # so the 2 partitions are [nums] and []. Nothing changes and the
        # recursion never ends.
        while left <= right and nums[left] > pivot:
            left += 1
        while left <= right and nums[right] < pivot:
            right -= 1
        if left <= right:
            nums[left], nums[right] = nums[right], nums[left]
            left += 1
            right -= 1

    # After the loop right < left and the range looks like
    # [start, ..., right][pivot][left, ..., end]; the pivot in the middle may
    # or may not be there. So check both left and right, not just one of
    # them, because they may not be adjacent: the answer can be the pivot at
    # index right + 1.
    # Even when right - start + 1 == k we cannot return nums[right], because
    # [start, ..., right] is not sorted. We only know these k numbers are
    # larger than or equal to the pivot; nums[right] may not be the smallest.
    if left - start + 1 <= k:
        return quickselect(nums, left, end, k - left + start)
    if right - start + 1 >= k:
        return quickselect(nums, start, right, k)
    return nums[right + 1]
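As a cross-check, the same partition can be written as a loop instead of recursion and compared against sorting on random inputs. This is a variation for illustration, not the solution above: `find_kth_largest_iterative` is a made-up name, it works on a copy so the caller's list is not reordered, and it tracks the absolute index `k - 1` instead of adjusting k at each step.

```python
import random


def find_kth_largest_iterative(nums, k):
    """Quickselect as a loop, using the same two-pointer partition."""
    nums = list(nums)           # work on a copy
    target = k - 1              # index of the answer in descending order
    start, end = 0, len(nums) - 1
    while start < end:
        pivot = nums[(start + end) // 2]
        left, right = start, end
        while left <= right:
            while left <= right and nums[left] > pivot:
                left += 1
            while left <= right and nums[right] < pivot:
                right -= 1
            if left <= right:
                nums[left], nums[right] = nums[right], nums[left]
                left += 1
                right -= 1
        if target <= right:     # answer lies in the ">= pivot" part
            end = right
        elif target >= left:    # answer lies in the "<= pivot" part
            start = left
        else:                   # answer is a pivot value between the parts
            return nums[target]
    return nums[start]


random.seed(0)
for _ in range(200):
    data = [random.randint(-5, 5) for _ in range(random.randint(1, 12))]
    k = random.randint(1, len(data))
    assert find_kth_largest_iterative(data, k) == sorted(data, reverse=True)[k - 1]

print(find_kth_largest_iterative([3, 2, 1, 5, 6, 4], 2))  # 5
```

The loop form avoids Python's recursion limit on long inputs; the three-way check after the partition is the same one the recursive version performs.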
Easy-to-make mistakes
if start == end:
    return nums[start]
left <= right
nums[left] > pivot
nums[right] < pivot
if left - start + 1 <= k:
    return quickselect(nums, left, end, k - left + start)
if right - start + 1 >= k:
    return quickselect(nums, start, right, k)
return nums[right + 1]
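To see why `left <= right` matters, here is a deliberately broken sketch that replaces every `<=` in the pointer loop with `<`. The name `quickselect_strict_bounds` is illustrative only; everything else follows the code above. On the two-element input [2, 1] the range never shrinks, so Python eventually raises `RecursionError`.

```python
def quickselect_strict_bounds(nums, start, end, k):
    """Broken on purpose: uses `<` where the pointer loop needs `<=`."""
    if start == end:
        return nums[start]
    pivot = nums[(start + end) // 2]
    left, right = start, end
    while left < right:                      # should be left <= right
        while left < right and nums[left] > pivot:
            left += 1
        while left < right and nums[right] < pivot:
            right -= 1
        if left < right:
            nums[left], nums[right] = nums[right], nums[left]
            left += 1
            right -= 1
    if left - start + 1 <= k:
        return quickselect_strict_bounds(nums, left, end, k - left + start)
    if right - start + 1 >= k:
        return quickselect_strict_bounds(nums, start, right, k)
    return nums[right + 1]


try:
    quickselect_strict_bounds([2, 1], 0, 1, 1)
except RecursionError:
    # Both pointers stop at index 0, so the recursive call gets the same
    # range (0, 1) and the same k again.
    print("RecursionError: the range [2, 1] never shrinks")
```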