Given an array of integers citations
where citations[i]
is the number of citations a researcher received for their ith
paper and citations
is sorted in an ascending order, return compute the researcher's h
-index.
According to the definition of h-index on Wikipedia: A scientist has an index h
if h
of their n
papers have at least h
citations each, and the other n − h
papers have no more than h
citations each.
If there are several possible values for h
, the maximum one is taken as the h
-index.
You must write an algorithm that runs in logarithmic time.
Example 1:
Input: citations = [0,1,3,5,6] Output: 3 Explanation: [0,1,3,5,6] means the researcher has 5 papers in total and each of them had received 0, 1, 3, 5, 6 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, their h-index is 3.
Example 2:
Input: citations = [1,2,100] Output: 2
Constraints:
n == citations.length
1 <= n <= 105
0 <= citations[i] <= 1000
citations
is sorted in ascending order.
题目里又是有序数组又是O(logn)时间复杂度的,相当于直接告诉我们要用二分查找法。仔细分析题目可以把题目简化为:给定一个长度为n的有序数组citations
,从[1,n]范围内找一个最大数h,使得数组里至少有h个数的数值大于等于h。这样用二分查找法已经很明显且很容易了。
解题思路:把长度为n的数组citations从中点mid = (l + r) // 2 分成左右两个区间,中点数的值是citations[mid],由于数组是递增的,那么从mid开始到结尾都是大于等于中点数值citations[mid]的,个数为n - mid, 如果citations[mid]大于等于n-mid,那么说明存在一个h = n - mid使得数组里至少存在h个数的数值大于等于h,因此可以试图查找更长的h, 也就是继续在左半区间里查找。如果citations[mid]小于n-mid,左半区间的数值肯定都小于h=n-mid,则大于等于h = n - mid的个数肯定少于h个,因此需要在右半区间查找更短一点的h。
基于以上的思路就很容易实现二分查找的算法了。
class Solution:
def hIndex(self, citations: List[int]) -> int:
n = len(citations)
l, r = 0, n - 1
while l <= r:
mid = l + (r - l) // 2
h = n - mid
if citations[mid] >= h:
r = mid - 1
else:
l = mid + 1
return n - l