Problem: Given an array of citations (each citation is a non-negative integer) of a researcher, write a function to compute the researcher’s h-index.
According to the definition of h-index on Wikipedia: “A scientist has index h if h of his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each.”
For example, given citations = [3, 0, 6, 1, 5], which means the researcher has 5 papers in total and each of them had received 3, 0, 6, 1, 5 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, his h-index is 3.
Note: If there are several possible values for h, the maximum one is taken as the h-index.
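The definition can be checked directly by brute force before looking at faster approaches. The sketch below is only an illustration of the definition, not an efficient solution: it tries every candidate h from n down to 1 and returns the first one for which at least h papers have at least h citations. The class name `HIndexByDefinition` is a hypothetical name chosen for this example.

```java
class HIndexByDefinition {
    // Returns the largest h such that at least h papers have >= h citations.
    // Runs in O(n^2) time; intended only to mirror the definition.
    static int hIndex(int[] citations) {
        int n = citations.length;
        for (int h = n; h >= 1; h--) {
            int atLeast = 0; // papers with at least h citations
            for (int c : citations) {
                if (c >= h) atLeast++;
            }
            if (atLeast >= h) return h;
        }
        return 0; // no paper has any citation, or the array is empty
    }

    public static void main(String[] args) {
        System.out.println(hIndex(new int[] {3, 0, 6, 1, 5})); // prints 3
    }
}
```

Because the candidates are tried from largest to smallest, the first match is the maximum h, which is what the note above asks for.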
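This problem asks for a researcher's h-index, a metric based on how often their papers are cited. From the definition, the h-index must lie between 0 and n, where n = citations.length is the number of papers. So we can use an array indexed by citation count to record how many papers have each count. The code is below; it beat 58% of users.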
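public static int hIndex(int[] citations) {
    int n = citations.length, tot = 0;
    // arr[i] holds the number of papers cited exactly i times; arr[0] is the number of papers with 0 citations.
    // It is one element longer than citations because arr[n] counts the papers cited n or more times:
    // such papers count toward every candidate h-index.
    int[] arr = new int[n + 1];
    for (int i = 0; i < n; i++) {
        // Walk through citations and record each paper's citation count in arr.
        if (citations[i] >= n) arr[n]++;
        else arr[citations[i]]++;
    }
    // To find the largest h-index, walk arr from the end and return the first index i that satisfies the condition.
    for (int i = n; i >= 0; i--) {
        tot += arr[i]; // tot = number of papers with at least i citations
        if (tot >= i) return i;
    }
    return 0;
}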
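To make the counting step concrete, here is a small trace of the same idea on the example input [3, 0, 6, 1, 5]. This is a sketch for illustration; the class `HIndexBucketTrace` and the helpers `buildBuckets` and `hIndexFromBuckets` are hypothetical names, and `Math.min(c, n)` is just a compact way of writing the capping branch.

```java
import java.util.Arrays;

class HIndexBucketTrace {
    // buckets[c] counts papers cited exactly c times for c < n;
    // buckets[n] counts papers cited n or more times.
    static int[] buildBuckets(int[] citations) {
        int n = citations.length;
        int[] buckets = new int[n + 1];
        for (int c : citations) {
            buckets[Math.min(c, n)]++;
        }
        return buckets;
    }

    // Scans from the largest candidate down, accumulating the number of
    // papers with at least i citations, and returns the first i that fits.
    static int hIndexFromBuckets(int[] buckets) {
        int total = 0;
        for (int i = buckets.length - 1; i >= 0; i--) {
            total += buckets[i];
            if (total >= i) return i;
        }
        return 0;
    }

    public static void main(String[] args) {
        int[] buckets = buildBuckets(new int[] {3, 0, 6, 1, 5});
        System.out.println(Arrays.toString(buckets));   // prints [1, 1, 0, 1, 0, 2]
        System.out.println(hIndexFromBuckets(buckets)); // prints 3
    }
}
```

For this input the running total is 2 at i = 5, still 2 at i = 4, and 3 at i = 3, so the scan stops at 3.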
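Alternatively, we can sort the array first. Compared with the two linear passes above, however, this is actually slower, mainly because sorting takes O(n log n) time while the code above runs in O(n). The code is as follows: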
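// Requires: import java.util.Arrays;
public int hIndex(int[] citations) {
    if (citations == null || citations.length == 0) return 0;
    Arrays.sort(citations); // ascending order; note that this sorts the caller's array in place
    // len is the candidate h-index: the number of papers from index i to the end.
    int len = citations.length;
    for (int i = 0; i < citations.length; i++) {
        // The len papers from index i onward all have at least citations[i] citations.
        if (len <= citations[i])
            return len;
        else
            len--;
    }
    return len;
}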
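One side effect of the sorting approach is that Arrays.sort reorders the caller's array. If that matters, a possible variant (a sketch, not part of the original solution) sorts a copy instead. The class name `HIndexSortedCopy` is hypothetical.

```java
import java.util.Arrays;

class HIndexSortedCopy {
    // Same scan as the sorting approach, but on a copy so the input is untouched.
    static int hIndex(int[] citations) {
        if (citations == null || citations.length == 0) return 0;
        int[] sorted = citations.clone();
        Arrays.sort(sorted); // ascending order
        int n = sorted.length;
        for (int i = 0; i < n; i++) {
            // n - i papers sit at index i or later, each with >= sorted[i] citations.
            if (sorted[i] >= n - i) return n - i;
        }
        return 0;
    }

    public static void main(String[] args) {
        int[] input = {3, 0, 6, 1, 5};
        System.out.println(hIndex(input));          // prints 3
        System.out.println(Arrays.toString(input)); // prints [3, 0, 6, 1, 5]
    }
}
```

The copy costs O(n) extra memory, which is the trade-off for leaving the input in its original order.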
Another option is to sort the array and then run a binary search. The code is as follows:
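// Requires: import java.util.Arrays;
public int hIndex(int[] citations) {
    Arrays.sort(citations); // ascending order, in place
    int n = citations.length;
    int i = 0, j = n - 1;
    // Search for the smallest index k with citations[k] >= n - k.
    while (i <= j) {
        int k = i + (j - i) / 2; // midpoint without integer overflow
        int v = citations[k];
        int h = n - k; // number of papers from index k to the end
        if (v >= h) {
            j = k - 1; // k qualifies; look for a smaller index
        } else {
            i = k + 1; // too few citations at k; move right
        }
    }
    // The loop ends with i == j + 1, the first qualifying index, so h = n - i.
    return n - j - 1;
}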
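Binary search is valid here because, after sorting in ascending order, the test citations[k] >= n - k is monotone: citations[k] never decreases as k grows while n - k strictly decreases, so once the test is true it stays true. The sketch below prints that true/false pattern for the example input and locates the first true position. The class `HIndexPredicateDemo` and its helpers `predicate` and `firstTrue` are hypothetical names used only for this illustration.

```java
import java.util.Arrays;

class HIndexPredicateDemo {
    // ok[k] is true when the n - k papers from index k onward
    // each have at least n - k citations (array sorted ascending).
    static boolean[] predicate(int[] citations) {
        int[] sorted = citations.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        boolean[] ok = new boolean[n];
        for (int k = 0; k < n; k++) {
            ok[k] = sorted[k] >= n - k;
        }
        return ok;
    }

    // Lower-bound binary search: index of the first true entry,
    // or ok.length if every entry is false.
    static int firstTrue(boolean[] ok) {
        int lo = 0, hi = ok.length;
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (ok[mid]) hi = mid;
            else lo = mid + 1;
        }
        return lo;
    }

    public static void main(String[] args) {
        boolean[] ok = predicate(new int[] {3, 0, 6, 1, 5});
        System.out.println(Arrays.toString(ok));       // prints [false, false, true, true, true]
        System.out.println(ok.length - firstTrue(ok)); // prints 3
    }
}
```

The h-index is n minus the first true index, which matches the n - j - 1 returned by the binary search above.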