Problem Description
https://leetcode.com/problems/h-index/description/
Given an array of citations (each citation is a non-negative integer) of a researcher, write a function to compute the researcher’s h-index.
According to the definition of h-index on Wikipedia: “A scientist has index h if h of his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each.”
For example, given citations = [3, 0, 6, 1, 5], which means the researcher has 5 papers in total and each of them had received 3, 0, 6, 1, 5 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, his h-index is 3.
Note: If there are several possible values for h, the maximum one is taken as the h-index.
Credits:
Special thanks to @jianchao.li.fighter for adding this problem and creating all test cases.
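Problem Analysis
An author has n papers. Find h papers that each have at least h citations, and return the largest such h. The input is the citation count of each paper.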
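To make the definition concrete before looking at faster approaches, here is a minimal brute-force sketch that checks every candidate h directly. It is only a reference baseline in O(n^2) time, not one of the solutions discussed in this post, and the class name HIndexBruteForce is a hypothetical name chosen for illustration.

```java
// Hypothetical helper class; the name HIndexBruteForce is not from the original post.
class HIndexBruteForce {
    // Tries every candidate h from n down to 1 and returns the first one
    // for which at least h papers have h or more citations.
    static int hIndex(int[] citations) {
        if (citations == null) {
            return 0;
        }
        int n = citations.length;
        for (int h = n; h > 0; h--) {
            int papersWithAtLeastH = 0;
            for (int c : citations) {
                if (c >= h) {
                    papersWithAtLeastH++;
                }
            }
            if (papersWithAtLeastH >= h) {
                return h; // largest h that satisfies the definition
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Example from the problem statement.
        System.out.println(hIndex(new int[] {3, 0, 6, 1, 5})); // prints 3
    }
}
```

Because it scans candidates from the largest downward, the first h that passes the check is automatically the maximum, which matches the note in the problem statement.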
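Solution 1
Sort the citation counts in ascending order, then traverse from the largest citation count downward, comparing each citation count with the number of papers counted so far. While the citation count is greater than the number of papers counted, count one more paper. When the loop stops, the number of papers counted is the answer: the largest h such that h papers each have at least h citations.
Implementation: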
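// Requires: import java.util.Arrays;
public int hIndex(int[] citations) {
    if (citations == null || citations.length == 0) {
        return 0;
    }
    // Sort ascending (in place), then walk from the most-cited paper downward.
    Arrays.sort(citations);
    int maxH = 0;                      // number of papers counted so far
    int hSize = citations.length - 1;  // index of the most-cited paper
    while (maxH <= hSize) {
        // The next paper qualifies only if it has more than maxH citations,
        // i.e. at least maxH + 1, so that maxH + 1 papers have >= maxH + 1 each.
        if (citations[hSize - maxH] > maxH) {
            maxH++;
        } else {
            break;
        }
    }
    return maxH;
}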
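This algorithm runs in O(n log n) time because of the sort. Apart from whatever the sort itself needs, it uses only O(1) extra space.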
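One side effect of the sorting approach is that Arrays.sort reorders the caller's array. The following is a small sketch of the same loop that sorts a copy instead, with the steps for the example input traced in comments. The class name HIndexSortedCopy is a hypothetical name, and copying the array is an assumption about what a caller might want, at the cost of O(n) extra space.

```java
import java.util.Arrays;

// Hypothetical variant; the name HIndexSortedCopy is illustrative only.
class HIndexSortedCopy {
    static int hIndex(int[] citations) {
        if (citations == null || citations.length == 0) {
            return 0;
        }
        int[] sorted = citations.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int h = 0;
        // Trace for {3, 0, 6, 1, 5}, sorted = [0, 1, 3, 5, 6]:
        //   h = 0: sorted[4] = 6 > 0, so h becomes 1
        //   h = 1: sorted[3] = 5 > 1, so h becomes 2
        //   h = 2: sorted[2] = 3 > 2, so h becomes 3
        //   h = 3: sorted[1] = 1 > 3 is false, so the loop stops
        while (h < sorted.length && sorted[sorted.length - 1 - h] > h) {
            h++;
        }
        return h;
    }

    public static void main(String[] args) {
        int[] citations = {3, 0, 6, 1, 5};
        System.out.println(hIndex(citations));           // prints 3
        System.out.println(Arrays.toString(citations));  // prints [3, 0, 6, 1, 5]
    }
}
```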
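Solution 2: Counting Papers by Citation Count
Use the citation count as an array index and count how many papers have each citation count. Because the h-index can never exceed the number of papers n, any citation count greater than n is recorded as n. After this step, the index is the citation count and the value stored there is the number of papers with that count.
The second step processes this array: starting from the largest citation count, accumulate the number of papers. While the accumulated number of papers is less than the citation count, keep going. The first time the accumulated number of papers is greater than or equal to the citation count, return that citation count. Note that the value returned is the citation count (the index), not the accumulated number of papers.
The code is as follows: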
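public int hIndex(int[] citations) {
    if (citations == null || citations.length == 0) {
        return 0;
    }
    int length = citations.length;
    // nums2[k] = number of papers with k citations;
    // nums2[length] also collects every paper with more than length citations.
    int[] nums2 = new int[length + 1];
    for (int i = 0; i < citations.length; i++) {
        if (citations[i] >= length) {
            nums2[length]++;
        } else {
            nums2[citations[i]]++;
        }
    }
    // t = number of papers with at least i citations, accumulated from the top.
    int t = 0;
    for (int i = length; i >= 0; i--) {
        t = t + nums2[i];
        if (t >= i) {
            return i;
        }
    }
    return 0;
}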
This algorithm uses O(n) space and runs in O(n) time.
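To see what the counting array looks like for the example input, the two steps can be separated as in the sketch below. The class name HIndexCounting and the helper name buckets are hypothetical, introduced only for illustration, and the code assumes non-negative citation counts as stated in the problem.

```java
import java.util.Arrays;

// Hypothetical demo class; the names HIndexCounting and buckets are illustrative.
class HIndexCounting {
    // Returns an array where index k holds the number of papers with exactly
    // k citations; index n also absorbs every paper with more than n citations.
    static int[] buckets(int[] citations) {
        int n = citations.length;
        int[] counts = new int[n + 1];
        for (int c : citations) {
            counts[Math.min(c, n)]++;
        }
        return counts;
    }

    static int hIndex(int[] citations) {
        if (citations == null) {
            return 0;
        }
        int[] counts = buckets(citations);
        int papers = 0; // papers with at least h citations
        for (int h = citations.length; h >= 0; h--) {
            papers += counts[h];
            if (papers >= h) {
                return h;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        int[] citations = {3, 0, 6, 1, 5};
        System.out.println(Arrays.toString(buckets(citations))); // prints [1, 1, 0, 1, 0, 2]
        System.out.println(hIndex(citations));                   // prints 3
    }
}
```

For this input the scan accumulates 2 papers at h = 5, still 2 at h = 4, and 3 at h = 3, which is the first point where the running count reaches h.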
Summary
This is a typical counting problem. First transform the data into a form that is easy to count, then read the answer off the transformed data.